Becoming Human: From pointing gestures to syntax (Advances in Consciousness Research)

Author: Teresa Bejarano

Advances in Consciousness Research (AiCR)

Provides a forum for scholars from different scientific disciplines and fields of knowledge who study consciousness in its multifaceted aspects. Thus the Series includes (but is not limited to) the various areas of cognitive science, including cognitive psychology, brain science, philosophy and linguistics. The orientation of the series is toward developing new interdisciplinary and integrative approaches for the investigation, description and theory of consciousness, as well as the practical consequences of this research for the individual in society. From 1999 the Series consists of two subseries that cover the most important types of contributions to consciousness studies: Series A: Theory and Method. Contributions to the development of theory and method in the study of consciousness; Series B: Research in Progress. Experimental, descriptive and clinical research in consciousness. This book is a contribution to Series A. For an overview of all books published in this series, please see http://benjamins.com/catalog/aicr

Editor
Maxim I. Stamenov, Bulgarian Academy of Sciences

Editorial Board
David J. Chalmers, Australian National University
Axel Cleeremans, Université Libre de Bruxelles
Gordon G. Globus, University of California Irvine
Christof Koch, California Institute of Technology
Stephen M. Kosslyn, Harvard University
Steven Laureys, University of Liège
George Mandler, University of California at San Diego
John R. Searle, University of California at Berkeley
Petra Stoerig, Universität Düsseldorf

Volume 81 Becoming Human. From pointing gestures to syntax by Teresa Bejarano

Teresa Bejarano University of Sevilla

John Benjamins Publishing Company Amsterdam / Philadelphia

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data
Bejarano, Teresa. Becoming human : from pointing gestures to syntax / Teresa Bejarano. p. cm. (Advances in Consciousness Research, ISSN 1381-589X ; v. 81) Includes bibliographical references and index. 1. Language acquisition. 2. Gesture. 3. Grammar, Comparative and general--Syntax. 4. Psycholinguistics. I. Title. P118.B44 2011 153--dc22 2011009711
ISBN 978 90 272 5217 3 (Hb ; alk. paper)
ISBN 978 90 272 8679 6 (Eb)

© 2011 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA

In memoriam Víctor Sánchez de Zavala

Table of contents

Introduction
1. On the nature of an hypothesis on human abilities
2. Developments over the last 20 years
3. Outlining the proposal put forward here
4. A brief description of the sections of this book

Section one. Evolutionary precursors

Chapter 1
Monkeys’ mirror neurons
1.1 Mirror neurons in macaques, a significant discovery and a controversial interpretation
1.2 On supposed ‘social’ utility: Is the role of macaques’ mirror neurons to understand and predict behaviour of conspecifics?
1.3 Mirror neurons, a secondary effect of self-perceptible movements
1.3.1 Self-visible hands: Connecting Keysers & Perrett with Piaget
1.3.2 Neonatal imitation and mirror neurons associated with the mouth: An open question
1.4 Simulation or expectation? The crucial question about the abilities of non-human primates
1.4.1 Animal behaviour and expectation
1.4.2 The new type of expectation which appeared alongside mirroring: Describing the difference between my hypothesis and that of Keysers & Perrett (2004)
1.5 An adaptive but ‘non-social’ role? A speculation which would act as an argument in favour, were it to enjoy slightly more support
1.6 The relationship between the central hypothesis of this chapter and the above speculation
1.7 Summarizing the hypothesis defended in this chapter

Chapter 2
Chimpanzees and the visual field of the conspecific
2.1 From mirror neurons to the ability to reckon the visual perceptions of the conspecific
2.1.1 From the perception of the matching between another’s and one’s own body to the ascription of visual perceptions to conspecifics
2.1.2 The side-effect of the perception of the matching between one’s own body and another’s body
2.2 Are chimpanzees merely exploiting visual findings of the conspecific? Introducing the current debate
2.2.1 Ascribing visual perceptions: The experiments carried out by Hare, Call & Tomasello
2.2.2 Povinelli’s argument: The conspecific’s blindfolded eyes
2.2.3 Blindfolded eyes and adaptive usefulness
2.2.4 Ravens and chimpanzees
2.3 What is involved in ascribing visual perceptions to conspecifics?

Section two. The basic human ability

Chapter 3
The three modes of processing the eyes of others
3.1 The progressive convergence of this issue and the ‘theory of mind’
3.2 What are the three different modes of processing the eyes of others to be proposed here?
3.2.1 The difference between the first mode and the second one
3.2.2 The third mode. The most basic and primaeval exclusively human capability
3.3 Why would the ‘third mode’ be so demanding?
3.3.1 The peculiarity of visual perceptions ascribed in the ‘third mode’
3.3.2 Radically not-own visual field, expectation, simulation
3.3.3 Self-recognition in the mirror and perception of a radically not-own visual field: Facing a potential objection
3.4 The communicative use of sight direction
3.5 The white of the eye

Chapter 4
Pointing gestures
4.1 Pointing gestures in children
4.2 Why don’t apes point? Distinguishing the indirect cause from the direct cause
4.3 The requesting gestures of the apes of Gómez and Leavens
4.4 Communicative action versus communicatively shaped action
4.5 Commenting about Grice and also about triadic communication
4.6 Some unavoidable issues which must be dealt with
4.6.1 Wild chimpanzees that extend their arm in the direction of an object: How could those gestures really happen and yet be so scarce?
4.6.2 Dogs and chimpanzees compared to the human pointing gesture
4.7 True pointing in chimpanzees brought up by humans?
4.8 Lack of motivation in chimpanzees? Seeing more in detail the difference between Tomasello’s proposal and the one which is being defended in this chapter
4.9 What are the requirements for a genuine understanding of pointing gestures? A closer look at the expectation/simulation dichotomy
4.9.1 Going back to the expectation/simulation dichotomy
4.9.2 The different manners in which somebody else’s body may be informative

Chapter 5
Four-hand co-operative actions and children’s interpersonal co-ordination games
5.1 Co-operative actions
5.1.1 Four-hand action
5.1.2 A comparison with co-operation among chimpanzees
5.2 The interpersonal motor co-ordination game
5.2.1 From the adaptive advantages of play in general to the interweaving of evolution and culture
5.2.2 An interpersonal motor co-ordination play
5.2.3 The ‘tea, chocolate and coffee’ game: The learning process
5.3 Enjoyable communicative imitation

Section three. Specifying some necessary requisites of language

Chapter 6
Saussurean parity and the perception of a radically not-own self
6.1 Toward a formulation of the problem
6.1.1 Production and reception in animal communication
6.1.2 The problem involved in Saussurean parity
6.2 Saussurean parity and the second mental line: Our suggestion for resolving the problem
6.3 ‘Motor reception during the learning stage’, the reliable core of the ‘motor theory of speech perception’
6.3.1 Liberman’s theory: The ‘motor theory of speech perception’
6.3.2 Piagetian premotor plan
6.3.3 What happens once acquisition has come to an end? A proposal for a reformulation of the motor theory of speech reception
6.4 The comprehension of deictics which ‘cannot be repeated as an echo’: What about the egocentrism of deixis?
6.4.1 Deixis
6.4.2 The egocentrism of deixis, and deictics ‘which cannot be repeated as an echo’

Chapter 7
About evocation
7.1 What is it we mean by “evocation”?
7.2 Do animals have the ability to evoke absent objects as such?
7.2.1 Some potentially relevant data
7.2.2 Outlining one possibility
7.2.3 Is a clear answer to be found in research with chimpanzees?
7.3 How then would evocation have originated?

Chapter 8
Symbolic play: Developments in the simulatory centre
8.1 Describing symbolic play
8.2 How movements have adapted throughout evolution
8.3 Is simulation linked to the real movements of symbolic play?
8.4 What is repeated in vacuo is a previously perceived model, not one’s own behaviour
8.5 The big extension of the simulatory centre (i.e., the new function that got to be performed by this centre): How would a truly simulatory interpretation of a non-interacting model have been achieved?
8.6 An indication in favour: Comparing symbolic play and adult-feeding game
8.6.1 Why might it be helpful to pay attention to this type of game now?
8.6.2 Similarities and differences between symbolic play and the adult-feeding game
8.6.3 The simulatory centre in the feeding game and in symbolic play
8.7 When might the kinaesthetic interpretation of a non-interacting model become dependent on the basic human ability? Insisting on the ideas presented in 8.5
8.7.1 The core of my hypothesis: Fictionalisation of postures is required by latent sequential imitation
8.7.2 Can forward models refute the previous proposal?
8.8 Motor learning and symbolic play: A convenient comparison
8.9 How does symbolic play come to be symbolic?
8.9.1 From motor simulation to the evocation of absent objects: Completing the description of ‘the big extension’
8.9.2 Addressing a seemingly vicious circle: What is in the mind of the producer when he or she decides to bring about a certain specific evocation?
8.10 From the basic ability to symbolic play: A proposal on ontogenesis and a question about evolutionary-historical origins

Chapter 9
From symbolic play to linguistic symbol
9.1 Adaptation to the model, a feature shared by the movements of symbolic play and the articulatory-phonetic movements of language
9.1.1 Evidence that encourages us to search for a similarity
9.1.2 The similarity in the respective motor aspects
9.1.3 Back to Saussurean parity once more
9.2 The special character of articulatory-phonetic imitations: Arbitrary and with no adaptation to the environment
9.2.1 Articulatory-phonetic patterns and imitation of movements: The social or phonemic model
9.2.2 The fuller the control exercised by the ‘motor adaptation to a model’, the more stable the model
9.3 Articulatory-phonetic pattern and evocation: The analogy with symbolic play
9.4 Is the great difference between playful symbol and linguistic symbol a genuine obstacle to the connection between them?
9.4.1 What does this difference consist of?
9.4.2 Children and the comprehension of displaced speech: The possible role of echolalic repetitions
9.4.3 Latent, displayed, latent: Is my hypothesis on simulation overly complex?
9.4.4 Why might the latent route to evocation occur more readily in language?
9.5 The phonemic-social model, the expansion of working memory and inner speech
9.6 Language and the adaptive advantages of symbolic play
9.7 Taking a step towards the next section: ‘Linguistic symbol’ versus linguistic meaning

Section four. The origin of predication and syntax

Chapter 10
From the general exposition to the crucial requisite achieved by the protodeclarative
10.1 The origin of syntax: Overview of the hypothesis
10.2 Some initial clarifications
10.2.1 Biology and history
10.2.2 Ontogenesis and historical origin
10.3 Communicative functions and the need for syntax: Presyntactic conative messages?
10.3.1 A first (invalid) response
10.3.2 Toward a second response
10.4 Learning by imitation and the protodeclarative: Toward genuine linguistic meaning
10.4.1 The importance of the imitative learning of signs
10.4.2 Articulatory-phonetic patterns, intonation, and (inserting a digression in the argument) manual versus vocal origin
10.4.3 Bleaching and the protodeclarative
10.5 Toward our next question

Chapter 11
Toward the original perception of false beliefs of others: The importance of the learned sign
11.1 The perception of false beliefs of others: The debates in the bibliography
11.1.1 The classic experiments and the first attempts at lowering the age of perception
11.1.2 The most recent attempts to lower the age of perception of false beliefs of others
11.2 False belief and symbolic play. A question to be slotted in
11.3 The origins of the concept of ‘truth’: Inserting an even more peripheral point than the previous one
11.4 The most easily-perceived false beliefs of others: ‘Second-person’ and received through language
11.5 What types of linguistic messages are able to reveal their producer’s false beliefs to hearers?
11.5.1 Weighing up the candidates
11.5.2 On historical origins and added constrictions: A methodological reflection
11.5.3 Conative messages and what their producer believes
11.6 Developing the importance of protodeclaratives: The bridge between imitation of complex new motor patterns and predicative communication

Chapter 12
Between motor learning and the perception of beliefs of others: The crucial role of the protodeclarative
12.1 From learned signs to the protodeclarative
12.2 Vocal Saussurean parity and motor imitation of complex and new patterns
12.3 From ‘choral holophrase’ to the Saussurean vocal sign
12.4 Why did vocal signs based on complex motor imitation come into being? A question which can no longer be put off
12.5 From motor imitation to the consequences of the protodeclarative: What questions arise?
12.6 Excessive emphasis on the motor component? Making explicit the anthropological approach underlying the hypothesis
12.7 Language, the pointing gesture and symbolic play, three modes of involvement of the simulatory centre
12.8 The unity of the ‘theory of mind’
12.9 Derivations from the perception of beliefs of others: Introducing the following section

Section five. Pregrammatical, theme-rheme syntax: Revisiting Frege and Vygotsky

Chapter 13
From beliefs of others to communicative predication
13.1 From beliefs of others to predication: A relationship that can be interpreted in two very different ways
13.2 Predications of response or reply and quoting
13.3 The embedding of the interlocutor’s message inside the child’s message: The interpersonal origin of recursivity
13.4 Syntactic recursivity and human exclusivity: My disagreement with Hauser, Chomsky & Fitch
13.5 The predicate ‘no’ which occurs in the child’s first predications
13.6 Initial predications and the sophisticated task of opaque contexts: A comparison that leads us to stress the origin of forked archives
13.7 The non-redundant predicate: Our next stage of exposition

Chapter 14
Revisiting Frege: How can a predication be at one and the same time true and not redundant?
14.1 Why did Frege coin the term Sinn?
14.2 Is the problem really solved simply by coining the Sinn?
14.3 Moving away from the route taken by Frege
14.3.1 Where then would the reference of the term be? The bleaching of the subject term and Carstairs-McCarthy’s question
14.3.2 The reference of the complete statements: Stressing beyond Frege the difference between within and outside the speaker’s point of view
14.3.3 The predicate of judgement of identity: Another point where we move away from Frege
14.4 What exactly have we taken from Frege?
14.5 Toward other authors and other aspects

Chapter 15
Communicative functions, Vygotskian ‘pure predicate’ and conceptual semantics: Various questions about predication
15.1 Comparing predications and orders
15.1.1 From Jakobson to Searle’s ‘direction of fit’
15.1.2 Predication as an order to be obeyed in the mind
15.2 Pre-linguistic semantics: What do we do with it in our hypothesis about predication?
15.2.1 Richness of details versus syntactic articulation
15.2.2 Vygotsky’s presumed ‘pure predicate’: Why are the presumed subject and presumed predicate interchangeable?
15.2.3 A pause and a reflection: The basic human capacity and predication
15.2.4 Studies on conceptual or pre-linguistic semantics: Their value and limits

Chapter 16
Connecting with the concepts of theme (or topic) and rheme (or comment)
16.1 The Prague school hypothesis
16.2 Difficulties raised by the initial definition of ‘theme’: A hypothesis for a reformulation
16.2.1 Presenting the hypothesis: A first example
16.2.2 On the heels of another example. The different positions that can be taken up faced with this type of difficulties
16.3 Dishonest predication: An interesting clarification
16.4 Is my reformulation of the concept of ‘theme’ too complex and challenging?
16.4.1 Predication and metabelief: Is the second of this pair really the more complex element?
16.4.2 The hearer’s belief: Included, but not displayed, in the predication
16.5 Looking toward the next section

Section six. From original to present-day predication: Links and grammatical syntax

Chapter 17
Meaning and the different types of link
17.1 Opening out the contrast between word and symbol
17.2 The innumerable speech episodes and the brain
17.3 Meaning as a giant unconscious cerebral edifice
17.4 ‘Typical links’: How can these intervene in a unified explanation of different phenomena?
17.4.1 Metaphors and tautologies
17.4.2 Why are proper nouns so difficult to remember?
17.4.3 Links in ‘langue syntagms’ and in speech episodes: Identifying degrees of the same phenomenon
17.4.4 The other side of the coin: Some unwanted secondary effects of links
17.5 Links and the perspectivist nature of meaning
17.6 The peculiarity of properly grammatical or abstract links: Raising the question of their historical origin

Chapter 18
Expressive speech and syntactic links: A hypothesis on the historic origins of those links, and on some other questions, along the way
18.1 General overview of the chapter
18.2 The ‘own past beliefs’ addressed by the Theory of Mind
18.2.1 The child forgets its own past beliefs: An absurd result?
18.2.2 Remembering one’s own linguistic message versus remembering the linguistic message of others
18.2.3 Surprise, Davidson and the ‘concept of belief’
18.3 False belief and out-of-date true belief: Taking the question beyond the Theory of Mind
18.4 Out-of-date perceptions and chimpanzees: Interpreting the results of some old experiments
18.5 Do adults always maintain their own past belief? Recycling an argument from Dennett
18.6 Halting temporarily this chapter’s progress: How much have we achieved and how much is still to be done?
18.7 Expressive inner speech
18.7.1 Inner speech, emotional reactions and sudden changes of belief
18.7.2 With no syntactic process, but with syntactic links
18.7.3 Syntactic links and symbolic evocation: How are these related?
18.7.4 Which is really the important characteristic: Fully-constituted meanings or inner speech?
18.8 From simply a secondary effect to a useful resource: The relationship between own past beliefs and tracks or numbers
18.9 Past beliefs and the composition of predications not based in theme and rheme
18.10 Expressive speech and disoriented recipients: The point of historical origin of grammatical links, at last

Chapter 19
Historical grammaticalisation: The answers are lacking, but the questions are good
19.1 Theme/rheme syntax, and grammaticalised syntax: Suggesting two historic stages
19.2 The influence of cultural learning on the cognitive abilities themselves
19.3 Evolutionary precursors for links?
19.4 Conative holophrases and verbal imperatives: Arthur Diamond’s hurried identification
19.5 How, then, did verbs originate?
19.5.1 What can we say about verbs?
19.5.2 What linguistic signs would be chosen for this egocentric use? An unavoidable issue which must be dealt with
19.6 Conjunctions and relatives: Repeating the classic suggestion that they originated as a result of deictics
19.7 Heralding a return to firmer ground

Section seven. Syntax beyond predication

Chapter 20
Interrogative communication
20.1 Characterising the interrogative communicative function
20.1.1 The successive definitions of the interrogative
20.1.2 Questioning, predication and syntax
20.2 Animal curiosity and human questioning
20.3 The need to be able to say what one does not know
20.4 Predication, questioning and ‘Theory of Mind’
20.4.1 Merely insufficient, but not false, beliefs
20.4.2 The required contrast with the longed-for sufficiency: Going deeper into the difficulty of partial questioning
20.5 Echo questions
20.5.1 The lack of requirements in echo questions: From echo questions to predication
20.5.2 Echo questions, choral holophrases, Saussurean signs: Three different modes of imitation
20.5.3 Echo questions on the auditory and real-world levels
20.5.4 From echo questioning to normal questions
20.6 And what of syntactic subordination?

Chapter 21
Toward complex syntax: The crucial role of reported speech
21.1 The beliefs of others are not only perceived but also explicitated: The first fruit of reported speech
21.2 Deictic derivatives
21.3 Second-person allocentrism: An intermediate milestone which occurs in both consequences of the reported message
21.4 Difficulties, advantages and consequences of the deictic derivative
21.4.1 Postposition of the centre versus linguistic platform
21.4.2 Dispensable perhaps, but certainly highly useful: Deictic derivatives outside ‘indirect reported speech’
21.4.3 From vague adjectives to the pure dimension
21.4.4 ‘More’, ‘another’, ‘next’: Deictics or deictic derivatives, as the case may be
21.5 From the indirect reported message to writing

Preliminary conclusion and the main thesis recapitulated
References
Glossary
Author index
Subject index

Introduction

1. On the nature of an hypothesis on human abilities

The ability to perceive other minds is the starting point from which we shall attempt our explanation of different human abilities.1 It is my contention that human beings are the only animal capable of conceiving the inner states of another individual looking at them. A second centre would thus be present in the human mind, and it is on this additional centre that the ability to perceive other selves depends.

Of all the sources of arguments for this hypothesis, I will address just three. Firstly, there is the link with antecedents in evolution. We would have to explain why this very special animal, the human being, appeared among primates. Likewise, we would also have to clarify how human beings differ from apes. Secondly, any hypothesis about the nature of human beings will have to account for a set of specifically human characteristics and behaviours. From the outset, and as a bare minimum, we should expect such a hypothesis to focus on the ‘theory (which the subject has) of (another’s and of one’s own) mind’, or theory of mind (ToM) for short, as well as on finger-pointing and different aspects of language. Thirdly, any theoretical approach put forward to explain these phenomena would be obliged (at the very least) to look at child development. Certainly, I am not advocating an absurd recapitulationism. There is obviously no reason why the paths of either evolution or historical development should be reflected with any precision in ontogeny. Nevertheless, we might find some similarities between the postulated emergence of human characteristics and the data on child development.

Stating these requirements is easy; meeting them is another matter altogether. I have certainly tried very hard to do this, but I am well aware of the limitations of the result. What we have here is a draft, still raw, a seed which will require endless correction and addition, not to mention radical amendments to the whole.
My hope, nevertheless, is that it will prove to be a timely study. There is no contradiction in this. My optimism is based on the increasing demand right now for more and more studies of this kind. I, for one, see this as a task for precisely our generation. It is only now that contributions on human abilities from many different fields are finally available and, as a result, it has become urgent to ask how all these contributions might fit together. In the following section I will try to show how we have reached this privileged situation.

1. I thank the director of the collection and the two anonymous referees, who have substantially contributed to the improvement of this manuscript.

2. Developments over the last 20 years

For 20 years or so (from 1990, let us say) the hopes of clarifying the make-up and genesis of exclusively human abilities have taken on a radically new lease of life. In this, a role has been played by studies which show great variation both in their methodology and in their objects. Let us examine the main lines involved. First, mention must be made of research labelled as the ‘theory of mind’, or ToM. It is becoming ever more clear that, whereas having mental states is a characteristic shared by (at least) all mammals, the ability to think about mental states, either another’s or one’s own, appears only in humans, however. Some of the questions debated in this area are: do children succeed in perceiving the falsehood of beliefs of others first or do they, contrastingly, remember their own false past beliefs first? What is the role of language in perceiving beliefs as false? Or, with regard to one’s own current beliefs, can these be considered mental states, and no longer simply reality? Moreover, in the last few years, research has been addressing ever more intensely the perception of those states of others which, like visual attention, we might consider less intellectually demanding than belief. This new emphasis is extremely interesting. The new range of experiments with chimpanzees is a further promising area of progress. Attempts to teach animals language have been left behind, and the questions to which researchers are now seeking answers are chosen within very specific hypotheses derived from ‘theory of mind’ studies. For example, one of the main subjects for discussion is the following: when chimpanzees become aware of what a conspecific is (or is not) looking at, what exactly does this awareness consist of? Do they simply make use of the visual findings of their conspecific, or, in contrast, are they able to ascribe perceptions to their conspecific? Let us point out another important question. 
How is it that, on the one hand, chimpanzees in the wild try very sensibly not to make requesting gestures until the addressee is looking at them, and yet, on the other hand, they never use gestures which, as eye- or hand-pointing, can only be interpreted as communicative, i.e., can not be interpreted as ‘environmentally shaped’ behaviours? Children, by contrast, at about one year of age, understand and produce communicative gestures such as pointing at an object, or looking back and forth between the addressee and the object. What is the relationship between these achievements the child makes and the chimpanzee’s combination of capability and incapability? A contribution has also been made by new developments in neurophysiology, and by new methods of exploring the brain. The discovery of so-called ‘mirror neurons’ in macaques is a prime example. These neurons are triggered both when the animal itself makes grasping movements and when it sees those movements being made by another. Here I reject that, in their origin, these neurons have the adaptive function of

Introduction

understanding or predicting the behaviour of others. However, I stress the idea that these neurons are probably associated with self-visibility of the hand and, consequently, with a feature exclusive to primates. It is, therefore, tempting to see in these neurons a step (however remote) towards the exclusively human abilities the ‘theory of the mind’ has been investigating. Nevertheless, we must first answer this question: what is it that occurs in macaques while they observe hand movements of others? Is it really simulation, that is, latent motor imitation? Likewise, the study of language acquisition has become a rapidly advancing discipline where conclusions are more and more well-founded. In the early months after the holophrastic stage, the child only acquires very restricted constructions and links. We now have a fascinating image of how several of these links are progressively generalised through a long series of steps. These data are clear, but they lead us to issues that are considerably less clear. Where does the compositionality of meaning come from? Is it previous to language or not? If we opt for saying that there is no compositionality (of separately thought elements) either in perception or in prelinguistic thought, the conclusion will be that in the historical origin, meanings non-moulded by a previous compositionality must have been combined. Should this option be then rejected once and for all? In short, progress in each of these approaches has brought into view numerous questions to which thought could not previously be given and which we, in contrast, can now formulate very precisely. In my opinion, this is a great achievement. The agenda formed by these questions is perhaps finally offering us the much-needed guide programme for research into human abilities. For this reason, I would say that we have right now probably our first ever opportunity to research human nature in depth. 
Of course, philosophers have struggled with this subject for centuries. However, given the enormous amount of information they lacked, it should hardly surprise us that they had no chance of success. Until the middle of the 19th century there was no scientific proposal that human beings had descended from other primates. Until about the same date nobody thought that the systematic study of child development could throw light on the human constitution. And only much later did these ideas really begin to bear fruit. This bears no comparison whatsoever with our situation today. The practical upshot of all this is that we, the generation at work today, must take advantage of this magnificent convergence of factors flowing in our favour. We have to address the question of why it was precisely among primates that this exceptional animal, the human, appeared. A second question seeks to clarify why human beings differ from non-human primates. We must also try to determine the relationships between the different exclusively human abilities, the order in which they would have appeared, the role played by language or the place and sequence of the different linguistic phases. When considering the first question, that is, why it was among primates that humans appeared, the following objection could be raised: is this the right question? Nowadays (even if we dispense with the increasing insistence on the idea that the bird
model will tell us how to build a brain that can do vocal learning), the primatocentric bias is the object of fierce criticism and ever increasing opposition. The decisive factors would be neither genes nor evolutionary proximity but habitat and niche, and hominids’ habitats have been dramatically different from those of apes: these are the most recent academic trends and, in my opinion, they have made extremely valuable contributions. However, I think that we must still try to answer the above question. Or, more precisely, we must distinguish between two levels. In my view, on the one hand, the indirect cause of human peculiarities is to be found in habitat-related changes and the ensuing changes in life style. On the other hand, the direct cause must be traced back to the cognitive novelties which, against a primate background, evolution moulded. There is no doubt that those cognitive novelties were shaped to cope with the urgent needs imposed by a new and more cooperative life style. However, the very nature of what eventually emerged would have been radically different if the starting point had not been the last common ancestor of chimpanzees and humans. It is essential to distinguish between these two causes. And since I am going to concentrate on the cognitive novelty or direct cause, the question of why it was among primates that humans appeared is still a useful and relevant question.

3.

Outlining the proposal put forward here

The general aim of this book is to stress the crucial role played by the ability to perceive the minds of others. Undoubtedly, interpersonality is involved both in linguistic communication and in all other kinds of cultural transmission. And, also undoubtedly, it is these communicative processes which, by allowing the historic accumulation of successive findings, gave rise to the unfolding of human abilities. However, what will be proposed here is not this but that the human mind itself, and not just its fruits or results, would have originated in the perception of the minds of others. How does all this relate to the simulationist*2 subtrend of the studies of the ‘theory of mind’*? In my opinion, simulation is often incorrectly invoked these days. As a result, monkeys’ ‘mirror neurons’ have become the flagship of simulationism. However, I will suggest that the self-visible hand of primates would merely have given rise to a new kind of expectation. An improvement in this kind of expectation would also suffice to explain the ability (which chimpanzees probably have) to ascribe visual perceptions to a conspecific. These two capabilities, contrary to what many researchers have stated, would not require any latent imitation at all. I quite agree that in these abilities of non-human primates, a correspondence between a body seen (i.e., a body of a conspecific) and an internal state began to emerge. However, these two features of primates could be accounted for by expectation, that is, within the general mechanism of animal behaviour.

2. Asterisks point out that the previous word is included in the Glossary.

But if this important new step in macaques and chimpanzees does not yet require simulation, the key question is this: when would the process of genuine simulation have come into being? My answer is that expectation is no longer useful when a relationship to oneself is attributed to the conspecific’s interiority. If I attribute to someone else a visual percept that includes myself (or if, when I detect the kinaesthetic-postural interiority associated to someone else’s movement, I interpret this movement as an attempt at communication with me), then I have had to replace the old procedure of expectation by that of simulation. The interiority of the conspecific’s body would then be, at last and for the very first time, a radically other self, one which could only be conceived in a ‘second centre’ inside one’s own mind. This second centre would be the site of true simulation. There may be some who raise objections to this last sentence – e. g., “I can recognize that someone is angry without ‘simulating’ their anger”. For this reason, I am going to give a brief summary of my more elaborate answer which comes later. Somebody’s anger can be recognized at two different levels. On the one hand, an animal can adequately recognize and immediately respond to another animal’s anger signals without ascribing any mental state to the angry conspecific. At this level there is no simulation at all. On the other hand, by simulation I can imagine another person’s anger without myself experiencing that anger, but as the anger of someone else. In my view, it is thus that we should interpret the exclusively human process involved in simulation. But let us return to our topic. Once this second centre had thus emerged, it could begin to sustain other processes different from those we have already mentioned. One of these processes would be the perception of beliefs different to one’s own. 
The suggested branching of one’s own mind into two different lines seems perfectly suited to preventing interferences between a given reality and false beliefs of others about such a reality. It is well known that the perception of false beliefs of others has become the flagship of ‘theory of the mind’ research. In fact, the focus on this question has, on occasion, been excessive. However, despite all the research on the subject, I believe its crucial role has not yet been fully appreciated. What then is this crucial role? Let us look at the perception of the belief of a ‘second person’, that is, the perception of the incomplete, incorrect or out of date knowledge a conversation partner has shown to possess on an issue. This kind of perception of beliefs of others is different from that observed in the classic tests (for example, in those tests such as the displaced toy test, where I have to infer the false belief of someone who, as a character in a sketch, is neither interacting with nor speaking to me). Compared with this, ‘second person’ perception would not only be easier, but also more decisive, inasmuch as it would have to do with predicative communication. It is true that many authors have stated that the communicative function of predication would have come into being precisely in order to complete, correct or update the belief of addressees. I fully agree. However, what I am suggesting is not this, but that the belief of addressees would constitute the content which the element subject to predication (or rather, the theme*) has for the speaker. Correlatively, the predicate
(or rather, the rheme*) would be the element which, according to the speaker, must be added to complete, correct or update the addressee’s belief. As a result, the perception of beliefs of others would have given rise not only to the need for predicative communication, but also to the very form of predication, that is, to the hinge (or composition) between the element subject to predication and the predicate. According to this, a key issue is how someone comes to understand that someone else’s belief conflicts with reality. I suggest that that understanding is not inevitably the product of a predicative statement, that is, in no way has it to be derived from the thought ‘That belief is false’. The opposite is more likely to occur, i.e., the perception of someone else’s belief would actually be the triggering point for genuine predication, both at language level and at thought level. After presenting this hypothesis, I will defend the necessary theoretical complement, namely, the idea that the profusion of details peculiar to prelinguistic perception does not involve any composition of elements which are attended to separately. According to this view, the first, pre-grammatical, syntax would have originated as a consequence of the perception of (false or incomplete or out of date) beliefs of others. But what about the inverse relationship? What might I say about the more classic debate about whether or not language is the original cause of the perception of beliefs of others? I am inclined to answer that it is. But this does not lead me into a vicious circle: my idea is that the perception of false beliefs of others would have at its origin a language which is prior to syntax and predication, that is, a language of holophrases* which ask and call. The perception of beliefs of others would have given rise to the theme/rheme hinge, and consequently opened the door to compositionality. Is this hinge the original syntactic composition? 
It is certainly an empirically universal structure, one that is audible in intonation. There is, however, an enormous distance between this simple structure and the syntax observed in all languages. In all languages, the form of predication survives independently of its communicative uses, and there are grammatical links and syntactic categories. Could the genesis of the theme/rheme combination have been sufficient for all this to emerge from historical development? I will mention one last issue in this summary. After looking at the origin of the syntax of predication, or simple sentence syntax, we will have to address the origin of subordinate sentences. We will suggest that subordination became necessary only with ‘indirect reported speech’. The distinction between reported false beliefs (that is, the beliefs of a third person) and real information must be made clear since it cannot be left in the hands of the recipient and her optimisation of its relevance.

4.

A brief description of the sections of this book

Section One will address the nearest evolutionary antecedents of the human basic ability we are proposing. In the chapter on mirror neurons I suggest that they are not
involved in any kind of simulation. Starting from the emphasis that other authors have placed on self-perceptible movements, I propose that mirroring is only a very special kind of expectation. It may be special, but, like all expectations, it can only refer to contents which, under certain circumstances, would be experienced by the subject having the expectation. In the next chapter I take this hypothesis beyond the mirroring observed in monkeys in order to explain a special capability found in chimpanzees. Chimpanzees successfully reckon the visual field of conspecifics and probably also attribute visual perceptions to them. This is indeed a notable achievement. However, if we accept that the ability of the ascribing chimpanzee involves no real simulation – no emergence of a second mental centre – and only relies on a kind of expectation, then that ability will have the insurmountable limit mentioned above: it will never be able to refer to contents inherently impossible for the subject. Or, more precisely, from the very moment a chimpanzee perceives any gaze (or, more generally, any behaviour) directed at him, the animal would cease to ascribe any kind of internal states to the gazing individual (or to the individual whose conduct is directed at him). In Section Two we begin to discuss the basic ability of the exclusively human mind, in other words, the second mental centre. We will have to study the different modes for processing eyes of others, beginning with the phylogenetically very ancient sensitivity to an eye fixed on us. The second, and much more recent, mode (for processing, in this case not exactly the eyes, but the visual perceptions of others) would be that of chimpanzees: These, as we said in Section One, reckon the visual field of conspecifics and probably also ascribe visual perceptions to them. 
With the third mode, that is, the ascription of visual perceptions to an eye that is looking at me, we would have before us the launch pad for exclusively human characteristics. Human beings would share with chimpanzees the ability to ascribe visual perceptions, but would have moved a step further. This step, which appears to add only a small difference, actually involves a highly demanding change which also has enormous consequences: the internal states of the fellow would now be perceived as a radically different self, and would, therefore, have to be located in a second centre of the mind. The observing subject could now conceive the self of the fellow as a centre in whose periphery he, the observer, is located. At this point, one inevitably thinks of the classic analogy with heliocentrism. Pointing gestures, including both the gestures made with the eyes or using a finger, as well as four-hand co-operative actions derive directly, I will suggest, from this basic ability (or more exactly, they constitute the original adaptive advantage which would have been responsible for it). Likewise, some children’s play, for example play involving motor co-ordination with a playmate or the fun of mutual imitation, are exercises that aim to boost this ability. Section Three considers the relationship between this basic ability and some requisites of language. It addresses, specifically, Saussurean parity or deep identity between production and reception of language, and also the ability to evoke absent objects as such. On the question of parity, I will bring in especially the ‘motor theory of speech perception’, or more concretely, its reliable core, ‘the motor theory of the
observational phase in the learning of complex motor patterns’. With respect to evocation, we will begin by asking if animals are able to evoke. We will then study symbolic play and its adaptive usefulness. In the end, it will be suggested that the big extension (or, in other words, the new, added function) of the ‘second mental centre’ would consist in the imitation of new and complex motor patterns. In short, what this section aims to show is how those requirements of language might have derived from the human basic ability (that is, from the second mental centre). However, learned signs, Saussurean parity and evocation are by no means resources powerful enough to produce true words. A word is inherently a part of syntax. Section Four thus begins to turn toward the question of the origin of syntax. I ask how false beliefs of others could be perceived prior to syntactic language. To this end, I will analyse the protodeclarative holophrase*. This kind of communication, which is only useful for linguistic learning and would be, therefore, inconceivable if there were no signs learned through imitation, produces a sign with a precise referential connection in which there will no longer be any ambiguity between request and calling (or between giving a command and asking for an object). This is a dramatically new step when compared to animal communicative signals. But the subsequent consequences of protodeclaratives would have been of greater importance, at least in the historical origins of language. Let us note that the reception of an imperative-vocative message can reveal its producer’s false belief if and only if that message has used a sign of the kind provided by the use of protodeclaratives. 
In Section Five, I try to show that the theme of predication coincides with the (false, or insufficient or outdated) cognitive content the speaker ascribes to the hearer, or, using the terminology of the studies of the Theory of Mind, with the ‘false belief ’ of the addressee. I will begin by stressing that a child’s first predications are always reply predications. In addition, I will apply my proposal (that the theme is the perceived or conjectured mental state of the addressee) to two classic difficulties: firstly how there can be true and simultaneously non-redundant predicates, and secondly to provide a definition of theme suitable for all cases. The subject of Section Six is the development of predication beyond its suggested origin – that is, beyond theme and rheme combinations. We shall attempt to dig deeper into the well-known thesis that word links are part of their very meaning, and, focussing on these links, we shall differentiate between three levels of abstraction: all the episodes or uses of the word, typical links, and grammatically syntactic links. With regard to the latter, we will reject the explanation of their historical origin as a decanting of the links in discourse syntax, or theme and rheme syntax. What, then, might this origin have been? In this regard, we will address the underuse of language constituted by the merely expressive, non-communicative, speech: what repercussion would this have on the behaviour of casual listeners? The end of this section continues to deal with the origin of grammaticalised syntax, but in a more general way. Can we place anything specific under this label?

Finally, Section Seven deals with syntax unrelated to predication. Firstly, I analyse questions and their cognitive and linguistic requirements: can we say that all questioning depends on syntax? The final chapter will be devoted to reported speech. In addition to the issue mentioned above (namely, the genesis of subordination in reported speech), we will see the breakdown of deixis and its subsequent consequences.



section one

Evolutionary precursors

This section aims mainly at providing an interpretation of the capability which chimpanzees probably possess to ascribe visual perceptions to other individuals. Although I accept that chimpanzees are capable of doing this, I will try to prove, nevertheless, that their ability is not tantamount to a simulation of states of others. Consequently, my argument on this issue will have two different parts. The first one will focus on the abilities possessed by non-human primates; the second one will focus on the differences between human beings and chimpanzees. Before we go on, let’s admit that we cannot assume a progression from monkey to chimp to human. Present-day chimpanzees and monkeys need not portray an accurate picture of our common ancestry with these species. In this situation, the study of chimpanzees and monkeys will be a possible road to take (or, at least, not completely illegitimate) if and only if we always remain aware of their limitations. I consider the probable ability of chimpanzees to be an achievement beyond the mere tracking of someone else’s gaze and also beyond the conditioned learning about the relationship of the direction of another individual’s torso to its later behaviour. I suggest that this achievement derives from the power to perceive the matching between one’s own body and another individual’s body. The chimpanzee would reckon what he himself would see from the location, posture and orientation shown by another individual. However, we humans can ascribe visual perceptions and mental states to individuals whose circumstances we, by definition, will never experience, namely, to individuals who are looking at us or interacting with us. 
If we assume that the chimpanzee is not capable of ascribing these perceptions that are radically and intrinsically different from his own perceptions, then we will be able to explain his inability to point (and, likewise, his inability for cooperative, ‘four-hand’ actions) in terms of cognitive incapacity, not only in terms of lack of cooperative motivation. Certainly, I fully agree that among chimpanzees cooperation never became sufficiently adaptive for evolution to give rise to pointing gestures. However, I also think that the increase in cooperation was only the indirect cause of the evolutionary emergence of pointing gestures. The direct cause for this emergence is to be found in a cognitive ability beyond the reach of chimpanzees. The explanation of the capability which chimpanzees probably possess to ascribe visual perceptions will occupy the entire first Section. At this moment, a possible sceptical and, at the same time, pragmatic objection could be: is the effort really worth it? If, as I have said, my main concern here is to prove that chimpanzees’ inability to point
is directly and immediately caused by a cognitive incapacity, why do I try to show that chimpanzees are able to perform those ascriptions? ‘Why look for problems?’ might be the comment. But for me this is not an unnecessary problem. I think that chimpanzees really possess that ability. In addition, and at a deeper level, I find that the opportunity to compare pointing gestures with an ability so closely related to pointing (which is the case of the ability to ascribe one’s own visual perceptions to conspecifics) is extremely interesting. Needless to say, all other things being equal, the closer to one another the terms of this comparison are, the more accurate the description of the human peculiarity will be.

chapter 1

Monkeys’ mirror neurons

A quick glance at the title of this chapter may have provoked a reaction within the reader: why am I devoting all this space to monkeys’ mirror neurons? This ability of monkeys is, or at least that is my opinion, quite far from what is exclusively human. Why focus on them, then? Certainly, what interests me, I insist, is the constitution of exclusively human abilities. However, there is a reason why it is pertinent to include macaques’ mirror neurons in this first section. Studying them will allow me, above all, to strengthen a specific interpretation of chimpanzees’ abilities. This interpretation places non-human primates completely outside genuine simulation (or, in other words, latent imitation).

1.1

Mirror neurons in macaques, a significant discovery and a controversial interpretation

At the beginning of the 90s, a team of researchers from the University of Parma discovered neurons in the cortex of macaques which were activated both when the animal (with its eyes closed or open, it made no difference) grasped an object with its hand and when it saw hands of others grasping an object (Gallese et al. [1996]; Rizzolatti et al. [1996]). They called these ‘mirror neurons’. Mirror neurons quickly attracted the attention of philosophers and psychologists. At first, some authors related them to imitative learning, or, in other words, to the imitation of complex and new motor patterns. (In Stamenov & Gallese, eds. [2002], we see works supporting these opinions as well as others rejecting them.) Although it is most likely true – I am convinced it is – that mirror neurons represent a landmark in the evolutionary line which will ultimately lead to human imitative abilities, this is no justification for attributing this function to the neurons of animals which are not at all capable of motor learning. If these neurons have an adaptive function, such a function would have to be adaptive for macaques themselves. As a result, those voices which immediately related them to imitative learning have tended to fade away. But when we remember these attempts we can derive a more general lesson. This lesson, which would perhaps still be useful today, is that, for now, it would be wise to clearly separate mirroring in monkeys from mirroring in humans. There would thus be no risk of the controversial question of their adaptive function in macaques being hidden behind solutions or suggestions that are only admissible for the study of mirroring in humans.



There is a tendency, nowadays, to associate these neurons with the subtrend in the ‘theory of mind’ which has been labelled simulationism, or with the interpretation and prediction of behaviour of others, or even with empathy. My own view differs from these as well. I am much more in accord with authors who stress above all the role played by self-perceptible movements in mirroring. But first I want to critically analyse the widespread view of the social utility of mirroring.

1.2

On supposed ‘social’ utility: Is the role of macaques’ mirror neurons to understand and predict behaviour of conspecifics?

Contrary to what today is perhaps still the most widespread opinion, I believe that mirror neurons would not originally have the role of understanding or predicting the conspecific’s behaviour. Clearly, if a macaque observes that a conspecific is grasping a fruit, the observing macaque is informed that that fruit is now less available than before. Equally clear is that this information will, on occasion, be useful to it. My point is, however, that in order to acquire that useful information, visual perception of movement and goal would be sufficient. In other words, the acquisition of this information would not require any association whatsoever with one’s own grasping. In addition, if the role played by mirror neurons were to understand or predict behaviour of others, then a piece of data which has been firmly established since mirror neurons were first discovered would remain unexplained. I am referring to the fact that mirror neurons – or, at least, the mirror neurons which have been discovered up to now – are never activated when the hand that is seen is not grasping an object. No matter how much movement this hand makes, the observer’s mirror neurons will not fire. Were we to accept that the information that can be associated with mirror neurons is the meaning of the behaviour of others, this well-established piece of data would, I repeat, be inexplicable. A hand which moves forward to hit, or (in chimpanzees) to beg from, whoever is observing it would undoubtedly be performing a behaviour which is relevant for the observer. However, the observer’s mirror neurons are still not activated in this case, no matter how interesting the perception of the behavioural meaning of these hand movements may be to the observer. We have, thus, followed two routes with the aim of raising doubts about whether the role of mirror neurons is to interpret or predict behaviour of conspecifics. 
On one hand, we have seen that the firing of mirror neurons – or, at least, the mirror neurons which have been discovered up to now – does not occur when faced with some behaviours of others which are indisputably relevant to the observer. On the other hand, we have argued that, when the firing of mirror neurons actually does accompany some behaviour of others, it cannot be said that this firing is useful in interpreting the behaviour nor in predicting the following step. On the basis of these points, we have some support to reject this as the function of mirror neurons.

In summary, it does not appear to be at all clear that, at their origin, mirror neurons have the adaptive function of understanding or predicting the behaviour of others. Fortunately, there is no need to insist on this pars destruens of my proposal. Hickok (2009) has sufficiently insisted on it.1 In addition, we should bear in mind that the hypothesis of the social origin of primate intelligence, which, beginning with Jolly (1966) and Humphrey (1976), had completely dominated until now, is beginning to be viewed more reticently. If wolves and hyenas also live in packs, why would the simple fact that primates live in groups have special consequences for them? This objection, about which many of us have given thought, has been set out authoritatively by Holekamp (2007). Do not get me wrong. I am not denying the crucial importance of social living for the evolutionary emergence of the human basic ability. But for that emergence to take place, other factors must have concurred. The following chapters deal specifically with this issue.

1.3

Mirror neurons, a secondary effect of self-perceptible movements

As a result, the view, for the moment a minority one, which sees the origin of mirror neurons as merely a secondary effect of self-perceptible movements, seems to me to be preferable – preferable, at least, to the famous ‘social utility’. Only later would there have been an exaptation* of this secondary effect for a useful purpose. So far as I am aware, Oztop & Arbib (2002) were the first to class the later playing of ‘social’ functions by mirror neurons as ‘exaptation’. As Hurley (2005) said, monkeys’ mirror neurons would originally be only a ‘secondary effect’ in evolution. The most developed hypothesis within this line is by Keysers & Perrett (2004).

1.3.1

Self-visible hands: Connecting Keysers & Perrett with Piaget

In Keysers & Perrett’s proposal, the self-visibility of the hand is a decisive factor. Motor orders to the hand are normally simultaneous with the viewing of hand movements. The direction of this association, from the motor to the visual, may be inverted later, and this leads to mirror neurons’ characteristic function. Keysers and Perrett invoke a Hebbian type of learning (“Cells that fire together, wire together”). By turning to the visibility of one’s own hand, their hypothesis succeeds in explaining why the function of mirror neurons relates in the first instance to hand movements. In the same way, Keysers and Perrett might succeed in justifying, at least in some degree, the fact that this function is confined to grasping movements. Given that sight is crucial if grasping is to be controlled, the association between motor orders and seeing the hand would take place, first and foremost, at the moment the grasping occurs. Self-visibility is indeed an exceptional feature. In this sense, the primates’ hand is a clear evolutionary novelty. While any primate is capable of clearly perceiving the similarities between its own hand and that of another individual, nothing like this occurs in other animals. A cat can see only a small part of its own tail or feet, from which it is impossible for it to see any similarity with the corresponding parts of another cat. Even an elephant’s trunk differs greatly from primates’ hands. The base of an elephant’s trunk is so close to its eyes that, whenever it sees its trunk, the matching with another elephant’s trunk will only be partial and imperfect. In the case of primates’ hands, moreover, it is not only that one’s own hands can be seen from the appropriate perspective, but also and alongside this, that the conspicuous, characteristic shape of the group of fingers is ideal for enabling the establishment of a correspondence between one’s own hand and the hands of others. Apart from these data, we must also examine the development of children. The starting point for understanding the matching between one’s own body and the body of others is self-perceptible movements: Piaget (1945) had already said this when he addressed the first stages of the development of motor imitation in children. (Of course, this should not be taken to mean that I am identifying macaques’ mirror neurons with Piaget’s third stage of imitation in children. I have already alluded above to just how different we should consider these to be. In children, the third stage has the role of preparing the complete development of the motor imitation ability. In contrast, we cannot attribute this function to any mechanism in macaques.) The Piagetian ‘starting point’ can also be found in Heyes. See, for example, Catmur, Walsh & Heyes (2007, p. 1529): “In hand movements, which yield similar sensory inputs when observed and executed, watching one’s own actions gives us perfectly correlated sensorimotor experience of those actions”. As we can see, Keysers & Perrett’s proposal on mirroring is in some way connected to the Piagetian model of the development of motor imitation. However, beyond this clear-cut and comfortable image, there are still problems to be solved. The issue here is how we are to understand the mirror neurons associated with the mouth.

1. Although they study human subjects (and although, as I have said, it is wise to clearly separate mirroring in monkeys from mirroring in humans), see also Brass et al. (2007, p. 2117): “We show that brain areas that are part of a network involved in inferential interpretive processes of rationalization and mentalization but that lack mirror properties are more active when the action occurs in an implausible context. However, no differential activation was found in the mirror network.”

1.3.2 Neonatal imitation and mirror neurons associated with the mouth: An open question

Keysers & Perrett (2004, p. 505) also discuss the emergence of mirror neurons associated with the mouth: “The intense facial imitation occurring between parent and child could also be essential for Hebbian training of mirror neurons responding to the sight of mouth movements. Despite the fact that we do not see our own mouth movements, the sight of the parent’s mouth movements will become trained in a Hebbian way with the infant’s matching motor program”. This suggestion is certainly a well chosen strategy.


We find it in other authors, as Heyes (2001, p. 258): “Co-activation of sensory and motor representations of the same perceptually transparent movement occurs whenever the individual observes, unaided, their own motor output, and, in the case of perceptually opaque movements, through experience of being imitated, and of socially synchronous movement in response to a common stimulus” (my emphasis). In addition, there have been subsequent observations which seem to support that. “Infant macaques, similar to human infants, are able to respond to their mothers’ lipsmacking by lipsmacking back at them” (Ferrari et al. [2009, p. 1771]). Ferrari et al. (2009), having mentioned (p. 1768) Trevarthen and human primary intersubjectivity, say (p. 1770) that their “data provide evidence about emotional communication between mother and infant in macaques. (...) Mother-infant pairs communicate intersubjectively via complex forms of emotional exchanges including exaggerated lipsmacking, sustained mutual gaze, mouth-mouth contacts, and neonatal imitation. However, this form of communication disappears within the infant’s first month of life.” (We should, incidentally, investigate if anything similar happens to any non-primate mammal.2) However, when Keysers & Perrett explained the emergence of mirror neurons associated with the mouth in this way they were introducing an internal division into their proposal. Let me explain. The visuomotor connection seems much more simultaneous and necessary when it is the motor and visual dimensions of one’s own grasping that are associated. By contrast, in the mouth-related mutual imitation, the mother’s response can often be delayed for a short time or be absent. These occasional delays would certainly not prevent the eventual consolidation of the association. However, remember the remark made by Ferrari et al. (2009, p. 1770): “This form of communication disappears within the infant’s first month of life”. 
Will mouth-related learning be, in due course, as strong as the hand-related Hebbian learning? (Of course, it is well known that, if you wish to make a desired behavior last, it is best to switch to an intermittent schedule of reinforcement: Skinner [1953, for example]. However, as I will repeat in the next paragraphs, I am interested in the understanding of the correspondence between mouths, not in behaviour.) This consideration raises the question of whether the concept of mirroring could in some way be unified. In reference to mouth-related mirroring, Keysers & Perrett make two statements which, in my view, we should perhaps not put in the same bag. These authors consider, firstly, that the intense facial imitation occurring between mother and infant occurs prior to any true mirroring and, secondly, that this neonatal imitation is sufficient to cause true mouth-mirroring. I think that the first of these two statements is more reliable than the other. Before further inquiry into the mouth-related mirror neurons of macaques, we should perhaps have a look at other data.

2. See Bard (2009, pp. R941, R942): Certainly, this author in the first instance speaks in terms of “presence of intersubjectivity in extant great apes and Old World monkeys, but not in New World monkeys”, but later makes more specific and detailed claims: “Perhaps only some components will be found in some primates. For example, during neurobehavioral testing with capuchin infants (a New World monkey), a brief bout of responsive calling with turn-taking occurred, but there was no mutual gaze.”

The internal division that I have referred to in Keysers and Perrett’s proposal echoes a debate which has been going on since the early 80s, when the discovery that newborn babies are able to imitate facial gestures upset the Piagetian model of motor imitation development. According to Piaget, the correspondence between mouths would emerge from two different routes, both related, nevertheless, to self-perceptible movements. On the one hand, the sounds of actions such as licking, which can be self-audible, i.e. audible to the agent performing them, not only to an observer, would be extremely important. On the other hand, the matching between the mouths could also be derived from the matching between self-visible hands: one’s own hand carries the food to a hole that is felt but not seen, and the hand of others carries the food to a hole that is seen but not felt. In this respect, Piaget speaks of intelligent mistakes which reveal how hard it is for children to acquire the matching of the mouths: numerous observations confirm that the child sometimes tries to coordinate someone else’s mouth opening and closing with the opening and closing of his or her own eyes, or conversely. Let us note that there is an important difference between Heyes, whom we introduced above (in 1.3.1) as close to Piaget, and Piaget himself. Ray & Heyes (2011, p. 101) say: “Imitation of mouth opening and lip smacking when accompanied by sounds (...) is a perceptually opaque action.” I do not agree with this classification. Mouth opening and lip smacking when accompanied by sounds are self-perceptible (more concretely, self-audible) movements, or, in other words, perceptually transparent actions.
But the whole edifice of the Piagetian model was shaken when Meltzoff & Moore (1983) got a very high percentage of new-borns, only a few hours old, to imitate movements such as opening their mouths or sticking out their tongues. The question then arises: in the face of these data, can we still maintain that the child’s understanding of matching between mouths takes place only later and relies on either of the two routes (self-audible sounds and relation to self-visible hands) or both, as suggested by Piaget? (Again I stress that, in my view, the child’s perception of the matching between mouths is the key issue. I am focusing on the ability to see somebody else’s mouth as a counterpart to one’s own mouth, not on behaviour. By contrast, Ray & Heyes [2011] say: “Infants do not need to detect or recognize that they are being imitated in order for this experience – which appears to be plentifully available in typical development – to support the learning of matching vertical associations.”3 Thus, I ask the reader to keep in mind that my use of ‘correspondence’ or ‘matching’ is different from that of these authors.)

3. In order to play down the importance of understanding of homology as the base for imitation, Heyes et al. (2005) showed that a brief period of incompatible sensorimotor training – in which participants responded to hand opening stimuli by closing their hands, and to hand closing stimuli by opening their hands – abolished automatic imitation, e.g. the involuntary tendency to make an open hand response faster to an opening than a closing hand stimulus. But I reply that the strength of verbal instructions in adult humans (or of conditioning in animals) is beyond all doubt. Therefore, in my view, those experiments do not refute at all the idea that such understanding is actively present in primates (although in a different way and degree in macaques, chimpanzees or humans).

Since the experiments of Meltzoff & Moore, a lot of research effort has been concentrated on this issue. After Anisfeld (1991; 1996, p. 60: “only tongue protrusion modeling is matched by neonates”), two attractive and mutually compatible explanations have arisen. Jones (1996) shows not only that tongue protrusion occurs to some extent in response to any interesting visual display, but also that the ‘virtual’ exploration with the tongue stops as soon as infants begin to reach with their hands: “Infants produced tongue protrusions in response to objects within reach before but not after reaching developed. Our results suggest that infants’ tongue protrusions in response to a tongue-protruding adult reflect very early attempts at oral exploration of interesting objects” (p. 1970). Nagy & Molnar (2004) focus on the very frequent ‘delayed imitations’ observed in neonates, and they suggest that such ‘delayed imitations’ were actually serving as provocations for interaction. These ‘delayed imitations’ were accompanied by a totally different heart rate pattern than the ‘imitations’: the heart decelerated, indicating greater preparatory attentional focus on the acts of the other, in contrast to the clear acceleration characterising ‘imitations’. Obviously, Nagy & Molnar are close to what Ferrari et al. (2006, p. 302) suggest about macaques (“Macaques’ neonatal imitation may serve to tune infants’ affiliative responses to the social world”). However, in spite of all these research efforts, the question has still not been fully answered.

Neonatal imitation is not the only problematic issue, however. Other facts could easily raise a similar question. Let us consider, for example, the suggestions made by authors like Hurford (2004, p. 305) or Hurley (2005), who (although they also stress the contrast between activation of mirror neurons and open behaviour) link mirroring with the almost certainly innate mechanism responsible for birds banding together in flocks, or fishes in shoals. Here there is no self-perceptible movement at all. Is this behaviour, nevertheless, akin to mirroring? Or is it something radically different?

Against this background, we should ask ourselves a more radical question. Is monkeys’ mirroring caused by self-perceptible movements? We need to ask with Zentall (2003, p. 94): Are mirror neurons’ visual-motor (or audio-motor: infra, 1.5) connections “prewired neural pathways or, on the contrary, do they have to be trained?” In this regard, the following experiment would be helpful. From the moment of birth, a macaque is prevented from seeing its own hands. While this restriction remains fully in place, it is given the opportunity to see the hands of a conspecific. Would the mirror neurons be activated in this case?4 Admittedly, there is no clear-cut evidence on this issue so far.

4. Certainly some may say: It is well established that mirror neurons are the result of learning in that, e.g., there is mirroring for paper tearing and tool use (see these data in Kohler et al. [2002]; Umiltà et al. [2008]). However, this answer does not persuade me. The fact that the peculiarities of the paper or the tool must obviously be learned does not necessarily imply that the basic connections cannot be innate. Consequently I think that the suggested experiment would be helpful.




In short, despite much research, we do not yet have answers for our questions or, more precisely, we still cannot decide definitely which alternative is the valid one. According to one of these alternative responses, mirroring is close to innate social coordinations and not necessarily linked to self-perceptible movements. In other words, according to this first alternative response (I am going to refer to it in this way, regardless of the chronological emergence of the theories), unification takes place because self-perceptible movements are no longer viewed as the core and paradigm of mirroring. By contrast, the second alternative response suggests, on the one hand, that genuine mirroring begins with the learning prompted by self-perceptibility and, on the other, that neonatal imitation does not connect with the perception of the matching between one’s own mouth and someone else’s mouth until a later stage in ontogenetic development. More precisely, neonatal imitation would only become genuine mouth-related mirroring after one or both of the two routes suggested by Piaget were fully covered. In this second response, the key element of mirroring rests, as in the first part of Keysers and Perrett’s proposal, on self-perceptible movements and on the need for consistent learning. This also implies a unification of mirroring, but this time unification takes place because neonatal imitation is neither interpreted as genuine mirroring nor as being sufficient to cause true mirroring. (Keysers & Perrett [2004] cannot be ascribed to either of those choices. As I said before, I think that there is an inner division in these authors’ proposal.)

I insist that, in my view, so far there are not sufficient data to make a decision. Both rival hypotheses can be defended. Obviously, this has a restrictive consequence: neither hypothesis can be taken as evidence. But it also has a consequence in the opposite direction. As part of a hypothetico-deductive approach a researcher can take either of the two as his or her starting point. Thus, I will opt for the hypothesis which relies on self-perceptible movements. More precisely, I will be (relatively) close to the first part of Keysers and Perrett’s proposal. Consequently, my explanation of mirroring, relying heavily on the importance of self-visible hands, could be described as primatocentric. (But self-audible learned songs might be a different source of mirroring: see infra, 1.6.)

Let us sum up. I have rejected the adaptive utility that Rizzolatti, Gallese and the other members of the Parma group ascribed to macaques’ mirror neurons, and, by contrast, I agree with Keysers and Perrett in ascribing an important role to the self-perceptibility of the hand. However, the question which most interests me, as I have said, is not this, but the question of the mechanism which underlies the activation of mirror neurons when faced with the actions of others. Until now, the peculiar function of monkeys’ mirror neurons has been understood as off-line or latent motor imitation (or, in other words, as simulation). However, I believe there are reasons to doubt this.

1.4

Simulation or expectation? The crucial question about the abilities of non-human primates

The purpose that will guide us in the rest of this chapter has already been clearly outlined. I want to look for arguments against the possibility that there is simulation involved in


macaques’ mirror neurons.5 Although I accept that mirroring involves some kind of connection with someone else’s inner state, I do not consider that genuine simulation or, in other words, latent imitation is involved. As can be seen, I am about to embark on a journey in a direction opposite (and complementary) to the one imposed by my choice at the end of Subsection 1.3.2. Then, I separated mirroring from other types of behaviour unrelated to the hand, behaviour which is earlier in the course of ontogeny. By contrast, now I am going to describe mirroring as being different from true simulation, which probably is peculiar to humans. In Section Two we will see why this difference from simulation is an important issue. However, the immediate task can be defined more specifically. We shall set the foundations of an alternative explanation piece by piece.

1.4.1 Animal behaviour and expectation

To begin with, we must stress how the expectation of results is a key concept for animal behaviour. The brain emerges primarily to provide the animal with the best choice of behaviour at every moment. Once we accept this, the expectation of results is foregrounded. It should be remembered that the brain has outlined the effects it wishes to achieve not only before it undertakes any behaviour, but also before it selects the appropriate movements and means at every moment along the way: it is due to a precise expectation of results that animals can do without the inefficient preset motor plan resource (cf. e.g. Thelen & Smith [1993]).6 This expectation of effects is at once the motor and the guide for behaviour; in other words, it not only generates but also selects behaviour. The expectation wishing to be satisfied marks the beginning of any behaviour, and the satisfaction of such expectation marks its end.7

5. A possible objection could be that, since monkeys do not imitate (to any great extent), these arguments seem unnecessary. I disagree, however. We must bear in mind that any imitation of simple movements (something completely different from motor learning, i.e., from imitation of complex and new motor patterns) is of no use whatsoever, and I will comment on this later in 2.2. Consequently, the fact that no external imitation has ever been observed in macaques does not necessarily dismiss the interpretation of mirroring as internal simulation. A defender of such an interpretation could argue that no researcher can ever make macaques understand the command to perform a completely useless external imitation. (See also Paukner et al. [2009], who suggest that capuchin monkeys display affiliation toward humans who imitate them.) Consequently, if we do not accept, and I certainly do not accept, the interpretation of mirror neurons in terms of simulation (i.e. latent imitation), then we must provide another reason. More precisely, we must provide an alternative interpretation for mirroring.

6. An extreme case in this sense (i.e. an extreme argument against innate, preset motor plans) can be found in some ‘freaks of nature’ (Blumberg [2009]). Of course, we humans perform preset motor plans when we imitate new and complex motor patterns. But these complex plans are not the issue here.

7. An extended concept of homeostasis could involve all animal goal-driven behaviours (Richter [1943], cited e.g. in Bechtel [2009, p. 165–166]).




Expectation – of effects not sought, in this case – is also invoked to explain so-called ‘attenuation’. The concept of attenuation was coined in relation to the two types of movement possible in retinal images: movement caused by the environment, and movement which depends on the movements of one’s own body. As has been known for a long time (Von Holst [1954] or Sperry [1950]) the brain only takes note of the first; the second is thus attenuated.8 Attenuation has been applied most recently to the tickle sensation, or, more specifically, to explaining why it is not possible to tickle oneself: Blakemore et al. (1998, 1999). In every case, the sensations caused by movements of one’s own body are attenuated because they respond to a previous expectation in the brain, to a mechanism prior to the real movement (see also Bompas & O’Regan [2006a], [2006b]). As regards all types of attenuation, Wolpert’s group has undertaken in recent years to disprove the hypothesis of a postdictive mechanism, i.e. a mechanism subsequent to movement: Voss et al. (2006), Bays et al. (2006). As a result, we can conclude that expectation of the movement’s effects, aside from its principal function as a guide for behaviour, is also capable of explaining the phenomenon of attenuation.

“Test – Operate – Test – Exit” was a motto of the first cognitive revolution (Miller et al. [1960]). I don’t like what I have; I act; I like what I get; end. Whichever terminology we may adopt, this sequence of steps describes goal-driven animal behaviour. Naturally, however, we must place the profile of the state we are seeking in front of these, as the first step. If this profile did not exist in the animal beforehand, there could be no behaviour. (For a brief review of this matter, see Carver [2005].)
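Purely as an illustration, and not as anything taken from Miller et al., the cycle just described can be sketched as a loop in which the sought profile is fixed before any action occurs. The names `tote`, `operate` and `tolerance`, and the reduction of a ‘state’ to a single number, are assumptions of this sketch only.

```python
from typing import Callable, Tuple


def tote(state: float,
         target: float,
         operate: Callable[[float, float], float],
         tolerance: float = 0.01,
         max_steps: int = 1000) -> Tuple[float, int]:
    """Hypothetical Test-Operate-Test-Exit cycle.

    `target` stands for the profile of the sought state, which must
    exist before any behaviour starts. All names are illustrative.
    """
    steps = 0
    # Test: is the current state still unlike the expected profile?
    while abs(target - state) > tolerance and steps < max_steps:
        # Operate: act so as to reduce the mismatch.
        state = operate(state, target)
        steps += 1
    # Exit: the expectation is satisfied (or the step budget ran out).
    return state, steps


def halve_gap(state: float, target: float) -> float:
    """Toy action: each act closes half of the remaining gap."""
    return state + 0.5 * (target - state)


final_state, n_steps = tote(state=0.0, target=1.0, operate=halve_gap)
```

In this toy reading, the same expectation both opens the behaviour (the first failed test) and closes it (the first satisfied test); without a `target` there would be nothing to test against, and so no behaviour at all.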
Expectations which seek satisfaction (or, in other words, empty profiles which urge the organism to fill them) are absolutely necessary elements both for opening and for closing – or, rather, for satisfactorily closing – any behaviour. Lastly – that is, at the beginning, and, therefore, also at the end –, would come innate consummatory patterns, i.e., the “teaching mechanisms” about which Lorenz (1966) spoke. But with learning and conditioning, many other expectations, no longer innate, take root in the animal (see, for example, Bar [2007]); these learned expectations cannot be the end goals for animal behaviour, only subgoals.

8. Expectation and attention both facilitate the interpretation of perceptive contents, but attention strengthens sensations while expectation at times attenuates them. Cf. Summerfield & Egner (2009, p. 405): “Given that attention and expectation have similar facilitatory effect on visual object recognition, one might anticipate that expected (relative to unexpected) stimuli would also be associated with enhanced sensory responses. However – strikingly – the opposite is in fact typically the case: expected stimuli tend to elicit reduced visual responses, relative to their unexpected counterparts, and an extensive literature has documented the corresponding phenomenon in the auditory domain.”

Moving beyond this general framework, and on finally to what we are interested in, let us look at the step posed, within any behaviour, by a simple movement. Here, too, the starting point would be an expectation outlining the effects sought. This expectation, which, like all expectations, would only be deactivated when it was satisfied, would be the expectation of results relating to location and posture. Cf. the ‘postural coding’ that is invoked by Graziano et al. (2002, p. 355). Motor orders will be selected on the basis of the results of this type that one wishes to obtain. What is useful is not movement in itself, but the results of movement. These expectations of the results of a simple movement must undoubtedly be considered as subgoals, or rather, sub-sub-sub...-goals. (Certainly, modern theories of motor control incorporate the forward model, which would serve to circumvent unavoidable neural delays associated with on-line feedback control. Cf. Jordan & Rumelhart [1992] or Grush [2004]. Certainly it is indisputable that this function has to be performed. However, we do not know how it is performed in animals. Engineers specialising in robotics make use of mocked retroafferences. “Mock input is generated from the operation of an internal emulator”: Grush [2004, p. 390].9 But it is very probable that in animals there is a simpler and more peripheral mechanism to calculate retroafferences. See the criticisms of Grush that have come from biology: Latash & Feldman [2004] and Webb [2004]. See also, and more importantly, Dimitriou & Edin [2010], who show that human muscle spindles act as forward sensory models.10 But let’s go back to our thread. Although we accept some of those refinements to circumvent neural delays, we can still maintain that expectations are crucial for the organisation of animal movement.)

It is now the moment, however, to ask what all this has to do with mirror neurons, or, more precisely, with the mirroring role of these neurons. For that purpose I will concentrate on three different elements. Obviously, two of these elements are the so-called ‘motor’ role and the mirroring role of mirror neurons. But I will also deal with another kind of neuron, the so-called canonical neurons, which have been found near the mirror neurons but whose function is quite different.
Clearly, the activation of mirror neurons in their so-called ‘motor’, and not ‘mirroring’, role might correspond to the expectation of the postural result of the grasping movement. We might likewise attribute the activation of ‘canonical neurons’ to the expectation of results. These neurons, which have been discovered in an area of the brain close to where mirror neurons have been found, “fire both when the monkey performs a goal-directed action on an object, and when the monkey sees the object” (Rizzolatti, Fogassi & Gallese [2001]), and thus appear (cf. Ellis & Tucker [2000]) closely related to the psychological concept of affordance of an object. In both these types of neuronal activation (mirror neurons in their ‘motor’ role and canonical neurons), the expectation of the results of a movement is, I stress, one highly possible interpretation. This is the case even when the monkey does not really perform any action on the object that has activated its canonical neurons; in other words, even when the goal defined by these results does not in the end triumph over the other goals that at that moment were fighting to gain control of behaviour. However, what might we say about mirror neurons in their mirroring role?

9. Also the ‘comparator model’ (Frith et al. [2000]), which has been recently defended by Glenn Carruthers (in press), involves the predicted sensory consequences based on the so-called forward model.

10. By contrast, I think that the model that controls the observational phase in the imitative learning of complex and new motor patterns would be a much more demanding one, as we will see in Chapter 8.

1.4.2 The new type of expectation which appeared alongside mirroring: Describing the difference between my hypothesis and that of Keysers & Perrett (2004)

We come now to the suggestion we have been preparing: mirror neurons would also be related to the postural expectation in their other role, i.e. in their mirroring role. The activation of mirror neurons in response to a seen hand would correspond to the expectation of the particular postural state which the individual already knows corresponds to the grasping which it is seeing. However, this expectation that would occur in the observer entails an enormous peculiarity. Let us examine this carefully. In animals, the expectation of results would occur before any voluntary movement (in other words, for any movement forming part of a behaviour). This would be entirely generalised. Alongside this, however, there is a peculiar characteristic exclusive to self-perceptible movements. In these movements, information about their proper fulfilment will not come only from the postural sensation obtained or from the feel of the object grasped (where this is the case), but also visually. However, what if only the visual information arrives? The suggestion is that the visual mirroring mechanism would be constituted when, after a hand is viewed grasping an object, the expectation of the postural state associated with that grasping is activated. Let us compare this to the ‘Hebbian learning’ by Keysers & Perrett (2004), mentioned above. The two hypotheses are clearly similar, but there is also a marked difference between them. Instead of latent motor imitation, we are postulating a postural expectation in the observer.11

11. Unfortunately, and in spite of our possible first impression, this question (latent motor imitation or postural expectation?) would not be answered even if we copy Woodward’s habituation studies. Woodward (1998) habituated 6-month-old infants to an event in which a human actor grasped a toy. At test, infants demonstrated a stronger novelty response to events in which the actor’s goal had changed while maintaining the physical properties of the reach, as compared to events in which the physical properties of the reach had changed while maintaining the same goal. See also Hamilton & Grafton (2006) who, working with adult human subjects and observing some neurons in anterior intraparietal sulcus, got similar results. Let us return to macaques’ mirror neurons. My point is that, even if it is eventually shown that macaques (and their mirror neurons) are similar to humans regarding this issue, these data would not be crucial arguments in favour of my hypothesis that postural expectations, not grasping movements, are the object of mirror neurons. Let us note that inner expectations are very different from external goals.

How could this postural expectation be activated in the observer? In all the movements performed by animals, the expectation of postural results is activated before such results are reached: etymologically speaking, this is what is suggested by the term expectation. On the other hand, in mirror neurons in their mirroring function, according to our suggestion, the expectation of results would be activated after the observed grasping has been performed. We need, therefore, to propose a time-relative inversion. Keysers & Perrett, in contrast, have no need for this. For them, as for many authors, the firing of mirror neurons in response to a merely observed grasping means the latent, inhibited, activation of a movement, not an expectation of postural results. And, therefore, although Keysers & Perrett speak, of course, of an inversion of direction (the motor to visual direction would become, after Hebbian learning, visual to motor), they do not need to postulate any time-relative inversion. The performance of the movement and its visual perception occur practically simultaneously. In contrast, with the expectation of results things are very different. In my suggestion, the expectation of postural results would be located differently for each of the two roles played by mirror neurons. In the case of the expectation of grasping which the subject will immediately perform, the expectation is prior to movement: this is what occurs in mirror neurons’ ‘motor’ role. In contrast, when the grasping is merely observed, the postural expectation is a posteriori to movement: this is the interpretation of mirroring that I am suggesting. Observe that the link or the shared element in these two different situations in which there is firing of the mirror neurons is by no means less clear than in the other interpretation. It may even be the opposite. Let us compare both interpretations.
If we interpret that the firing of mirror neurons corresponds to motor commands, we must accept that that motor command would in some cases produce latent motor activation (the mirroring role of mirror neurons) while in other cases it would produce unfolded motor activation (in the so-called motor role of mirror neurons). If, by contrast, we interpret the firing of mirror neurons as corresponding to expectations of postural results of a movement, it is true that we must inevitably distinguish the cases in which the expectations are going to be fulfilled by the relevant somatosensory sensations, i.e. in the so-called motor role of mirror neurons, from the cases in which they are not going to be fulfilled, i.e. in the typical mirroring role. Nevertheless, the expectation as such would remain the same. In short, the expectation of postural results is a good candidate for that element – that factor shared by the two different situations of the firing of mirror neurons – which any explanation of the mirror neurons’ function must discover. The objection that could be made of my hypothesis is, of course, that it needs to postulate a very peculiar type of expectation and perhaps a strong new development in evolution. Throughout animal evolution, the expectation of results would always have come before movement pursuing those results. But with mirror neurons a revolutionary, a posteriori, expectation would have appeared. These postulated expectations are so special that the criteria of parsimony or simplicity seem to provide a ruling contrary to my hypothesis. My reply would be that, should there turn out to be other reasons in favour of this interpretation, we would then be fully within our rights to postulate this strong new




development in evolution. Everything therefore rests on other reasons being found which might support the hypothesis. What might these reasons be? The main argument I can offer lies in stressing the extreme complexity and high demands which a latent motor imitation (or, more generally, a simulation) would involve. This complexity is hinted at in some areas of the literature, but we have not yet come to the heart of it, I believe. Let us consider this carefully. Focusing on mirror neurons, Gallese (2003), Gallese et al. (2004) and also Hurley (2005) have insisted that ‘we’ existed prior to ‘I’. This immediately raises the question of how the movement of others is then differentiated from one’s own movement. By contrast, if we accept the interpretation of mirror neurons in terms of the expectation of postural results, the problem of how to differentiate between another’s and one’s own states disappears. If (as in the typical situation of mirroring) the grasping is merely observed, i.e. if it is somebody else’s grasping, then the expectation of postural sensations will not be satisfied. The problem of this differentiation appears in other authors. Thus, when, addressing simulation (or off-line imitation), Brass & Heyes (2005, p. 493) asked “why do we not imitate all the time?”, they replied that the key must be found in the distinction between the subject and the observed individual (see the ‘Who?’ system proposed by Georgieff & Jeannerod [1998]; see also Decety & Chaminade [2003]). I believe that this latter type of response – the distinction between subject and observed individual – is close to being correct when we are interested in this question only in human beings. But we should return to macaques and see what would be implied by interpreting their mirror neurons in terms of latent imitation. In my opinion, granting motor simulation to macaques’ brains is probably to underestimate the difficulty and complexity involved in such simulation.
We must not forget that the simulating organism has to keep itself informed about its own kinaesthetic and postural states. I will formulate this idea through a critique of Glenberg (1997). This author, commenting on how past episodes are evoked, says that the subject’s awareness of its real situation is suppressed during these evocations. I believe this to be an incorrect way of explaining what takes place. Such suppression occurs only during sleep, that is, while the subject’s motor activity is switched off and the subject is in a relatively safe place. Under different circumstances to these, suppressing the awareness of real circumstances might be extremely dangerous. What is the consequence of all this for our discussion? The consequence, in my view, is that genuine motor simulation must involve a double line of information. On one hand, a line for the state of the simulator’s body; on the other, a line for the state of the observed body. This duality, I suggest, is so extremely demanding that only the human brain can sustain it. (But, when one animal fights another, must it not exhibit something of this duality?: this could be a plausible objection. However, I think that the prediction of an opponent’s behaviour needs no association whatsoever with one’s own similar movements. In other words, as I said in 1.2, the visual (‘from the outside’) perception of


movement and goal would be sufficient to make such predictions. Thus, if we admit that these predictions are independent from mirroring, they will be a fortiori independent from the duality we are talking about.) In contrast, the expectation resource in animals does not require this duality at all. It is true that normal expectations have to do with a later moment. But this does not at all mean that this later moment needs to be addressed as an environment additional to the current real environment.12 Quite the opposite, expectations are built-in constructions (more specifically, empty but, nevertheless, well-defined profiles) which often exert their influence precisely by guiding attention toward the current real environment. In short, we have two contrary interpretations of mirroring. Whereas I explain it through a new type of expectation (i.e. a posteriori expectations), many proposals, in contrast, state that mirroring involves some form of latent motor imitation. Different authors classify mirroring in highly varied ways: direct resonance and ‘we’ perception versus (Heyes, or Keysers & Perrett) a (relatively) more Piagetian process; stressing competition versus stressing empathy; understanding of behaviour of conspecifics versus (Csibra [2007]) prediction of the immediate future behaviour of conspecifics. Nevertheless, none of these controversies is of any interest to me now. It is the interpretation of mirror neurons as involving latent motor imitation (or “emulative action reconstruction”: Csibra [2007]) that I wish to question for monkeys’ and apes’ mirror neurons. My effort to differentiate simulation from a posteriori postural expectations, my obsession with that difference, may seem to be splitting hairs.
If the reader has had this impression, I beg for his patience until we use that difference, or more specifically its derivation (that is, that chimpanzees would have the ability to understand the correspondence between their own body and that of a conspecific, but they would lack simulation) to build a more general proposal.13 As I have suggested, that ruling provided at the outset by the criteria of parsimony or simplicity (a ruling, it should be remembered, which originally seemed contrary to my hypothesis) may now change. Postulating a simulation (i.e., a latent imitation or an emulative action reconstruction) in a macaque’s brain may perhaps be much less parsimonious than postulating an evolutionarily new type of expectation for self-perceptible movements. In addition, the speculation I shall now put forward – if in the future it were to receive some confirmation – might support the idea of a posteriori expectations.

12. See again the paragraph (in 1.4.1) about forward-models. With regard to the two alternatives (mock input versus a simpler and more peripheral mechanism), it is the second one that nowadays is most likely true for biological systems.

13. In fact, in the elaboration of my proposal the issue of mirror-neurons was included very late. And my curiosity regarding them was whether they entailed simulation or not.




1.5

An adaptive but ‘non-social’ role? A speculation which would act as an argument in favour, were it to enjoy slightly more support

An arboreal primate moving from tree to tree finds it very useful to see which branch its own hand is grasping. Only with this type of visual perception would it be possible to prove the solidity of the branch, as well as to choose which nearby branch to move to next. However, although visual perception of one’s own hand is extremely advantageous, it could also entail a grave danger. If, erroneously, the visual perception of the hand of somebody else is taken as the visual perception of one’s own hand, making use of this information may be catastrophic. If the branch grasped is not sufficiently solid, yet the animal trusts that grip because it trusts a visual perception (or equally, if, trusting the visual perception, it decides to move towards a place where there is no branch), the likely result will be that the animal will fall from the branches. Consequently, there would be strong selective pressure at some very early point in the evolution of primates in favour of a mechanism which would allow them to distinguish between their own hand and the hand of somebody else. This would be, I suggest, the role originally played by visual mirror neurons – or, more precisely, by these neurons in conjunction with the somatosensory postural area. The animal has to compare the postural expectation that it detects in the seen hand with its own somatosensory information.14 Only if the two postures coincide will the subject be dealing with its own hand. However, is it really necessary to postulate an a posteriori postural expectation in order for the monkey to be able to avoid these dangerous confusions? I would respond as follows. Certainly, this confusion would be much more easily overcome for a subject that can freely move its hands. As soon as the subject makes the slightest movement with its hand, it will succeed in clarifying whether or not the seen hand is its own hand.15 This ‘move-and-see’ strategy deserves to be called ‘the easy route’ towards that

14. Postural expectation and somatosensory information. Cf. Heilman et al. (1998), who have added an interesting aspect to the traditional explanation of anosognosia. According to the traditional explanation, anosognosia results from an inability to represent current body states automatically and through the appropriate signaling channels, i.e., the somatosensory systems. Heilman et al. suggest that the patients also lack an intention to move (or, as I would say, lack postural expectations) and are thus robbed of a means to check their defect easily.

15. Let us examine some data regarding macaques’ brains. These data, it should be noted, refer to hand movements that are different to the movements involved in grasping an object, and likewise refer to an area of the brain very different to that of mirror neurons. “We have measured responses of visual movement sensitive neurons in the anterior part of the dorsal Superior Temporal Sulcus of monkeys to stimulation caused by the animal’s own active movements. These cells responded to any stimuli moved by the experimenter, but gave no response to the sight of animal’s own limb movements” (Hietanen & Perrett [1993, p. 117]). Of course, it is not a case of the movement of one’s own hand being invisible; clearly, the animal will see its own hand move. However, since this has been self-generated, the movement appearing in the retinal image will


clarification. (Cf. Hogendoorn et al. [2009].) However, it should be remembered that a hand holding an object is less able to move. We should comment in a more general way on the speculation that this subsection has suggested. Our initial point was that discovering whether a seen hand is another’s or one’s own hand is not an adaptively useful task except when this hand is grasping a branch. We can now add that the sophisticated route provided (under the present speculation) by mirror neurons is unnecessary except when the hand is already still and prevented from making any new movement. In all other situations, it is enough to make a movement with one’s own hand to determine if the seen hand is another’s or one’s own hand (“Agency structures body-ownership”: Tsakiris et al. [2006, p. 423]). The two exceptions mentioned coincide with one another – that is, the only type of situation where that utility occurs and the only type of situation where that need occurs. Let us see this in a more detailed way. When does the compulsory stillness occur which prevents ‘the easy route’ from functioning and which, thus, makes the sophisticated route (i.e., the intervention of mirror neurons) necessary? This occurs only when the hand is already grasping an object, which is an indispensable requirement for mirror neurons associated with the hand to be activated. We have, thus, a triple coincidence: the situation responsible for the utility, the situation responsible for the need, and the situation where the mirror neurons are activated. Thus, according to my suggestion, the absence of manual movements would be involved in mirroring in two different ways. Firstly, this absence is the key to the type of situation where the adaptive advantage of mirroring originally arose. Secondly, it is a necessary element in the mechanism of mirroring (and consequently it must be present in any activation of mirror neurons in their mirroring role).
The animal has to compare the postural expectation that it detects in the seen hand with its own current somatosensory information. Note that, should there be any manual movement of the animal, this somatosensory information would then change, and consequently ‘the sophisticated route’ towards the useful comparison would become an impossible route. (Please note that it will become impossible even in those cases in which it would have still been useful – i.e., in cases in which the hand previously perceived is hidden behind an object or when, in a flight or fight situation, it is not advisable to waste time by looking again at the hand perceived.) Kraskov et al. (2009, p. 922) have discovered that “many pyramidal tract neurons in area F5 showed complete suppression of discharge during action observation, while firing actively when the monkey grasped food rewards”. This finding can be interpreted in two different ways. According to one interpretation, this “inhibition” of self-movement during action observation could answer the famous simulationist question (why do animals with mirror-neurons not imitate all the time?). By contrast, in my

be subject to attenuation, that is, it will be excluded from the set of information about the world (remember 1.4.1, about attenuation).




view, suppression of manual movements would be a necessary requirement for the useful comparison. An objection to this ‘adaptive but non-social role of mirroring’ might be to reject the very possibility of confusion. The hand of the conspecific would be in an incongruent location, perhaps too far away, and in addition, the sight offered to our own eyes by our own hand (i.e. the egocentric perspective of the hand) differs from the non-egocentric perspective corresponding to the hand of the conspecific. I would respond that these ways of preventing confusion would perhaps not be totally effective. Think of the so-called rubber hand experiments* that have been used with human beings for years. “The Rubber Hand Illusion remained as long as stimulation of the two hands (rubber hand and the subject’s hand) was congruent in a hand-centred spatial reference frame, even though it was incongruent in external space” (Costantini & Haggard [2007, p. 229]). That is, even in human beings, location in external space does not provide sufficient orientation to avoid confusion. In contrast, in accordance with my speculated adaptive advantage, the postural incompatibility between one’s own hand and the hand of the conspecific completely avoids the danger of confusion: “Postural compatibility of the visible rubber hand is a necessary component for the illusion of ownership in the Rubber-Hand Illusion” (Holmes & Spence [2007, p. 212]).16 As regards the opposition between egocentric and non-egocentric perspectives, we must begin by accepting that this opposition is categorical in the case of two subjects sitting at opposite ends of a table who extend their hands forward. However, if we focus on the example of the hand grasping the branch of a tree, the emphatic opposition begins to blur. But – it can be argued – why does mirroring also appear in cases in which the perceptive opposition between one’s own hand and a foreign hand is clear? I would reply by invoking a general principle.
Once a resource has been created, it is possible for this resource to be employed in some functions which, independently, would absolutely not have had sufficient strength to create it. Once the cannon has been invented, it can be employed, not only to knock down walls, but also to kill a mosquito. What of the question of distance? It has been discovered that mirror neurons differentially encode the peripersonal and extrapersonal space of monkeys (Caggiano et al. [2009]). In addition, when a panel prevented the monkey from reaching for objects close to his body, these authors observed that “neurons selective for the extrapersonal space started to respond also in the peripersonal space, while neurons selective for the peripersonal space ceased to respond” (p. 403). These authors suggest the following interpretation: “The distance between observer and actor is a feature that plays virtually no role in understanding the meaning of an observed motor act. Nonetheless

16. We can report other experiments with humans. Subjects saw either their own hand or the experimenter’s perform an action and were asked to say whose hand they thought they saw moving. “Recognition errors occurred largely when the subject saw an experimenter perform the same movement they had made” – Daprati et al. (1997). In other words, “if the movement looks the same as it feels then it is one’s own. If not then it is someone else’s” – Jeannerod (2006).


it is important for evaluating adequate subsequent interacting behaviors, since interactions in the observer’s extrapersonal space are possible only through intermediate steps (e.g., approaching the observed agent or removing an obstacle)” (p. 406). That could indeed be the correct interpretation. (However, why does the information regarding distance have to be incorporated into mirror neurons? The grasping of distance is already involved in any visual perception.) But I would like to suggest another possibility. This differentiation between proximal and distal movements might perhaps be used by mirroring in order to serve the useful function that I have suggested. The grasping which is seen from too great a distance or, more precisely, in operationally extrapersonal space, can never be confused with one’s own grasping. In these cases, mirroring would satisfy its function in the fastest way. Thus, the comparison with somatosensory postural information would only be necessary for movements seen in peripersonal space. More generally, in order to finish setting out my speculation I must address auditory mirror neurons (Kohler et al. [2002]). These neurons are activated only by the sounds which are produced when an object is grasped and handled, and never, by contrast, by shouts (at least, up to now, no neurons have been found that fire both when the monkey himself shouts and when the animal hears the shouts of others). What occurs with what we might call the ‘heard hand’? A hand that has handled, and goes on grasping, an object is often prevented from moving to try to produce more self-generated sounds that would confirm that the previously heard hand is the subject’s own. The hand-mirroring (which, in my opinion, would originally have been visual, given that manual grasping is audible in only a minority of cases) probably extended very quickly to the ‘heard grasp’.
On one hand, knowing if the hand which has grasped something in the vegetation and has made recognisable noises is one’s own hand or not is clearly a very useful piece of knowledge.17 On the other hand, it is knowledge that can be derived almost immediately from the mechanism of visual mirror neurons. Which hand is causing these noises? The animal would calculate the postural expectations corresponding to this ‘heard grasp’, and its own postural sensations would immediately enable it to establish whether or not these expectations are satisfied in itself. In short, the original usefulness that our speculation has attributed to classic mirror neurons can be equally well attributed to these auditory mirror neurons. We must pause at this point because of a fact that is not immediately accommodated within the previous speculation. Let us first summarise this speculation once more. Let us remember that in mirror neurons in their so-called ‘motor’ role, the

17. Of course, this knowledge will require a previous perceptive skill, namely having learned that when different objects are broken, opened or scratched they will produce recognisably different sounds. But fitting sounds to objects is not the task of auditory mirror neurons. These neurons would only have the role of establishing if the grasping heard was one’s own or not-own. (Remember 1.2, about mirroring involving paper or tool.)






expectation of the results of a movement is previous to the movement. These previous (that is, classic) expectations are of no usefulness in differentiating whether the seen hand holding an object is another’s or one’s own hand. In this situation, the hand has already finished its movement and has already satisfied previous expectations. It is therefore in this situation where the task of discrimination between one’s own seen hand and the seen hand of the conspecific must turn to the sophisticated method, i.e. to mirror neurons in their mirroring role. Now for the problematic fact: very shortly after mirror neurons were discovered, it was noticed that they were also activated in the case of a hand which was hidden immediately before it reached the graspable object toward which it was moving (Umiltà et al. [2001]). Here, however, I have explained mirroring based solely on the already performed grasping function. How can we face the challenge this datum poses to our suggestion? Clearly, our suggestion would be unsupportable if this were the only situation where mirror neurons are activated. In this situation, the sophisticated mirror neuron mechanism is entirely unnecessary in order to detect whether the hand is another’s or one’s own. The immediate movement or absence of movement in the subject would be sufficient. However, since this is by no means the only situation, the datum is explainable within the lines of our suggestion. Here, the prediction of grasping would have been produced by solely visual means, i.e. without any motor simulation. More concretely, this prediction would be supported by dedicated neural substrates within the visual system, most noticeably the superior temporal sulcus (STS). Let us pay attention to the discovery of STS cell populations coding for actions in relation to contextual cues rather than for actions per se (Jellema & Perrett [2005]). 
This prediction, which at first would have been derived, I repeat, from mere visual perception of trajectory, could then unleash an identical reaction to the reaction provoked by the visual perception itself. We already know what such a reaction to visually perceived manual grasping consists of: according to my hypothesis, there will be an a posteriori postural expectation reaction. As is evident, we have had to accept an assumption in making this explanation: the prediction of the grasping would first have emerged as a merely visual prediction. (Supra, in 1.2, I have focused on this type of prediction.) Clearly, this is no more than an assumption. However, it is one that, it turns out, is quite probable. We know that mirror neurons will not be activated in this situation if the trajectory of the hand is not adjusted to the necessary requirements for ‘good visual coherence’. A different attack on the usefulness we have speculated may come from the discovery that mirror neurons also respond to grasping made with instruments: Ferrari et al. (2005). Here, as the reader may already anticipate, I will insist on the datum that this habituation needed many attempts and lasted for months. Likewise, I will reiterate that the originary utility of a resource does not have to appear in all its uses. But we must not stick to just this defensive comment. Such comment is certainly appropriate in the context of arguing in favour of ‘adaptive but non-social role’ of original mirroring. However, Ferrari et al. (2005) and, even more, Umiltà et al. (2008)


may be interpreted as highly compatible with my more nuclear proposal, i.e. with a posteriori expectations. In this latter study, monkeys were trained to grasp objects using two types of pliers: normal pliers, which require typical grasping movements of the hand, and ‘reverse’ pliers, which require hand movements executed in the reverse order (that is, first closing and then opening the fingers). The results showed that mirror neurons discharged during the same phase of grasping in both conditions, regardless of whether this involved opening or closing of the hand. So, there would not be a faithful simulation of movements. Data of this type clearly favour the proposal of Csibra (2007, p. 436) (“action mirroring in the observer is achieved not by direct matching but by emulative action reconstruction”), and, likewise, would have led Rizzolatti & Sinigaglia (2010, p. 268) to emphasise that “parieto-frontal mirror neurons – owing to their motor nature and the fact that they encode the goal of motor acts – can be triggered by different visual stimuli that have a common goal (for example, grasping)”. But with equal, or even greater, clarity they favour my proposal that in mirroring there would not be latent motor acts, but only a posteriori expectations. Let us return to the adaptive but non-social role. Hickok (2009) concludes his critique of the dominant theory about mirroring with the following words: “the action understanding theory (...) has distracted the field away from investigating other possible (and potentially equally important) functions”. I do not think that he is referring to the above speculation. As far as I know, nobody has yet taken that possibility into account. However, this, namely, “another possible (and equally important) function”, is exactly what this speculation of mine has suggested for mirror neurons.
As I have already said, if this speculation were to be confirmed, it would be a support for the hypothesis we are really interested in making. Note that it is only through a posteriori postural expectations (in conjunction with the absent or present somatosensory information, of course) that another’s and one’s own grasping can be differentiated. In contrast, motor simulation, far from being the cause, would in fact depend on this differentiation. (Let us remember Brass & Heyes [2005, p. 493]: “why do we not imitate all the time?”) A further question remains unanswered after all this: Why would mirror neurons associated with the hand have led to the emergence of mirroring (true mirroring: supra, 1.3.2) associated with the mouth? Why would this extension have arisen? A first possibility is to see this as merely a consequence of the association between one’s own hand and somebody else’s hand. (Keep in mind the two Piagetian paths signaled above, in 1.3.2. Both paths are associated with self-perceptible movements.) The extension would thus originally have taken place outside any useful function. A second answer would be that the expansion towards the mouth would have had a useful function which was not shared by mirror neurons associated with the hand. But the latter probably has to be discarded. Certainly, in chimpanzees, we will propose (in the next chapter) that the understanding of the matching of mouths would have a useful




consequence, that is, the understanding of matching between own body and foreign body. However, in macaques, it is much more doubtful that such matching between whole bodies takes place. Therefore, it seems that we have to dissociate from any kind of utility the expansion towards the mouth.18

1.6

The relationship between the central hypothesis of this chapter and the above speculation

But we must concentrate on the key issue. The central hypothesis of this chapter is independent from the above speculation. More precisely, the central hypothesis can admit other possibilities without any problem. One of these possibilities is that the novel type of expectation, i.e., the a posteriori expectation would have originally emerged without any adaptive utility (i.e. as “a mere Hebbian learning”) and only afterwards did exaptation take place. Another possibility is that mirroring – that is, a posteriori expectations, according to my hypothesis – originally emerged in self-audible movements (birds’ learned songs) and not in self-visible movements and, consequently, was originally unrelated to the hand (something which would not prevent the hand from being a crucial element in the origin of visual mirroring). Against this background, self-perceptibility would be a block shared by different situations. The adaptive advantages would be different for each situation. Birds have an interest in comparing the singing they hear in their environment and the singing they produce (this issue will be seen infra, in 6.2 and 6.3.2). For monkeys, by contrast, the advantage would consist in a distinction between one’s own hand grasping and that performed by other individuals. However, despite this variable functionality and despite the long evolutionary distance between birds and primates, the basic building block could be homologous*. See Fitch et al. (2010, p. 796): “Capabilities that are convergent at one level (e.g., behavioral) may employ mechanisms that are homologous at another level (e.g., genetic)”. See also de Waal & Ferrari (2010, p. 201): “There is increased appreciation that the basic building blocks of cognition might be shared across a wide range of species”. 
(Nevertheless I will continue focusing on primates; as I said above, the very nature of human cognitive novelties would have been radically different if the starting point had not been the last common ancestor of chimpanzees and humans.) All this, I insist, is perfectly compatible with my central hypothesis. If we reject the previous speculation about the original adaptive advantage, the hypothesis only loses

18. But there might be another utility: For example, observing adults’ feeding habits might be useful to macaques’ young. Certainly, what, in this field of eating habits, has come to be known as ‘program-level imitation’, is limited to apes: see Byrne & Russon (1998) and Whiten et al. (1996). However, it is likely that, by observing their parents, the young of monkeys learn, at least, which part of fruit is edible.


one of its supports and not exactly the most reliable one. Thus, we should move on finally from the question of the original utility of mirror neurons.

1.7

Summarizing the hypothesis defended in this chapter

Let us return, therefore, to the central hypothesis of this chapter and dwell on its central question. The firing of mirror neurons in their two roles would correspond not to motor commands but to expectations of postural results. There are no postural expectations prior to the observed movement of conspecifics, as these would be impossible; there can only be a posteriori expectations, that is, postural expectations which follow the observed movement of conspecifics. Here, in this temporal inversion, lies the great novelty of mirror neurons in their mirroring role. The ancient type of expectation can only point to a (possible) future moment, that is, to the moment when the expectation will finally be satisfied. With mirroring, on the other hand, there are expectations that do not progress towards the future, but are derived from their visually observed satisfaction. For this reason, mirror neurons constitute a perception of internal states of others (a perception of others “from the inside”, as Rizzolatti & Sinigaglia [2010, p. 264] say). However much I have criticised the interpretations that have been made of mirror neurons, I fully share the idea that mirroring is a very important landmark in evolution. The expectation of internal postural states that is activated in the macaque corresponds to an action of the conspecific. In some sense, we would already have come across, in the observer macaque, the beginnings of the ability that is absolutely central for human beings, namely, the ability to perceive internal states of others. However, even if we accept all this, we must heed the other side of the coin. The perception performed by monkeys’ mirror neurons is very primitive. I would enumerate three characteristics that attest to this primitiveness. Firstly, macaques’ mirror neurons relate only to a very limited area of their bodies (only hand and mouth). Secondly, the internal state of the conspecific is originally not yet of interest in itself. (Originally, the perception of this internal state would be a merely secondary effect, that is, Keysers & Perrett’s ‘Hebbian learning’, or, as in my risky speculation, it would have an adaptive function, but one limited to confirming – or, as the case may be, rejecting – that the seen hand is the animal’s own.) Thirdly, and this is the result that really interests me, there would be no simulation or latent imitation involved at all, only expectation (albeit a posteriori) of internal states. The role played by this chapter in the book as a whole derives, I insist, exclusively from this third characteristic. The difference between simulation and expectation, or more precisely, the limitation of any expectation, is that expectations are always possibilities of the expecting subject. Consequently, only simulation can implant in a subject contents radically different from his own contents, that is, contents that belong to a self that is currently communicating and interacting with the subject. (Below, in Section Two, this difference will be the core issue.) The last of these three limitations or absences would only disappear, on my understanding, with human beings. I conceive the step from confinement within the resource of expectation to the possibility of true latent imitations or simulations to be extremely demanding: we would have to consider simulation as a second line of awareness that would be added to the first line (that is, to the awareness of one’s own real, current states) only in humans. Would there not then be simulation in chimpanzees? It is precisely this question that the next chapter will address.

chapter 2

Chimpanzees and the visual field of the conspecific

2.1

From mirror neurons to the ability to reckon the visual perceptions of the conspecific

It is well known that when a chimpanzee sees a conspecific or a human being looking toward a particular place, the animal will normally make the appropriate movements to be able to look toward that place. Two explanations of this behaviour are currently in fierce dispute. According to the first, chimpanzees can ascribe visual perceptions to conspecifics. The alternative explanation is less demanding: Why not more directly link observed behaviors (reaching, locomoting) to the object achieved, and learn correlations with torso direction? I will deal with this debate in 2.2. But first I return to the issue of the preceding chapter in order to highlight how closely related my proposal there and the first explanation above are.

2.1.1

From the perception of the matching between another’s and one’s own body to the ascription of visual perceptions to conspecifics

Thus, in this subsection I will assume that chimpanzees can ascribe visual perceptions to conspecifics. In order to detect what a conspecific is viewing from its location, posture and orientation, they detect what they themselves would be seeing if they were in the same circumstances as the conspecific that they are observing. In my view, chimpanzees perceive the matching between their own body and the body of the conspecific. We should highlight just what an enormous achievement this matching probably is. A chimpanzee cannot see its head or its whole body; it only senses internally the movements or postures of its head or entire body. However, it does possess a body schema common to the body of the conspecific, which is only seen, and its own body, which is only sensed. What relationship might this have with mirror neurons? I suggest that the move from the matching of hands to the matching of whole bodies has not been the only change. The other change – in other words, the absence of another of the primitive features of macaques’ mirror neurons – is perhaps more important. In the evolutionarily original mirroring (i.e., in my view, in mirror neurons associated with the hand), the internal state of others was not interesting in and of itself. This is the key point both for exaptation hypotheses and for speculation about the non-social adaptive function. In this speculation (above, 1.5), the attention paid to the internal state of others would originally seek only to dismiss the visual perception of the hands of others. The hand of the conspecific was only of interest, I repeat, in order to concentrate successfully on the perception of one’s own hand. That function was useful in relation to the hand because one’s own hand and the hand of others could be mistaken for each other. By contrast, such a function would be pointless now that whole bodies are what is matched. One’s own body, sensed but not seen, and the body of the conspecific, seen but not sensed, can never be mistaken for each other. Thus, the role of the chimpanzees’ ability might be to take an interest in the states of the conspecific in themselves, and not only to dismiss them, as was the case with macaques’ mirror neurons.

2.1.2

The side-effect of the perception of the matching between one’s own body and another’s body

The adaptive usefulness of detecting the visual field that is being seen by a conspecific would, therefore, be what propelled the evolution from self-perceptible movements and mirror neurons to the matching, achieved by chimpanzees, between one’s own body and another’s body. This suggestion explains a curious, or more correctly, striking ability of chimpanzees, namely, the ability to imitate simple movements, including non-self-perceptible ones. Why should this ability be striking and require explanation? Because it is unlikely that it can provide any adaptive advantage. Chimpanzees can imitate simple movements (even movements that are not visible in their own bodies), as Custance, Whiten & Bard (1995) demonstrated by experiment (see also Whiten et al. [2004]). But chimpanzees do not use this ability for any task. This is not at all surprising. The ability to imitate simple movements is of no use whatsoever, since the subject already possesses each of these movements beforehand and knows perfectly well when to use them. Obviously, things are completely different with regard to the imitation of complex movements that are new to the subject, or, in other words, motor learning. Here, the potential for usefulness is enormous and never-ending: suffice it to say that all words are complex motor patterns. But this kind of imitation (the imitation of complex movements, that is) has never been observed among chimpanzees. Consequently, the lack of use of their real but very limited motor imitation abilities is logical and quite justified. What about the ‘program-level imitation’ of Byrne & Russon (1998)? These authors maintain that this complex kind of imitation is observable among apes and that it can also be useful for them. I have no objection to accepting this idea. But program-level imitation is defined precisely in contrast to motor imitation, which is, we must remember, the subject of interest to us here. In fact, Byrne & Russon’s proposal is only one example. That goals (physical, end-state goals or mentalistic goals) mediate complex imitative behaviour is a well-known suggestion. Certainly, if the motor pattern is broken down into a long series of subgoals, then ‘goal emulation’ and genuine imitation of new and complex motor patterns will become indistinguishable. However, the results of these two motor learning routes will not be really similar if the environment does not provide sufficient relevant points to be chosen as subgoals. (Note that circus animal trainers typically resort to building an environment with numerous and well-distributed relevant points.) In short, results will become really similar only on particular occasions.¹ In addition, beyond the difference in results, there is a more crucial issue that differentiates program-level imitation from genuine imitation of new and complex motor patterns. Genuine imitation of new and complex motor patterns is much more demanding and powerful than any ‘program-level imitation’ (I will discuss this in Chapter 8). Returning to our argument, we can say that chimpanzees’ ability to imitate simple movements has nothing to do with any ‘program-level imitation’. As soon as we accept the uselessness of the imitation of simple movements, we are forced to think of a new suggestion regarding such imitative abilities of chimpanzees. There we would find a side effect of another ability, an ability which in this case truly does have adaptive advantages. That adaptively advantageous ability would be, as I have said earlier, the ability to match one’s own body and another’s body and thus to reckon the visual fields of the conspecific.

2.2

Are chimpanzees merely exploiting visual findings of the conspecific? Introducing the current debate

But it is now time to ask ourselves the question which is the object of a debate which has reached boiling point in the last five or six years. In what sense is the ability to detect visual fields of conspecifics useful for chimpanzees? Is its only benefit to enable them to use others’ visual findings (as Povinelli and his associates claim)? Or, on the contrary, is it also of benefit so they can ascribe (i.e., attribute) visual perceptions to conspecifics and thus predict their behaviour (as Tomasello, Call and Hare claim)?

2.2.1

Ascribing visual perceptions: The experiments carried out by Hare, Call & Tomasello

As a previous paragraph announced, I support the most generous position as regards chimpanzees’ abilities. I think that these animals would be able to perform that ‘ascription and prediction’. When I read Tomasello, Call & Hare (2003), I find their arguments persuasive. Given the intense social life of chimpanzee clans, we must think that the detection of the visual fields of conspecifics would have immediately brought about the more complex ability, provided the brain requirements involved in ‘ascription and prediction’ were not too much for apes. What is more, if the exploitation of others’ visual findings were not associated with the ascription of that visual perception to a conspecific, this exploitation could often give rise to disadvantages. Suppose a chimpanzee exploits the visual finding of a dominant conspecific, and suppose that visual finding relates to food. If there had been no ascription of the visual perception to the dominant animal, our chimpanzee would immediately go after the food he had just discovered and then suffer the punishment inflicted on him by the dominant animal. Of course, this conclusion can be avoided if we add that the use of others’ visual findings is restricted to detecting dangerous situations. So far, however, there are no data to support such a restriction.

Let us now look at last at the experiments carried out by Hare, Call & Tomasello (2001) and Tomasello, Call & Hare (2003), which support the idea that chimpanzees ascribe visual perceptions to conspecifics. The first thing recorded by these authors is that when a human looked at an object that was placed behind a barrier, the chimpanzees moved accordingly to get a suitable visual angle on the human’s visual field. The chimpanzees even looked back at the researcher if they inspected the corresponding visual field and found nothing there. But none of these observations were relevant to the question of whether or not chimpanzees ascribe visual perceptions to other individuals.² Thus, Tomasello, Call & Hare designed a further, genuinely interesting, experiment that makes use of the rivalry between a dominant chimpanzee and a lower-rank individual. In half of the trials, the subordinate animal could see a piece of food that the dominant animal, because of a barrier, could not. There was no barrier in the other half of the trials. The (statistically strongly predominant) result was that the lower-rank individual took the pieces of food when the dominant animal could not see them, whereas it refrained from doing so when the food was visible to the dominant subject.

In a new set of experiments, an even more sophisticated ability was tested. The lower-rank individual was held back at a distance from which it witnessed the higher-rank animal either look or not look at a piece of food. When the lower-rank animal was released a moment later, it found that the piece of food had already been hidden from both chimpanzees by a barrier. Nevertheless, the behaviour of the lower-rank chimpanzee still continued to adapt flexibly to the circumstances. The (statistically strongly predominant) result was that the subordinate animal took the food when the dominant one had not seen it, and refrained from doing so when it had. In my view, in this new set of experiments chimpanzees ascribed to conspecifics, at the moment m, the goal that the perception of a visual field would have provided in the moment immediately preceding m.

The question here is whether or not that sensible behaviour on the part of a lower-rank animal can be adequately explained as a result of mere conditioning between other individuals’ behaviour and their torso direction. In other words: is it correct to suggest that chimpanzees sensibly guide their behaviour after having first noticed the matching between their own body and that of other individuals and accordingly ascribed visual perceptions to them?

1. Leighton, Bird & Heyes (2010) express their opposition to the general idea that goals mediate true imitative behaviour.

2. Gaze-following is common among all primate species and possibly shared by a wide range of animals: See Teufel et al. (2010) about monkeys, and also Kaminski, Riedel, Call & Tomasello (2005) about domestic goats, Miklósi, Polgárdi, Topál & Csányi (1998) about dogs, and Schloegl, Kotrschal & Bugnyar (2007) about common ravens.

2.2.2

Povinelli’s argument: The conspecific’s blindfolded eyes

The author who most vigorously rejects those results, or, in other words, who maintains what (above in 2.1) I have called the second option, is Povinelli (2004) (see also Povinelli & Vonk [2004]). For years this author has been opposing what he calls the ‘argument by analogy’. Although two behaviours, one of them human and the other animal, may be analogous on the surface, this does not allow us – Povinelli claims – to interpret them as the expression of the same process. Of course, such a claim is, in principle, not disputed. The question is where we should reject the analogy of the underlying processes, and, in particular, whether the detection of conspecifics’ visual fields by chimpanzees must be interpreted ‘behaviouristically’ or, on the contrary, ‘mentalistically’. In general, it might be said that the last few years (say, from the mid 90s onwards) have seen something of a revival of anti-mentalism. This is probably a pendulum reaction. Mentalist abuses in fields such as cognitive ethology had become notorious.³ It is in this context, it goes without saying, that we need to place Povinelli. According to Povinelli, several failures by chimpanzees indicate that the rule followed by these animals is a behaviourist one. The rule would be something like the following: ‘The direction of the torso of a conspecific may reveal where there are interesting objects’.⁴ According to this author, no ascription of visual perceptions would, therefore, occur in chimpanzees. What are the failures that Povinelli has detected? When it comes to the supposed ascribing of visual perceptions to a conspecific, chimpanzees make no distinction as to whether the subject has his eyes blindfolded or not. Since the animals had previously experienced the effect of an opaque cloth on their eyes or a basket placed on their head, Povinelli concludes that this lack of sensitivity favours a non-mentalist interpretation of chimpanzees’ behaviour when they look towards where a conspecific is looking. Is this conclusion correct? Is this argument forceful enough to object to the results of Tomasello, Call & Hare’s experiments?

3. A sarcastic criticism of those excesses can be seen in van Rooijen (2010).

4. Let us examine some data regarding macaques’ brains, more concretely, their superior temporal sulcus (STSa). “Two-third of STSa cells selective for whole body motion prefer a combination of body view and direction where the body moves in the direction that the head is pointing (walking forward following the nose)” Jellema & Perrett (2003, p. 1735). In other words, these cells pay attention to the direction of the torso. However, it is clear that this datum does not answer our question of whether or not chimpanzees ascribe visual perceptions to other individuals.

Certainly this question is what interests us, but let me insert a digression. Penn, Holyoak and Povinelli (2008) emphasize their explicit purpose of stressing the human peculiarity. In this respect, their approach is certainly very close to my own. However I think that Povinelli’s view – which rejects the idea that chimpanzees’ abilities represent a preliminary or primitive stage, however remote, of the human skills concerning a theory or perception of the minds of others – is less appropriate for clarifying this peculiarity.

2.2.3

Blindfolded eyes and adaptive usefulness

The question is why the specific detail of the blindfold or basket is not included as part of the posture shown by a conspecific. Is this non-inclusion proof that chimpanzees do not ascribe visual perceptions to conspecifics? For an observing chimpanzee, blindfolded eyes would no doubt be absolutely decisive if the objective were to detect the eyes of a conspecific fixed on him. This would be the case even for non-primates. However, as far as the ability to ascribe visual perceptions is concerned, things may be different. From the perspective of adaptive advantage, including the detail of a blindfold over the eyes could perhaps be an unnecessary complication. Cases where an animal in the wild was unable to see would have been too rare to have influenced the evolutionary configuration of the ability to ascribe visual perceptions. Let us remember that, according to previous suggestions (in 2.1.1), this ability would be an extension (and also a change in function) of monkeys’ ability to detect the a posteriori postural expectations corresponding to an observed hand or mouth. That extension would have given rise to the matching of bodies because this was a requisite for the fulfilment of the new function. Under normal conditions this was not only necessary but – and this is the key point – also sufficient. Why should that extension become more refined? Why should it pay attention to such strange details as blindfolded eyes or closed eyes in a perfectly upright body? Thus, in my opinion, Povinelli’s results do not refute a ‘mentalist’ interpretation. The inability to discern the detail of a blindfold would be perfectly compatible with an ascription of visual perceptions to a conspecific. While here I am accepting Povinelli’s point about the chimpanzees’ inability to discern the relevance of a blindfold on someone else’s eyes, Premack & Premack (2003, p. 141) do not. They argue that at least one of the chimpanzees with which they worked for years would remove the blindfold from the researcher’s eyes, but not from the nose or the top of the head, so that the researcher could go to the box containing the reward and unlock it. The Premacks interpret Povinelli’s results as the consequence of previous training of the chimpanzees by Povinelli, training in which, according to the Premacks, the animals had become used to obtaining food without concerning themselves with where the donor was looking. But let us return to what interests us. An observing chimpanzee would actually ascribe the perception of a visual field to his conspecific, and on this assumption, as the experiments by Tomasello, Call & Hare have suggested, he would successfully anticipate the reactions of that conspecific.

2.2.4

Ravens and chimpanzees

Nowadays many comments made on the results of Tomasello, Call & Hare are very different from those made by Povinelli. Thus, instead of questioning whether chimpanzees are able to ascribe visual perceptions to conspecifics, what seems increasingly doubtful is that this ability is restricted to chimpanzees: See Fitch et al. (2010). Thus, macaques have been shown to discriminate between human experimenters who can and cannot see food (Flombaum and Santos [2005]), as well as individuals who can and cannot hear the removal of food (Santos et al. [2006]), indicating multimodal sensitivity to others’ perception. (This latter observation seems contrary to Bräuer, Call, Tomasello [2008], who claim that chimpanzees do not take into account what others can hear in a competitive situation.) In addition, there is evidence of similar mechanisms in both scrub jays and ravens: See Emery and Clayton (2001) or, even more crucially, Bugnyar and Heinrich (2005).

At this point, I would like to be more specific about my suggestion. Bugnyar & Heinrich (2005) and Tomasello et al. (2003) have shown that ravens and chimpanzees modify their tactics in relation to whether or not competitors previously had the opportunity of observing food. In addition, these authors observed that animals know that obstacles can obstruct the view of others. But I would like research to be carried out on a slightly different ability. An animal knows where an object is thanks to the fact that it has seen it recently. But at the moment that interests us, this animal no longer sees that object. What it sees at that moment is a conspecific placed in a location, orientation and posture which are adequate for the perception of the object in question. Will the animal be able to predict the subsequent behaviour of its conspecific? As I say, this experiment should be carried out with the two species, ravens and chimpanzees. Will the two species be able? Will both be unable? Will only the chimpanzee be able?

I think that the ability that we have just described is much more demanding than the one studied in the experiments by Tomasello et al. (2004) or by Bugnyar & Heinrich (2005). The one that all these experiments study is ‘logically compatible’ with the interpretation proposed by Heyes (1998). This is admitted by Bugnyar & Heinrich (2005). Gaze-following, detection of visual barriers that are placed between the conspecific and the object, grasping of the affordances of the object: The total of these resources and learnings could, perhaps, be enough to predict the conspecific’s behaviour. Likewise, that total of resources could be enough to explain why, when a human looks at an object that is placed behind a barrier, the chimpanzees, and also the ravens (Bugnyar et al. [2004]), move accordingly to get a suitable visual angle on the human’s visual field. However, if animals succeeded in the task that I have just described, that result could not, in my opinion, be interpreted in the deflationary way proposed by Heyes. In addition, if an animal species is able to succeed in this more demanding task, then it will be credible that, in that species, the success in the less demanding task is also based on the high-level resource.

But, moving on to what really interests me, let’s start by supposing that a particular species would be able to carry out that difficult type of attribution and prediction. Even if that supposition turned out to be true, it would not force us to attribute simulation to that species. That difficult type of attribution would not require – this is my proposal – any simulation of the mental contents of the conspecific. In order to explain it, a posteriori expectations would be enough. Certainly, if the subject itself performed the correct movements to get to see the object again, then its expectation of seeing the object would have been prior to such movements; that is, that expectation would be a usual, a priori expectation. However, if the visual expectation is activated in the subject only after the conspecific has made some movements and has placed itself in the correct location, posture and orientation, that would be an a posteriori expectation. That particular type of expectation would be an intermediate resource between the very simple abilities that Heyes invokes and genuine simulation.

But we must proceed more slowly. Let’s go back to the results of Tomasello, Call & Hare (2003), and let’s interpret them in a non-deflationary way. As can be seen, once we accept these results and interpretations, the chimpanzee is very close to the human-type perception of another’s interiority. But, how close? This is also the question which Tomasello et al. (2006) try to answer. Will we adopt their conclusions here? I will leave this issue until later. For now, we need to focus on a more basic one.

2.3

What is involved in ascribing visual perceptions to conspecifics?

How should the ascription of visual perceptions to conspecifics be evaluated? Or, using Theory of Mind terminology, does that ascription equate to a perception of mental states of the conspecific? Human beings possess second-order mental states in which they can perceive beliefs or wishes that differ from their own currently active beliefs and wishes. In the last few years of this new discipline that deals with the ‘theory of mind’, these meta-mental states, or meta-representations, have been considered as exclusively human. We need now to ask ourselves about this. Can we still maintain that these states are exclusively human or, on the contrary, would the ability to ascribe visual fields to others hypothesised in chimpanzees refute this presumed human peculiarity? To begin with, we must realise that in this ascription of visual fields, we are not exactly dealing with a mental state of the conspecific. As Gómez (1998) has stressed, the content of the visual perceptions ascribed is an external, not mental, content. The visual field is there, outside, shared by one’s own eyes and the eyes of any other individual. Admittedly, someone might not have noticed it; in this sense, the visual field will not be shared by all individuals. However, once a visual field is ascribed to someone else’s eyes, this field, or rather (as in Flavell’s developmental Level 1⁵) this set of seen (and meaningful⁶) objects, will be considered as shared. Accordingly, the visual perception being ascribed would necessarily be shared by the attributor (or ascriptor) and the conspecific; that is, by no means would it be only another’s perception, but another’s and one’s own at the same time. By contrast, the concept of belief (that is, the perception of a belief as such) emerges at the very moment when that common vision of reality breaks down and a distinction must be made between reality, on the one hand, and, on the other, the incorrect or insufficient belief that someone has shown themselves to hold about that reality. This is why it is only with verbs of belief, and not verbs expressing perceptions, that a false complement can leave the truth of a main clause intact. Compare ‘In antiquity people believed that the earth was flat’, or ‘Some naturalists in antiquity believed in unicorns’ (both of which are true, despite the falsehood of the complement) with ‘My brother has seen a unicorn’ (false because of the falsehood of the complement). This well-known contrast of perception with belief has been used by some authors to argue in favour of the difference between ascribed visual perceptions and ascribed mental states. Certainly, I find no difficulty with it as an argument, and support it without hesitation. However, I think that the understanding of false beliefs of others, probably well out of reach of the abilities of chimpanzees, is not the ideal term of comparison. The comparison with something less sophisticated may turn out to be a more clarifying theoretical instrument. This is why I will suggest another reason against viewing as second-order mental states the visual perceptions that a chimpanzee is able to ascribe to conspecifics. This other reason has to do with the extent to which the visual perception being ascribed could be radically and intrinsically different from the attributor’s own perceptions. I shall make two suggestions. Firstly, a visual perception is conceived by a subject as radically different from his own perceptions only when that subject is able to ascribe it to eyes that are observing him (Bejarano [2003a]). Secondly, this extremely particular kind of visual ascription would be impossible for chimpanzees (or, in other words, could not be supported by mere a posteriori expectations). But this leads us on to what we shall discuss in the next chapter.

5. Flavell has provided supportive data for the two-stage development of perspective-taking skills. In the earlier developing Level 1, children can infer what a person does or does not see, and are also capable of saying what objects can be seen by them and not by another person. At the later developing Level 2, children are aware that an object gives rise to differing images, depending on the point from which it is viewed (Masangkay et al. [1974]; Flavell et al. [1981]).

6. “Perception involves proprioception and affective and motivational state”: Reddy (2008, p. 29), who also mentions the main milestones of the long history of that idea. Cf. also Fuster [2009, p. 2054]: “Perceiving is remembering as much as sensing”, or Dretske (1995) (meaningful perception versus sense perception).

section two

The basic human ability

chapter 3

The three modes of processing the eyes of others

3.1

The progressive convergence of this issue and the ‘theory of mind’

In ‘theory of mind’ research, it took some time for the question of the processing of the eyes of others to be addressed. As we know, almost all interest, at first, focused on the understanding of beliefs and desires. For a long time, cognitivism saw visual perceptions as being too far from the mind. This would only change when the pointing gesture, which Bruner (1977) (or Scaife & Bruner [1975]) and Bates et al. (1976) had already emphatically stressed, was introduced into the agenda of the ‘theory of mind’, together with Trevarthen’s concept of secondary intersubjectivity or triadic communication (also from the 70s: See a very late description in Trevarthen [1998]). Among the factors responsible for such a change in the agenda, we must also mention Butterworth’s work on the long and progressive development of pointing gestures in children (e.g., Butterworth [1991]) and, on the other hand, the more general issue of autism. Baron-Cohen was the first to systematise the processing of another’s eyes. This author’s starting point was an interest in autism, in which he saw a sort of ‘negative’ of the ‘theory of mind’. The symptoms that allow an early diagnosis of autism are not related to typical ‘second-order mental states’, that is, to the perception of beliefs of others, but to skills which appear much earlier in a child’s development, such as communication through gaze direction or pointing gestures. This led Baron-Cohen (1999, for example) to propose three modules whose staged maturation would lead to successive forms of processing the eyes of others: I(ntentionality) D(etector), E(ye) D(irection) D(etector) and S(hared) A(ttention) M(echanism). I will certainly take Baron-Cohen’s modules on board in this chapter, but I will also move some distance from them. My hypothesis focuses only on the so-called Direction Detector, and identifies three very different evolutionary levels of this detector. Kobayashi & Kohshima (2001) are notable along a different line. These two primatologists studied the presence and absence of the ‘white of the eye’ in different primate species. Certainly, their final statement – only humans possess a conspicuous and contrasting ‘white of the eye’ – only transcribes a datum that had always been in clear view for everyone. However, this statement was extremely important as it meant the incorporation of the white of the eye into what we might call the official list of exclusively human features.




3.2

What are the three different modes of processing the eyes of others to be proposed here?

3.2.1 The difference between the first mode and the second one

The first mode would have started early in evolution, long before primates appeared. An animal detects an eye staring at it, and responds by increasing its alertness. That is why some butterflies have eye-shaped markings on their wings. Even a mark only remotely resembling an eye scares away potential predators; for that reason, that resemblance was progressively selected. But the alert response is not the only one involved. An eye staring at oneself would always indicate the opportunity for some kind of interaction. We know that, right from birth, humans are fascinated by other people’s eyes. Already as neonates, we prefer to look at faces rather than non-face stimuli (Johnson et al. [1991]), and we favor faces with open rather than closed eyes (Batki et al. [2000]) as well as faces with mutual rather than averted gaze (Farroni et al. [2002]). Similar preferences in nonhuman primates suggest that the tuning of the neonatal visual system to face- and eye-like stimuli is a general primate heritage (Myowa-Yamakoshi et al. [2005]). In addition, Rosa-Salva et al. (2010, p. 565) have concluded that “Faces are special for newly hatched chicks (...), and the eye region of stimuli is crucial in determining the expression of spontaneous preferences for faces”. In my view, the first mode would typically involve no ascription of visual perceptions to other individuals. To be more precise, let me juxtapose two ideas. The first idea is that, as the current followers of Trevarthen’s approach (Reddy [2008]) suggest, two-month-old babies are capable of genuine emotional interactivity. The second idea is the assumption that two-month-old babies are not yet capable of ascribing visual perceptions to other individuals. What I want to make clear is that for me these ideas are not incompatible. 
That is, the first mode of processing the eyes of others could certainly imply interactivity, but, according to my definition of the phenomenon, it would also lack the power to ascribe visual perceptions to any seen eyes. Likewise, I think that the two following ideas are completely compatible. One, “nonhuman primates can and do make judgments about being watched by others” (see, for example, Tomonaga & Imura [2010]). Two, this sensitivity to being watched would involve no ascription of visual perceptions to other individuals. In this respect, we must also focus on Ferrari et al. (2009, p. 1768), who (besides recording frequent episodes of mutual gazing between days-old macaques and their mothers: see above, in 1.3.2) also make the following remark: “The mother held the infant and actively searched for the infant’s gaze, sometimes holding its head and gently pulling it toward her face”. In my opinion, there would be no ascription here of mental perceptions by the mother to her infant. If there were such an ascription, the scene, as we will soon see in this same chapter, would be a case of the ‘third mode’. However, there is no need, in my view, to resort to such an ascription to explain that behaviour. In short, following the appeal that Ferrari et al. (2009) themselves make to Trevarthen, I would rank that behaviour of macaques as belonging to the first mode.

Let us now deal with the second mode. This is actually a processing of others’ visual perceptions rather than a processing of others’ eyes. As we shall soon see, the difference is important. This mode coincides with the abilities that Tomasello, Call & Hare’s recent experiments have shown – or, at least, suggested – in chimpanzees. (See also supra 2.2.4: I would like research to be carried out with a more clearly demanding task.) Here we must cite Klein et al. (2009, p. R961): “Our sensitivity to others’ gaze is two-fold: more urgently, we sense when we are being watched; more subtly, we sense the referent of observed gaze within our shared environment, discriminating between distal regions which are, or are not, the focus of another’s attention. There is overwhelming evidence that the first manner of sensitivity to gaze direction – sensitivity to being watched – is both innate and shared by most vertebrates. The second manner of sensitivity to gaze direction, however – the use of gaze as a referential cue – remains somewhat mysterious. While gaze following appears fairly reflexive in humans and other primates, its sensitivity to social context suggests that the underlying mechanisms are not strictly modular but rather deeply enmeshed with other aspects of social information processing.” I agree that “the use of gaze as a referential cue remains somewhat mysterious”. Are the experiments carried out by Tomasello et al. (2003) and Bugnyar & Heinrich (2005) really enough to place that ability above mere associative learning? It may be convenient, I insist again, to try to find in some animal species a visual ascription in more difficult conditions than those used in these experiments. 
In the distinction I propose under the labels ‘first mode’ and ‘second mode’, the reader will have noticed a certain similarity with some previous paragraphs. When I dealt with imitation in newborn babies (Meltzoff and Moore [1983]) or newborn macaques (Ferrari et al. [2006], and Ferrari et al. [2009]), I considered that kind of imitation as a much more primitive mechanism than mirroring or genuine matching between one’s own mouth and someone else’s mouth. I likewise proposed a radical disconnection between ‘movement priming’ and mirror neurons. In the face of the debate between Povinelli and Tomasello, I interpreted chimpanzees’ abilities as different from associative learning. In all these cases I defended the hypothesis that two very different capabilities were involved. In distinguishing now a first mode and a second mode of processing others’ eyes I have proceeded in the same way. Let me make a more general comment on this distinction. Instead of rejecting the ‘primatocentrism approach’, as Pepperberg (2005) or Bickerton (2009), or, even more generally, Fitch et al. (2010) advocate, I still think that the focus on chimpanzees can be a very valuable methodological strategy. Certainly habitat changes must have demanded new lifestyles, and the demands must have induced the changes which eventually produced the peculiarities of Homo sapiens sapiens. However, the changes must have affected what was available then: the abilities of the last common ancestor of chimpanzees and humans. In other words, what I have called the ‘second mode’ would have been the indispensable springboard for the emergence of what I will call the ‘third (or exclusively human) mode’ of processing others’ eyes.

3.2.2 The third mode. The most basic and primaeval exclusively human capability

Lastly, there would be, I propose, a third mode, which would be exclusive to humans. But, in order to describe it, we should start by making clear that the order in which the modes appeared must not be interpreted as each mode replacing the one that came before it. Far from that, each mode would be added to the previous one. We can thus say that, although chimpanzees are probably capable of ascribing visual perception to conspecifics, they continue to rely on the phylogenetically old mode of processing another’s eye. Chimpanzees (like we humans, as well as many animals outside primates) are quickly able to see eyes staring at them, and respond automatically with an increased level of alertness (or of readiness for interaction).1 Why am I insisting on this accumulation of the first two modes in chimpanzees? It is because from this starting point I can pose the decisive question, namely, whether chimpanzees would or would not be able to apply their recent evolutionary attainment to the situation that properly belongs to the phylogenetically old procedure. The question is therefore the following: can chimpanzees continue to ascribe a visual perception to a conspecific when the conspecific is staring at them? Or are they, in this particular case, confined to the phylogenetically old procedure, which will alert them to eyes staring at them, but which involves no perceptual ascription to those eyes? I will argue in favour of the latter option. Chimpanzees, while they have the second mode and also the first mode, would not be able to combine both. The combination of these is precisely what I propose to call ‘the third mode’, that is, my ability to ascribe visual perceptions to the eye staring at me. A possible objection here could be that non-human primates are able to perceive that another individual is looking at them, say in a conflict. 
Or, more precisely, that in the experiments conducted by Tomasello, Call & Hare, surely the subordinate chimpanzee is not only aware when the dominant one can see the food, but is also aware that the dominant one can see him. I fully agree. The latter awareness can be explained by means of the first mode of processing other individuals’ eyes. The only thing that my hypothesis rejects is that at that very moment chimpanzees are able to ascribe visual perceptions to conspecifics.

1. Humans detect faces with direct gazes among those with averted gazes more efficiently than they detect faces with averted gazes among those with direct gazes (the ‘stare-in-the-crowd’ effect; cf. von Grünau & Anston [1995]).


3.3

Why would the ‘third mode’ be so demanding?

3.3.1 The peculiarity of visual perceptions ascribed in the ‘third mode’

As soon as we have formulated this option or theoretical bet, hypothetico-deductive reasoning requires the process of combining the first two modes to be extremely difficult. (Perhaps experimental work by Conty et al. [2010] might provide a small amount of evidence in favour of this difficulty.) If it were not, chimpanzees, which possess both modes, could not be deprived of the third mode. From here, a question forces itself on us. Why would transferring chimpanzees’ new ability to the field of the old ability have to be so difficult? There would be no real value in continuing to set out the suggestion if there were no answer to this question. So, before anything else, let us try to respond. Certainly, we know that any ability which evolution has attained will hardly ever be lost later in evolution (although there are cases). However, the difficulty cannot lie here. Note that this transferral, far from implying any loss, means only that the two different resources, ancient and new, would both be applicable to the field of the old resource. Why, then, would chimpanzees be unable to ascribe visual perception to the eye staring at them? Our first answer is that this type of ascription is very different from any other. The perception ascribed there is one that in no event could be that of the ascriber. Seeing oneself as a distal stimulus is a perception radically and intrinsically different from one’s own perceptions. Let us examine this point. Chimpanzees, as we have said, detect the front/rear axis of the body of the conspecific, and thus calculate the direction in which the conspecific was looking. Then (by moving in whatever way is necessary, which does not at all imply imitating the movements or postures of the conspecific) they supply themselves with the visual field that their conspecific was looking at.2 In other words, chimpanzees obtain from reality the contents of the visual perceptions that they are ascribing to conspecifics. Certainly, this does not prevent the ascription from being correct: it is just the same collection of objects that they and their conspecifics are looking at. However, the content of the perception ascribed to conspecifics is primarily the ascriber’s own perception. This is precisely what changes when I ascribe to the other a visual field in which I am included. I will never be able to obtain this content with my own eyes.

2. Piagetian spatial décentration would therefore be accessible to chimpanzees. They may not be able to overcome the complexities of the three mountains test (Piaget & Inhelder [1956]), but the nucleus, at the very least, of what decentration there was in that task is perfectly accessible to chimpanzees. The fact that Piaget made no distinction between this spatial decentration and the mental decentration (i.e. thinking ‘according to someone else’) to which he points in his ‘Commentary on Vygotsky’ was probably one of the most significant curbs on the Piagetian model. I have harboured this idea for over twenty years (Bejarano, 1985). In contrast, I have only recently noticed another deficiency in Piaget’s approach: the failure to differentiate between the two kinds of spatial decentration which correspond respectively to what I have here labelled as the second and the third modes of processing the eyes of others.




3.3.2 Radically not-own visual field, expectation, simulation

How can we reach that radically and intrinsically not-own visual content? The only way of understanding that content would lie in imagining a radically not-own interiority. We can thus conclude that such content would be as cognitively demanding as a metabelief (or, according to the terminology of studies of ‘Theory of Mind’, a ‘false belief’). It is obvious that I must detach false beliefs from me and set them aside, as my knowledge of the world cannot include them. Such beliefs must be attributed to an interiority different from my own. But the conception of that interiority, I stress, would not have to wait for the understanding of false beliefs of others. Less is necessary for it to emerge. It is only necessary for a visual perception that includes the ascriber to be ascribed to the conspecific. That particular ascription of visual perception is the truly ideal term of comparison we were longing for in the previous chapter (2.3). The type of ascriptions found in chimpanzees can be differentiated from a metamental state without having to step outside the ascription of visual perception. The third mode of processing the eyes of others would therefore be highly demanding. We have already said this, but we shall try to explore it more deeply. The key to its demands lies in the fact that here (that is, in this ‘third mode’), the old resource of expectation – which had sustained all animal behaviour effectively – breaks down for the first time. Expectation is no longer useful. By contrast, the ‘second mode’, that is, the ability to ascribe visual perceptions to conspecifics, would require nothing more than an expectation (even if it is an expectation which emerges, not before the result of the movement, but in view of the result already achieved in the conspecific’s body, that is, an a posteriori expectation). Think back to how expectation was defined in Chapter 1. All expectation is expectation of a state that is, in principle, reachable for the subject. When chimpanzees gauge a visual field from the location, posture and orientation of a conspecific, they would actually be gauging the visual field that they, the observers, would obtain if they were in the same circumstances. In addition, as we stressed in that same chapter, expectations (both a priori and a posteriori expectations) are always empty profiles that the subject could satisfy or fill. This is what we often find in the ‘second mode’. After observing their conspecifics, chimpanzees (by moving in whatever way necessary) supply themselves with the visual field that their conspecific was looking at, or, in other words, satisfy the expectations that their observation had raised in them. None of this is of any use when we face the ‘third mode’. Visual perceptions that are now ascribed to the seen eye – or, more generally, contents that are ascribed to a self who is looking at me, addressing me or communicating with me – are intrinsically unreachable for me. Thus, this ascription is an impossible task if expectation is the only resource. But the impossible task became possible with simulation. Simulation, that is, a second and novel centre within the mind, would necessarily be involved in the third mode of processing the eyes of others. This is what I shall hypothesise. But we should first re-examine more calmly what we have just called the ‘radically not-own visual field’.

3.3.3 Self-recognition in the mirror and perception of a radically not-own visual field: Facing a potential objection

We will pause here from our string of arguments to ask the following question: would self-recognition in a mirror be sufficient to provide the subject with that visual field which we were viewing as impossible for him? We have said that the radically not-own visual field includes myself as a distal stimulus. Can I not, however, perceive such a visual field in the mirror? Certainly, the mirror is a cultural artefact of which we can only find very poor approximations in nature. However, leaving aside the question of whether it played a role in evolution, let us address only the question mentioned. Is self-recognition in the mirror sufficient to obtain a radically not-own visual field? Let us imagine the answer was affirmative. The suggestion we have been defending (requirements of the ‘third mode’ can only be satisfied by the human brain) would then be in trouble. Remember that self-recognition in a mirror is an ability that chimpanzees (and probably also other species) do possess. The importance of this question to us is therefore clear: does the self-recognised image in the mirror equal the visual field of those who are looking at me? Certainly, the image in the mirror in which I recognise myself is similar to the visual field of the other who is looking at me face to face. The similarity is complete, except for the right/left axis.3 However, once the self-recognition of the image in the mirror has been achieved (and this self-recognition must be assumed in order even just to pose our question), this image would be a very peculiar one. Each subject would perceive it as being as non-distal in relation to himself as his own hand. Attempting to confirm that it is one’s own hand can sometimes cause a subject (certainly, as long as the subject is not grasping an object or hanging from the branch of a tree: remember 1.5) to execute manual movements in order to check whether he sees them or not. The first self-recognition in the mirror also implies this kind of check. But, I stress, once the understanding that it is one’s own image has been attained, that image will no longer be able to be considered distal, and the subject will no longer be able to consider it as the body of another. On the contrary, within the visual field of the eyes of others that includes myself, my image will be considered truly distal. This marks a profound difference between the two perceptions, mine and not-mine, however much they share the same visual content. So I am inclined to conclude that self-recognition in mirrors, precisely because it is self-recognition, does not offer at all the radically not-own visual field that we have linked to the ‘third mode’.

There is another, different way of reaching the very same conclusion about self-recognition in the mirror. Several authors – mainly Mitchell (in press; Mitchell [1993] is a classic study on chimpanzees’ self-recognition in the mirror) and Bräten (1998, p. 115) – have highlighted the contrast, to which we referred earlier, between one’s own image in the mirror and a not-own body facing us. In the not-own body (but not in the image in the mirror), the hand that I see facing my right is the left hand. Until recently, I had never considered this contrast to be important. However, now that I have seen how much attention these authors give to it, I have begun to think that those facts about the lateral axis fit very well with my suggestion about the basic human ability. The moment a subject can understand the break-up of his right/left body axis in the not-own body would be good evidence that simulation has replaced expectation. We shall now examine this in more detail. When do I need to make room for inversion of the right/left axis? Precisely at the moment I pay attention to the interiority of a not-own body that I perceive to be approaching me. But this is the same as saying precisely at the moment when I must dispense with expectation and begin to use genuine simulation. Expectations – which, according to our hypothesis, can in no way be made of an interiority which I notice looking at me, communicating with me or addressing me – do not need to take account of this rupture of axis. The detection of the visual field of the conspecific through the ‘second processing mode’, as we said above, would actually be reduced to the calculation of our own expectations – what it is that we would see if we were in the other location, posture and orientation. It is always our own body with its own axis that intervenes there. 

3. That axis belongs to the body schema and, therefore, it could well coincide with the gravitational up/down axis when we are looking at ourselves in the mirror while lying down. “In our ecology, the axis perceived as inverted in the mirror turns out to be the horizontal one” (my emphasis): Navon (2002).
On the other hand, in order to detect an interiority which approaches me, or which is staring at me, I will have to conceive it as a self radically different from me. Any simulation of a movement toward the right would have to be detachable from the real space to my right. (In other words, ‘anatomical matching’ – not ‘mirror matching’ – of the conspecific that looks at me requires genuine simulation.) Thus, from that moment on, two different implementations would become necessary for the lateral axis, one spatial and behavioural and the other for motor simulation. I suspect that this may have quite a lot to do with the fact that the hemispheric specialisation of the animal brain is modified in the human brain.4 But how could that idea be demonstrated? Unfortunately, that is much more difficult than merely suspecting it. Animal hemispheric specialisation involves contralaterality and (see Meguerditchian et al. [2010]) some behavioral and brain asymmetries. But the human left hemisphere typically controls language and the rest of the movements that are culturally learnt. How would this particular separation of types of movement link with the two different implementations for the lateral axis? It is only in combination with these questions that the above-mentioned suspicion could really become relevant. Thus, let us leave this issue of hemispheric specialisation for the time being. None of this – none of this duality of implementations for the lateral axis – would be at all required for self-recognition in the mirror. This self-recognition, in order for it to be self-recognition, can in no case view the kinaesthetic interiority of the image as approaching the subject. For this very reason, there is no need for a right/left axis different to our own body’s axis. So we conclude that simulation would not be necessary for self-recognition in the mirror. Or, in the words of our previous question, self-recognition in the mirror does not equal a radically and intrinsically not-own visual field. (See Rochat & Zahavi, in press, whose conclusion about self-recognition in the mirror is close to this.)

4. In my view, we should investigate the changes occurring in brain activation in situations such as the following. A subject has been looking at himself in a mirror, the light slowly fades and the subject is deluded into believing that he is still in front of the mirror. But then, suddenly, he discovers that he is no longer in front of the mirror, but in front of someone who is staring at him. Surely there must be evidence of many brain changes at that very moment, for example, changes corresponding to the subject’s surprise or a state of sudden alert. However, some neurophysiological change attributable to simulation could perhaps also be seen. Could that possible neurophysiological change have something to do with hemispheric specialisation? Needless to say, the same experiment would have to be carried out with chimpanzees. What are the cerebral changes which, in that situation, are common to both humans and chimpanzees, and what are, on the other hand, exclusive to humans?

3.4

The communicative use of sight direction

We have just suggested that ascribing visual perceptions to the eyes staring at me would be exclusively human. What data can we offer to support it? We might begin by finding the adaptive advantage of that ‘third mode’, that is, by addressing the communicative use of eye direction. From just before they are a year old, children know that someone is asking them to look at an object when that person looks at the object and at them in turn. This communicative resource, which (almost) always accompanies the hand-pointing gesture, can frequently occur without it.5 What is involved in understanding this communicative use of eye direction? Of course, chimpanzees’ own ability is clearly involved. The perception of the object needs to be ascribed to the conspecific. Chimpanzees are capable of all this, according to Tomasello, Call and Hare. But now this ability would be included within another completely new ability. This ascription must continue to be made when the eyes of the conspecific move on to look at oneself. If that immediately preceding perception does not continue to be ascribed, understanding that the other subject is pointing that object out to me will not be possible. Thus, two things – perceiving the eye staring at me, and ascribing visual perceptions to the conspecific – that are each accessible to chimpanzees, will have to be done at the same time in order to understand the communicative use of sight direction. This simultaneity of both things is what would be beyond the abilities of non-human primates.6 In the remaining chapters in this section, I will attempt to extend this explanation to other behaviours present in human beings and absent in chimpanzees. But right now I need to bring in other data in support of the idea itself, namely, the idea that that particular kind of ascription of visual perceptions would be exclusively human.

5. More concretely, small children are unable to understand the meaning of any hand-pointing gesture in the absence of this gaze-shifting. Admittedly, in the middle of a conversation, adults’ understanding can dispense with the gaze-shifting, but small children definitely cannot do so.

3.5

The white of the eye

In order to examine this new evidence, let us begin by summarising the suggestion here presented. On one hand, I must receive the impact of the staring eye. On the other hand, I must ascribe visual perceptions to that eye. The difficult part, we have suggested, would not be the ascription itself, but the ascription while the eye is looking at me; or more generally, the difficult part would be the ascription of interiority to conspecifics while they are looking at me, approaching me or communicating with me. Let us assume this hypothesis to be true. Let us assume that the great achievement, exclusive to humans, consists of the transfer of ascription already mentioned – its transfer from the moment the other looks at an object to the moment the other looks at me. If we accept this assumption, we will then be able to ask how that transfer could be facilitated. In chimpanzees, the detection of visual fields of others made use of the conspecific’s body schema. The direction of the torso was the key element, whereas the eyes were of little importance (think back to the observations which gave rise to Povinelli’s apparent arguments).7 However, with a view to enabling ‘the moment when the visual field of the fellow is detected’ to link up fluidly with ‘the moment when the eyes of the fellow will be staring at me’ (i.e. with a view to facilitating the evolutionary step forward), it will help if the visual field of the fellow is detected by paying attention to his eyes themselves. Thus, his eyes of the first moment will link more easily with his eyes 6. We might remember at this point that, in the mid nineteenth century, von Baader proposed putting Descartes’ Cogito into the passive voice. In a similar way, instead of “Cogito, ergo sum”, we might say “I know that I am thought, therefore I am human”. In other words, the primate who notices that it is being thought by someone else is the only one that is human. 
(It is only with this particular knowledge that the genuine concept ‘we’ can arise. As I proposed supra, in 1.4.2, mirror neurons do not support that concept.) 7. Certainly Povinelli has contributed substantially to the acknowledgment of this fact, but the suggestion has been also adopted by rival scholars: see Kaminski et al. (2004) and Tomasello et al. (2007). That humans like to look at the eyes much more than chimpanzees has also been confirmed by Kano & Tomonaga (2010).

Chapter 3.â•‡ The three modes of processing the eyes of others 

of the other moment. The eye will be the same throughout those moments, except in the one difference that matters, i.e. the direction of the eye. In short, it can be deduced from our proposal that the third mode of processing would be facilitated if the travelling of the eye became more conspicuously perceptible. This fits perfectly with the fact that human beings and only human beings (at least amongst the species alive nowadays), possess the ‘white of the eye’.8 The adaptive function of this feature appears very clearly within the framework of our description of the third mode of processing. The human eye is exceptional amongst primates because it presents a very visible white of the eye (we make mention once more of Kobayashi & Koshima [2001]).9 Of course, apes’ eyes can be highly expressive (Holloway [2003]). But the human particularity that interests us here relates not to expressivity, but to a different question. Where the white of the eye is highly visible, the travelling of the iris becomes conspicuously perceptible.10 This idea of the adaptive usefulness of the white of the eye in humans has also been highlighted by other authors (Hurford [2003]; Gliga & Csibra [2007]; see also Csibra & Gergely [2007]). However, I have re-described it to fit into my hypothesis about the human basic ability, i.e. about the second mental centre. In my formulation, a showy white of the eye is insufficient to guarantee genuine communication via eye direction. The ability to ascribe internal states to the eyes looking at me is also necessary for me to understand communication as such. I want to make clear two points on the way I use the idea of the adaptive usefulness of the white of the eye. Firstly, the primary function of the white of the eye is not to stress the direction of the look: It is perfectly easy to infer someone else’s gaze direction from the direction of his trunk and head (chimpanzees and ravens clearly reveal this possibility: supra, 2.2). 
The white of the eye emphasises the travelling of the iris, i.e., the iris shifting from the object in question to the signal recipient, and this travelling permits the bringing together of those two moments which remain completely disconnected for chimpanzees. Secondly, we can certainly say that the human eye, with its 8. White patches are very used in communicative signals. For example, human and non-human primates use the white patches of the teeth in order to generate signals. Thus, Bouissac (2010) uses the term leucosignals (from Greek leukos = brilliant white). 9. By contrast, “for those primate species that have acquired independent eyeball movement (i.e., independent from head movement), the combination of the features of eye shape (i.e., the features that make possible scanning with eye ball movement) and dark brown sclera might be adaptive in a context that requires collecting visual information inconspicuously. These features (or, more concretely, this low-contrast eye) can thus be interpreted as a counter-strategy against high sensitivity to the others’ eye gaze.” (my emphasis): Kobayashi & Hashiya (in press). In addition, these authors show that “the proportion of scanning with eyeball movement alone per total scanning correlates with group size and neocortex ratio”.) 10. This conspicuousness would have been all the more necessary the further we travel back toward the origin of the third mode of processing. Incidentally, from this we can make the hypothetical deduction that fair irises would not have been frequent when the feature originated.


white sclera, is “a cooperative eye” (cf. Tomasello et al. [2007]). However, according to my hypothesis, it was only after the emergence of the ‘third mode’ (only after and as a result of that) that the white of the eye could become useful for cooperative or mutualistic behaviour. In the next chapter, I shall focus on this question. Before moving on to look at another point, we shall comment further on the question of the white of the eye. The adaptive usefulness of this feature would come from the third mode of processing. It is thus plausible that the white sclera would not be prior to such a processing mode. As a result, it would be extremely interesting to find out which was the first species to have the white of the eye. Did Neanderthals have it? DNA analysis may soon be able to give us the solution to those questions.11 Guillermo Lorenzo (see his elaborate generativist view of the genesis of language in Lorenzo [2003]) believes (personal communication) that Neanderthals’ rich cultural heritage invalidates the possibility that they lacked pointing gestures. That culture is indeed a rich one. We must escape the temptation to look down on it. However, this is not the question; instead, what we need to ask is whether attention towards the adult’s task could be provoked without pointing gestures. In my opinion, the response to that is far from obvious, and all we can do is wait. We might bring in other arguments, apart from the already mentioned white of the eye that is exclusive to humans, in favour of my description of the ‘third mode’. It is obvious that ‘self-conscious emotions’ – embarrassment, pride, shame and guilt – are relevant in this sense. “It is not the simple act of reflecting on our own appearance, but the thinking what others think of us, which excites a blush” (Darwin [1965, p. 327]; my emphasis). 
See also Lewis (1992) and (2000).12 But, although it is a subject to which I am greatly drawn, I shall not address it in this book: What interests me here is the series of abilities, that is, of biological innovations, out of which language could historically emerge. Thus, let us focus on the idea that the same specifications that we have proposed for the ‘third mode’ would also be necessary for human understanding of the finger-pointing gesture, as well as for four-hand tasks. I shall address this in the following chapters.

11. There is an exceptional report of a wild male chimpanzee who had exposed white sclera (Goodall [1986]). Finding further apes with such characteristics could be useful to guide the search in the genome of Neanderthals.

12. See also Leary [2004], Rochat [2009] and Zeedyk [2009]. In addition, it will be useful to read Jackendoff (2007, Chapters 10 and 11). Tomasello (2008) and (2009) focuses on shame and guilt in a context perhaps more similar (that is, more similar than the one used by the above mentioned authors) to the one of the current Section Two. However, Tomasello’s invocation of rules, indisputably adequate as far as the study of ontogenesis is concerned, seems to me no longer appropriate when we deal with the evolutionary origin. In short, in my view, the power of rules, the ‘construction of social reality’, could not come into being until later and as a result of the emergence of the ‘third mode’ or basic human ability. (Certainly, it has been repeated that language could not function without a ‘social contract’ because words are cheap. However, in my view, in the origins, each collaborative task, from pointing gestures on, would bring immediate results for each of the collaborators.)

Chapter 4

Pointing gestures

4.1

Pointing gestures in children

In children, finger-pointing gestures would be an extension and emphasis of the communicative use of the eye. Pointing gestures are always accompanied in children by the communicative use of the eye. This use of the eye can occur on its own, but in most cases it will be accompanied by a pointing gesture. There have been some anti-mentalist interpretations of pointing gestures in children. We shall start by looking at Vygotsky’s theory, later elaborated on by Bates, on the origins of the pointing gesture. Children, at first, would make an effort to reach an object, stretching their arm and body out as much as possible, but without succeeding. The adult accompanying them realises what the child wants and gives him the object.1 This is repeated on several occasions. And then, the child’s behaviour changes: while he initially made the gesture to reach for the object without any communicative purpose at all, he now begins to do it without any serious effort, just so the adult will see. As we have seen, in the original formulation, the child, at first, wants to reach the object. By contrast, in Delgado, Gómez & Sarriá (2009) and also in Carpendale & Carpendale (2010), infants first use their extended index finger as a manifestation, and probably also an enhancement, of their own attention. But this difference is irrelevant. It is the very nucleus of Vygotskian theory that we cannot accept. We should note that this theory clashes with a well-established datum about children’s acquisition of the communicative pointing gesture, namely, the precedence of understanding the gesture over producing it (see, for example, Corkum & Moore [1995]). In recent years, there have been some anti-mentalist interpretations of this understanding. Moore (1999) is a good example. According to him, 12-month-olds would merely understand that an adult’s pointing gesture normally precedes an interesting experience. 
As you will see, this hypothesis, unlike Vygotsky’s, refers to the genesis of understanding and not of production. Correspondingly, Moore’s theory envisages a

1. What Vygotsky actually says is that the mother would inadequately interpret the child’s effort to grab the object as a requesting gesture: this is what has been called ‘the mother’s deception’ or ‘the illusion of intentionality’. This interpersonal mediation would be, Vygotsky continues, an example of the General Principle that he formulated (“superior processes would originate interpersonally and only later would they become intrapersonal”). As will be seen in other chapters, I agree with the Principle. However, I do not agree with that specific example.




less instrumental (or less actively instrumental) conditioning than the one contemplated by Vygotsky. But some research placed this behaviourist interpretation in difficulty. Certainly, infants can be conditioned to follow changes in direction both by people and by objects. However, “the temporal co-ordination between pointing, gazing, and vocalisations occurred at a much higher rate in the Person than in the Object condition. That infants produce these behaviours more often to people than to inanimate objects reveals that infants have different conceptions about people and objects, namely, that one communicates with people and not with objects” (Legerstee & Barillas [2003]). From this point on, therefore, I will assume that children understand the pointing gesture completely, at least from their first birthday onwards. Of course, this does not imply that the child understands the ulterior motives, wishes and intentions that underlie the pointing gesture that he observes. Children would only perceive that the adult is pointing out an object to them, but this would be precisely the essence of the pointing gesture. For quite some time, pointing gestures have been split into two types: imperative (also called directive) and declarative (Bates et al. [1975]). With the latter, the producer tries to attract the recipient’s attention toward an object, but not so that the recipient will give them an object. Why then? It is clear that the producer-child tries to get the recipient to talk about the things being pointed at, i.e. to name or talk about them. Language learners’ need of such linguistic nourishment is enormous, and declarative pointing is a crucial resource in supplying it. On occasions, the child will accompany the gesture with a term. 
That term – the verbal protodeclarative – will help the child to check that the meaning he is giving to the term is correct, or, in other words, to ‘negotiate meaning’, as well as serving exactly the same purpose as the simultaneous gesture. Both the verbal and the pointing protodeclarative do not only seek indiscriminate linguistic stimulation, but also create the ideal context to conveniently exploit that stimulation. The child will receive words that she does not yet know, but she will know that such words would have to do with the object that has been pointed at. The conclusion of all this is that the protodeclarative pointing gesture is a type of communication whose main usefulness is linked to linguistic learning. (See Southgate et al. [2007].) This usefulness has been tested by Bigelow et al. (2004). But the protodeclarative gesture can also fulfil the communicative function of getting the recipient to notice the object. Liszkowski (2006): “At 12 months old some of children’s pointing gestures intend to provide information (telling an adult the location of something the adult is looking for)”. This second type of declarative pointing, i.e., Liszkowski-style pointing, has very frequently been associated with altruism or cooperation, whereas imperative pointing has been associated, by contrast, with selfishness. But however valid the second correlation may be for the individual, it cannot account for the establishment of the communicative resource. The imperative type requires altruistic or cooperative recipients. It


is precisely the cooperation issue which has recently gained considerable attention in attempts to explain why chimpanzees in the wild fail to perform pointing gestures.

4.2

Why don’t apes point? Distinguishing the indirect cause from the direct cause

Why don’t apes point? Let us begin with the idea pursued by Hare & Tomasello (2004): chimpanzees’ abilities unfold in a framework of competition, not collaboration. In a chimpanzee clan, each individual looks for its own food. A piece of communication that intentionally seeks to transmit useful information would thus make no sense in chimpanzees. This call for attention certainly merits being taken into consideration. We must bear in mind the characteristics of each species’ lives. However, the importance of the competitive framework in the clan could perhaps be compatible with some degree of altruistic or, at least, mutualistic, collaboration. Since in a chimpanzee clan one gang of allies often confronts another gang, there are both confrontation and alliance. It is true that Tomasello (2008) distinguishes between ‘helping allies in a fight’ and ‘cooperating’. This distinction, however, would, at most, reveal different – low or high – levels of efficiency, or different – less or more general – types of altruism, but has nothing to do with motivation, which is a shared feature in both cases. As can be seen, I have refrained from invoking Boesch (2005) and the allegedly cooperative strategies used by chimpanzees in their hunting parties. I have focused only on the indisputable aid behaviours during fights. But there is surely evidence to accept a fair amount of cooperation: see de Waal (2006), who ascribes a large amount of altruistic behaviour to chimpanzees.2 But do not misinterpret my reluctance in the face of the connection between absence of cooperation and absence of pointing gestures. There is no doubt that the human way of life is much more dependent – incomparably more dependent – on cooperation. There are three main types of altruism as defined by the ‘commodity’ involved: goods, services and information (Warneken & Tomasello [2009]), and it is only the second type – helping behaviour – that Warneken et al. 
(2007) in their experiments have observed in both human-raised and mother-raised chimpanzees. I also completely accept the idea that some changes in habitats or niches would have

2. Chimpanzees share food, and foraging is a highly social affair. Like many other primates, chimpanzees possess specific vocalisations to announce that they have found a new food source. They cooperate regularly during hunting, territory defence, anti-predator behaviour and intragroup aggression, facts long known since the 1960s when studies of wild chimpanzees began in Africa. In addition, there is evidence that capuchin monkeys will help a human experimenter to obtain an out-of-reach object, irrespective of whether or not they are offered a reward afterwards (Barnes et al. [2008]). Marmoset monkeys spontaneously provide food to non-reciprocating and genetically unrelated individuals (Burkart et al. [2007]).




required higher levels of mutualism eventually producing the evolutionary emergence of hand-pointing or gaze-pointing. However, in my view, this higher demand of cooperation would have only indirectly caused that emergence. By contrast, the crucial step or direct cause would have been the emergence of the basic cognitive ability that I have called the ‘third mode of processing the eyes of others’ or ‘second mental centre’. Many authors take into account changes in habitats and ensuing changes in life style. Tomasello speaks of a more cooperative or mutualist way of life. Hurford (2007, p. 219) prefers to speak of a process of “self-domestication”. (See Hare et al. [2005] and Hare & Tomasello [2005]; see also Wellman et al. in press.) But it is perhaps in Bickerton (2009) where, in the light of the Niche-Construction Theory, the issue becomes particularly prominent. As I have just said, in no way am I opposed to this whole range of proposals. But this is not my concern here. After distinguishing between the direct cause (that is, the cognitive changes) and the indirect cause (that is, the increasing necessity to cooperate), I will concentrate on the cognitive changes. I agree with Tomasello, Call & Hare in their generous interpretation of chimpanzees’ abilities. Chimpanzees are presumably capable of ascribing visual perceptions to conspecifics, that is, they would have reached the second mode as I have already defined it. But I disagree with the idea that it is only out of a lack of motivation to cooperate that chimpanzees fail to point. In my view, the direct cause for the pointing ability of humans is the exclusively human cognitive ability that I call the ‘third mode of processing others’ eyes’ (or, more in general, the ‘duality of mental centres’). In short, according to Tomasello, the difference between humans’ abilities and chimpanzees’ abilities for a theory-of-mind is only a question of degree. 
However, I see a clear-cut line between abilities based on an a posteriori expectation (monkeys’ mirroring and the chimpanzees’ ability to ascribe visual perceptions) and abilities requiring genuine simulation. As for Povinelli (see mainly Penn et al. [2008]), I agree with him in stressing human exclusivity. But I disagree with his ensuing radical confrontation (i.e., his confrontation without bridges or interconnections) of mere conditioned learning in animals, and intellectual, exclusively human capabilities. In my opinion, if we pay due attention to the changes that the self-perceptible hand and the subsequent matching of one’s own body with a conspecific’s body provoked among primates, we are not depreciating human exclusivity, rather the contrary. In the terms used in this Section we can articulate it as follows: the best way of accurately understanding the complexity of the third mode is by comparing it with the second mode. But, leaving these general comments aside, let us go back to our main point and see whether or not a concrete conclusion of my hypothesis, namely, that the presence or absence of pointing has a direct and cognitive, non-motivational cause, can be refuted. Let us start by noting that apes held in captivity multiply their request behaviour in the presence of benevolent carers as a result of the cooperative environment they live in. I will try to show that this request behaviour presents observable features that are very different from those of human pointing.


4.3

The requesting gestures of the apes of Gómez and Leavens

Gómez (1998) focuses on request or imperative behaviour in apes, and the absence in them of finger- or hand-pointing gestures. His research was triggered by the monitoring of some gorillas that were being reared in a human environment. The smaller gorillas depended on carers to meet some of their needs. As a result, Gómez’s observations give a prominent role to requests, which are a behaviour that, while not properly collaborative, lies totally outside any competitive framework. From early on, gorillas learn a range of behaviours, according to Gómez. They tug at their carer and lead him to the place where they want him to do something, they take the carer’s hand to the point in question, or even push him in that direction, and so on. Even more important than this, however, is the datum that, when they begin any of these behaviours, the animals always make sure that the other individual is looking at them. It is not just that the gorilla, Gómez says, takes the person to a particular place; typically, the first thing the gorilla would do is attract the attention of the human to whom he is going to make the request. The way to attract their attention would typically be to touch them and wait until the person looks at the gorilla, with the result that the person’s and the animal’s eyes would meet. Only then would the gorilla take the human’s hand and carry out their act of request (normally by making more eye contact during the request itself). The gorillas observed developed a repertoire of gestures which specialised in attracting people’s attention before they issued a request: tugging at their clothes, patting their leg, touching their hand, turning their head... In every case, the gorilla would first make eye contact with the human, and then issue the request. (Earlier, Plooij had understood this sequence in baby chimpanzees which ask their mother to scratch them. 
Although at three months of age, baby chimpanzees do not bother to look before they ask, they will start to do so systematically before the time they turn one. See Plooij [1978]). The pointing gesture is, however, absent in this range of behaviours that apes spontaneously develop. Gómez explains this situation by stating that the request function, which is common to human and non-human primates, would nevertheless use elements that are different in our species and in apes. Certainly, while that formulation is irreproachable, I would like to continue questioning: why is the pointing gesture absent in Gómez’s gorillas? Of course, it is obvious that the specific gesture of pointing with our finger or with our hand can only be found in animals with hands. But this evidence only allows us to formulate this question better: what prevents chimpanzees or gorillas from discovering it? A potential answer would invoke Hare’s idea (2001). While requesting became a very frequent behaviour in the context of Gómez’s gorillas in the zoo, this would not happen in life in the wild. In the wild, not even baby apes would be much given to requesting. The concept of heterochrony, stressed in a later work by Gómez, 2004, would account for that difference between non-human primates and our children. Little chimpanzees or little gorillas, having a much earlier motor maturity, are (relatively)




more independent. Thus, both in babies and adults there are few occasions for request communication in non-human primates. In short, there was no opportunity in the life of such species to discover the economical request resource that is the pointing gesture. So, when an individual is brought up in the abnormal environment of a zoo, that individual will manage with request behaviours which, while less optimal, will be more rooted in their habitual range of behaviours. That response would be, I insist, a possibility. But, as I have already promised, I will propose another kind of explanation and provide an outline of the cognitive difference between any pointing gesture and that behaviour. However, we must first look carefully at the work of Leavens and his associates. These authors have shown, firstly, that “unlearned (i.e. with no explicit training whatsoever) captive apes frequently point to unreachable foods”, and, secondly, that “these are communicative signals because apes will not reach towards obviously unreachable food if there is nobody around to see them do it”: Leavens et al. (2005). Why are these observations so different from those made by Gómez? Unlike Gómez’s young gorillas, Leavens’s animals were encaged when they first made the pointing movement and thus were unable to use any of the request behaviours listed above. As a result, far from being their spontaneous first choice, pointing toward food is for these animals the only movement of approach to the food that circumstances allow them to perform. But let us focus on the crucial issue. In my view, this pointing, although it has a communicative purpose, is not a ‘communicatively shaped action’. Just the same could be said of the request behaviour performed by Gómez’s gorillas. But what do I call a ‘communicatively shaped action’?

4.4

Communicative action versus communicatively shaped action

Let us focus on the request behaviour we have seen in Gómez’s gorillas or Leavens’s caged apes. What does the recipient need to understand this request behaviour? The recipient must be able to infer the external goal pursued by the ape in need of help, namely, the distant piece of food or the opening of the closed door. Once the recipient has inferred that goal, there is a chance that his or her helping altruism, together with the primary intersubjectivity which the animal in need of help has been trying to encourage, provokes an adequate response. In short, to be able to understand this request behaviour, the recipient must simply interpret it as behaviour intended to interact with the environment. The producer certainly performed the request behaviour so that the recipient could see and understand it, that is, by virtue of a communicative function. However, this behaviour can be understood as being a sensible interaction with the environment and with the external goal pursued, that is, it can be understood although the recipient does not interpret it as communicative. This, I maintain, is the case of the behaviour of Gómez’s or Leavens’s apes. If we accept this, we must conclude that the comprehension of the requests recorded by Gómez and Leavens would be


entirely within the reach of the abilities widely acknowledged in apes. Thus, this complies with a desideratum for any spontaneous communicative behaviour, namely, that, to be understood by its recipient, the behaviour involved must only require abilities possessed by the species of the individual performing it. Let us compare this description of the behaviour of the animals of Leavens or Gómez with a classic statement made by Seyfarth & Cheney (2003, p. 159) on animal communication: “listeners (or recipients) obtain information from a caller (or sender) who may not have intended to provide it.” As can be seen, the movements that animals of Gómez or Leavens directed towards the food or the closed door clearly exceed that situation, since the sender intends to provide information. However, they are still confined to an old type of communication, since their communicative success does not depend on the recipient’s ability to perceive them as being communicative. Let us look at the difference with the case of a genuine pointing gesture. If a pointing gesture is not interpreted as communicative behaviour, then it cannot be understood. An arm extended in the air would be seen as an absurd, rather meaningless piece of behaviour.3 Curiously enough, Tomasello, who stresses how strange any mimicry can be for a recipient if the gestures involved are not interpreted as being communicative – “the recipient will see my iconic gestures as some kind of strangely misplaced instrumental action” (2008, p. 149 and also p. 203) –, never says anything of the sort about pointing. In my view, however, mimicry and pointing are equally strange and absurd if they are not interpreted as being communicative. By contrast, the requests performed by Gómez’s or Leavens’s apes appear, I insist, as entirely sensible behaviour even for recipients who do not understand it as communicative. 
These communicative behaviours are ‘environmentally shaped’ actions, or, in other words, they are not ‘communicatively shaped’ actions. In short, I am hypothesising that it is the understanding of communicatively shaped actions that chimpanzees lack. In order to interpret an experiment conducted by Hare & Tomasello, 2004 (see also Tomasello [2008, p. 40–41]), it could be useful to bear in mind this lack of understanding. More precisely, it could be useful in order to interpret the results of the experiment in a very different way from that suggested by these researchers. Hare & Tomasello (2004), in support of their explanation based on the absence of cooperative motivations, point out the astonishing contrast between chimpanzees’ inability to understand a ‘benevolent’ (i.e., cooperative) carer and their

3. Do you say that much of the animal behaviour performed to alter the behaviour of other conspecifics or animals is equally ‘absurd’? My reply is that we must distinguish between two kinds of communicative behaviour. On the one hand, there are innate communicative signals whose production and reception have been phylogenetically linked and, consequently, can never be absurd for the individuals of the species performing them. On the other hand, there are the requests of the apes of Gómez or Leavens, which are, by contrast, improvised (Tomasello [2008, p. 20] insists on the point that the gestures of apes are much more flexible than their shouts). It is in relation to these requests that I maintain that for them to be understood by apes, they must be an ‘environmentally-oriented type of behaviour’.




extremely skilful knack of understanding a ‘competitive’ carer. According to the hypothesis of these authors, the key point marking the difference between the success or the failure of the chimpanzees is just that: the carer’s attitude. Tomasello and Hare conclude that chimpanzees do not understand cooperative behaviours. But, if we read the proceedings of the experiment with care, we see that the lack of understanding took place in the face of a hand-pointing gesture, while the animals’ understanding, by contrast, was triggered by an attempt to reach a container (despite the obstacle presented by a hole in a curtain that was far too small), that is, by environmentally-shaped behaviour. Consequently, results would be more clarifying if the contrasting pairs could be dissociated, that is, the pair ‘competition and environmentally-oriented behaviour’, on the one hand, and the pair ‘cooperation and hand-pointing gesture’, on the other.4 Contrary to Tomasello and Hare, I think that it is only the second element of each pair that makes the difference between success and failure. ‘Hand-pointing gesture’ (or, in other words, ‘communicatively shaped action’): Chimpanzees’ failure. ‘Environmentally-shaped behaviour’: Chimpanzees’ success. Comprehension of the communicative use of eye direction, on the one hand, and comprehension of the pointing gesture, on the other, would both have the same crucial requirement. That practically all pointing gestures are accompanied by alternating looks is, as I see it, a good clue. Just as both behaviours are related to each other, the respective underlying processes would also be related. Infra, in 4.9, we will try to go deeper into the analysis of those underlying processes. But beforehand, it is convenient to underline certain points.

4.5

Commenting about Grice and also about triadic communication

Here we must remember Grice’s description of what speakers (utterers, for Grice) mean. The speaker intends (1) that the audience believe something or go on to do something, and (2) that the audience recognize the intention (1).5 Focussing on comprehension of pointing gestures, from Grice’s research we can derive the formulation that recipients must perceive the producer’s intention of making them look at an

4. Admittedly, I cannot think of an experiment designed to test the new pair competition/pointing gesture. What about an experiment combining an unfriendly carer, that is, a competitor, who hand-points towards another likewise ‘unfriendly’ carer? Needless to say, this experiment would not do, because the understanding of a gesture pointing to a third person would very likely be different from the habitual understanding of the gesture. (But a gesture to a third party is possibly an interesting fact for primates: See Teufel et al. [2010]: “One specific facial expression that is given in response to social interactions between third parties was particularly efficient in eliciting gaze-following responses”.)

5. As you can see, in order to stick to what is of interest right now, I have cut down to just two points Grice’s formulation, which even in its more primitive version (Grice [1957]) was made up of three points.


object. This is tantamount to what I have been calling ‘the third mode of processing the eyes of others’. In that case, a possible objection could be: is there any point in proposing new terms to designate something which already has a familiar, almost classic label, namely ‘the understanding of utterers’ communicative intention’? My answer is that Grice’s description adequately depicts only one of the possible applications of the ability concerned. Note that Grice’s description fails to account for ‘self-conscious emotions’ – embarrassment, pride, shame and guilt – (see above, 3.5) or ‘four-hand-actions’ (our issue in the next chapter). In these cases it is not a communicative intention of others that the subject perceives. Therefore, I prefer to continue using my formulation, that is, the concept of ‘third mode of processing others’ eyes’ (or, more in general, the duality of mental centres). Anyway, it is very important to distinguish explicitly the communicative intention I am talking about from another communicative intention on the part of the producer, whose perception would involve much less demanding requisites. Note that it is extremely easy to perceive a readiness for interaction or communication in the eye which (as happens, for example, at the end of a pointing gesture) fixedly stares at the recipient. This perception could require only the ‘first mode of processing another’s eye’, or, more precisely in this case, primary intersubjectivity. But this would not be enough to understand a pointing gesture. Consequently, it is important to make perfectly clear what the difference between these two perceptions of communicative intention is. Let us move on to a similar question. On one hand, I have described (above, in 3.3) the basic ability as being able to ‘attribute visual perceptions to the eye which is looking at me’. 
In this ability, sometimes there is no ‘third element’, i.e., there is no referent characterising triadic interaction.6 On the other hand, however, in order to show that basic ability, I am turning to pointing gestures or triadic communication. Let me explore this apparent contradiction in my position. As long as there is, in the subject, no triadic interaction or attempt to communicatively guide another’s attention toward some referent or other, we can have no basis for believing that this subject’s dyadic interaction actually involves the basic ability. Who can guarantee to us that those dyadic interactions are surpassing the phylogenetically very old ‘first another’s eye processing’? However, when the subjects have shown their ‘third processing mode’ in pointing gestures (eye- or finger-pointing), then we would certainly be authorised to interpret their participation in ‘mutual attention’ episodes as the ‘perception of an interiority which is looking at me’.

6. In this regard I seem to be moving closer to Reddy (2005), who insists that there is no need to wait for pointing gestures to appear if we want to observe the comprehension of another’s attention. But, in my view, we would have to clarify what is being understood by another’s attention in Reddy’s formulation: see supra, 3.2.




4.6 Some unavoidable issues which must be dealt with

4.6.1 Wild chimpanzees that extend their arm in the direction of an object: How could those gestures really happen and yet be so scarce?

It is likely that this lack of understanding explains a controversy that has recently arisen. Some primatologists claim to have observed chimpanzees in the wild making gestures in which they extend their arm and reach out their hand: in Vea & Sabater-Pi (1998), there is a very detailed description of a bonobo pointing twice to human observers who were hiding in some shrubbery. Other researchers are unsure, however. If these gestures did in fact happen, why have so few been observed? Why are such observations so utterly unusual? I believe that we can explain why such observations, even if real, would be so rare. We shall start by looking back to Vygotsky’s theory, later elaborated on by Bates, on the origins of the pointing gesture. Children, at first, would make an effort to reach an object, stretching their arm and body out as much as possible, but without succeeding. The adult accompanying them realises what the child wants and gives him the object. As I said in 4.1, we cannot accept this theory about the genesis of the pointing gesture in children. We can, nevertheless, bring the idea of conditioning to the controversy we have been addressing. A wild chimpanzee stretches its arm in the direction of an interesting object. Its movement might arise from the conflict of opposite motivations; or it is also possible that its movement has the function of re-picturing the object as an objective in its mind (cf. Delgado et al. 2009 about private pointing in children). The point is that, even though a chimpanzee produces such a movement, it will never find in another chimpanzee a recipient like the one in Vygotsky’s theory. Consequently, those observations about wild chimpanzees could be real. Their exceptional rarity would be entirely explicable.

4.6.2 Dogs and chimpanzees compared to the human pointing gesture

At this point, we need to create space for another issue. Hare et al. (2002) have shown that untrained dogs are sensitive to human gestures that point out a direction. They did not find this sensitivity, however, when they later studied wolves. As a result, they were able to conclude that dogs had been selected according to that criterion right from the very first time that humans had begun to make use of them. In a human environment, the best hunting or shepherd dog would have more opportunities to transmit its genes. This is not surprising: think back to Darwin’s initial inquiries with breeders of domestic animals. We assume, therefore, that sensitivity to human gestures of pointing would have been selected in dogs. Yet, I would like to continue my questioning. Originally, that is, in wolves, there had to be a base from which to develop toward the ability of our best dogs. Wolves hunt in packs and it is very possible that the head of that pack would decide which was the weakest of the group of potential prey. This decision would need to be known by


the other wolves. All in all, wolves would possess, in the context of the co-ordinated hunt, a natural sensitivity to specific gestures by the dominant individual. It is on this base that the selection of tamed dogs would have operated. This generous position as regards wolves’ abilities seems to have been confirmed and even strengthened by Udell et al. (2008). Once we have reached this point, we can ask the question that we are interested in: how close are dogs to the human understanding of the pointing gesture? Would they be closer than chimpanzees? I could only say that, in essence, there is no need to believe that dogs are ascribing even a visual perception to the producer of the gestures. Thus, chimpanzees and ravens might be much closer than dogs to the understanding of the gesture of pointing.

4.7 True pointing in chimpanzees brought up by humans?

Chimpanzees brought up by humans do eventually learn the pointing gesture. We must mention here Savage-Rumbaugh, on human-reared bonobos (see, for example, Benson et al., 2002). The success of this learning seems, in principle, to refute my explanation of the cognitive inability of chimpanzees for this comprehension and production. Am I not acknowledging, after all, that chimpanzees are able to learn how to hand-point? And, on top of that, don’t we teach children how to do something similar? This issue is the object of a hot debate.7 As I have already said, I align myself with the defenders of the difference between these learning processes. I believe that Vygotsky’s and Moore’s antimentalist explanations (which in 4.1 I have dismissed as an explanation for pointing in children) would be, however, adequate for enculturated chimpanzees. In short, a more or less instrumental conditioning could explain the production of gestures (remember Vygotsky’s proposal) or the reception (remember Moore’s proposal) by enculturated apes. In other words, on this issue I agree with Povinelli’s criticism of the argument by analogy. Above, in Chapter 2, I did not agree with his deflationary opinion on the abilities of chimpanzees in connection with a conspecific’s visual field, and, consequently, contrary to his stand, I found it correct to apply the argument by analogy. However, in the current issue, I definitely do not think that that argument should be applied.

7. I must say that Tomasello & Call, who were previously very keen on the influence of enculturation, have modified their view to some extent. Tomasello & Call (2004, p. 214): “It is likely that human experience only serves to modify existing social interactional and attentional skills, rather than creating new ones”.

This is not an arbitrary whim of mine. In my view, there are good reasons for differentiating between the two learning processes of pointing, that of children and that of enculturated chimpanzees. It is not only that the behavioural repertoire of




chimpanzees in the wild lack the pointing gesture, nor that my suggested explanation of pointing can also be used to explain ‘self-conscious emotions’ or the four-hand-actions. On top of that, we must also take into account a fact as genetic as the exclusively human white of the eye. As I said in Chapter 3, this feature yields, according to my hypothesis, a crucial contribution to a child’s learning. In addition, a very interesting neurophysiological fact has been recently discovered in small children: “Self-initiated joint attention leads to a differential increase of neural activity in reward-related brain areas, which might contribute to the uniquely human motivation to engage in the sharing of experiences” (Schilbach et al., in press). Nothing similar has ever been found in non-human brains. I find this discovery – this child brain programme for joint attention and the exclusively human intersubjectivity – fascinating. In my opinion, this enjoyment would be similar to that of play, which aims at inducing the subject to practice an ability that needs to be trained. But, could it not indicate something different? Or, more precisely, could this not be an indication against what I have been suggesting and, by contrast, favourable to Tomasello’s suggestion that the presence or absence of cooperative motivation is the only crucial factor? Let us return to this issue again.

4.8 Lack of motivation in chimpanzees? A more detailed look at the difference between Tomasello’s proposal and the one defended in this chapter

In my view, the difference between chimpanzees and humans in relation to pointing gestures is explained by a cognitive inability, not because of a lack of motivation. As I said in 4.2, the direct cause must be distinguished from the indirect cause. The indirect cause, namely, a minimally cooperative way of life, would certainly have prevented evolution from inducing in chimpanzees the acquisition of the third mode of processing others’ eyes, that is, the basic human ability. However, in my view, the direct cause would be cognitive inability alone. I certainly attribute two abilities to chimpanzees. First, the ability to attribute visual perceptions to their companions. Second, that chimpanzees, like many other animals, are sensitive to communicative signals being directed at them and also to eyes staring at them. However, in no way can we conclude from this that chimpanzees would be able to perceive the interiority of the conspecific looking at them, or the interiority of the communicative signals being addressed to them. The possession of two abilities does not ensure their simultaneous use. Have chimpanzees cognitive sufficiency for the pointing gesture? That is the key question. If they do, then this would need to be reconciled in some way with the fact that they lack the pointing gesture. This strategic reconciling role is the one that


Tomasello sees being played by chimpanzees’ lack of motivation ‘to share intentional goals’ (see Tomasello [2008], and also Hare & Tomasello [2004], Tomasello & Rakoczy [2003] and Tomasello et al. [2006]). In my opinion, however, this is unnecessary. I would not insist on the idea that an individual chimpanzee necessarily lacks motivation to help others. Instead, I think that the crucial deficiency of chimpanzees is cognitive. In the chimpanzees’ way of life, cooperation never became sufficiently adaptive so that evolution could provide them with the cognitive requirements involved in pointing gestures. The direct cause of their inability to develop pointing gestures must be ascribed to the absence of these requirements. Let me try to be a little more precise. Certainly any gesture indicating to one’s own allies the approaching enemy would be very useful both for the producer and the recipient. (This might have been the primaeval function of the evolutionarily initial pointing ability. That the white of the eye becomes bigger and, consequently, more conspicuous in threatening situations – cf. Whalen et al. [2004] – could, to some extent, be interpreted as favourable evidence in this sense.8) However, in the life style of chimpanzees, despite frequent struggles both between clans and intraclan factions, such an indication was by no means a sufficiently strong adaptive advantage. By contrast, in a scenario of, for example, “power scavenging” (Bickerton [2009, p. 161]), the absence of this kind of indication would have made the survival of the entire group impossible, and, as a result, evolution gave rise to the ‘third mode of others’ eye processing’ (or, in other words, the understanding of ‘communicatively-shaped actions’) and also to the white of the eye. (Take, I insist, just as an example that scenario of active scavenging: I am not at all committing myself to that specific proposal.)
It follows from the difference between my proposal and Tomasello’s that we would take up different positions on some possible data. Think, for example, of Nissen’s (1946) observation that a chimpanzee turned a conspecific around until it faced the direction a danger was coming from, and both chimpanzees then escaped in silence. Is this an over-interpreted anecdote, or is it, on the contrary, a trustworthy datum? The only comment I can make is that, if these data and interpretations were correct, they would be compatible with the hypothesis in this book. Certainly, in Nissen’s anecdote the producer pays attention to the recipient’s visual perception (we shall call “producer” the one who manipulated the conspecific’s body, and “the recipient” the conspecific itself). However, we should note that it is not necessary in such behaviour for the recipient to be paying attention to the communicative intention in order for the information to be understood by the recipient. The only important thing is that the recipient be positioned in such a way that the visual perception including the dangerous element can be ascribed to it. This case is very different from what we have said happens with the production of the pointing gesture and its two simultaneous requirements.

8. In addition, experimental work with human subjects has shown that “spontaneous visuospatial perspective taking occurs, at least more robustly, when a face with a fearful, but not with a neutral, expression is perceived” – Zwickel & Müller, in press.




Therefore, if Nissen’s anecdote and its interpretation were true, it would not hurt my proposal. In contrast, were this anecdote and interpretation true, it would become necessary to deduce a conclusion which was contrary to Tomasello’s hypothesis – it would become necessary to grant chimpanzees motivation to share intentional states. There is only, therefore, one clear conclusion: we need reliable data regarding this behaviour as soon as possible. Before we move on, let us look at the question of falsifiability. Although it can admit those apparent data, the proposal I am making holds back from predicting them, as I have already said. My point relates only to a cognitive inability in chimpanzees. Can chimpanzees desire the situation they would find if, contrary to what occurs in reality, they were able to carry out pointing gestures? Could they then, driven by that desire, try to accomplish it by using other, less economical means (along the lines of Nissen, for example)? I have no opinion about these questions; instead, I will make only the trivial comment that we must not attribute this desire to chimpanzees if the use of such less effective means was never observed. The ability to have a type of desire is an adaptive ability only if there is some instrumental means connected to the realisation of that desire. But beyond that point, there is no further prediction on my part. Falsifiability would not, thus, come to my hypothesis by this route. Only Tomasello’s lack of motivation theory would be falsifiable this way. But do not automatically condemn me. In many other paragraphs, both before and after this one, I scrutinise possibilities which, should they be true, would cause me to withdraw my hypothesis.

4.9 What are the requirements for a genuine understanding of pointing gestures? A closer look at the expectation/simulation dichotomy

4.9.1 Going back to the expectation/simulation dichotomy

Now it is time we also addressed the dichotomy between expectation and simulation in this field of finger-pointing gestures. We saw above, in relation to the ability to ascribe visual perceptions to a conspecific, that the expectation resource works up to a point. Beyond that point, expectation becomes bankrupt. The key to this limitation is that the inner contents corresponding to the fellow staring at me can never be an expectation of mine. What happens, then, in the case of kinaesthetic-postural and visual interiority of the conspecific and comprehension of the pointing gesture? If the perception of another’s interiority remained at the level of mere expectations, then the communicative reception of the pointing gesture would never be attained. Expectations of any kind involve the possibility of their fulfilment. For that reason, it will be impossible for A to have an expectation of inner states which seek to communicate with A. But if that communicative intention is not included in the perceived expectations, then the movements extending the arm and moving the hand forward, which were perceived in the conspecific’s body, will be absurd. As I said in


4.4, extending one’s hand in mid-air is meaningless if it is not related to communication. Consequently, the human-type comprehension of pointing gestures cannot be based on the old expectation resource, but on the exclusively human resource of motor simulation. This simulation must be ascribed to a radically not-own self. In the chapter on monkeys’ mirror neurons (above, 1.3) some readers may have felt uncomfortable at my selection of received gestures of hitting or begging as examples to be contrasted with the only manual movement able to activate mirror neurons in their mirroring-role, that is, the grasping movement. These readers might have felt that mirroring needs a certain ‘contemplative distance’ which does not occur when the not-own behaviour so directly and immediately affects the animal being considered as the subject; or, in other words, when the not-own behaviour is interacting with the subject. Now, here in Chapter 4, we can completely back up and justify that reticence, which was in the past entirely real in some of my students, in addition to being possible in some of my readers now. The mirroring applied to not-own movements that are interacting with the subject would be a more costly and demanding process than a non-human brain can afford. The key point here is to stress what is achieved with the third processing mode of others’ eyes (or, in other words, with true simulation, or the perception of a radically different self). With this acquisition a brain could, for the first time ever in evolution, attend to two lines of awareness (the primary mental centre and the second, simulatory centre). Obviously, the second line is not only partial and discontinuous, but also much weaker than the first line.9 However, despite its weakness, the second line is enormously important.

9. The second line is weaker because it is sporadic and cannot guide behaviour. (Flanagan [1992, p. 103]: “If it means having the experiences exactly as the experiencer has them, then this never happens; but if it means understanding another or conceiving of what things are like from another’s point of view, then it often happens.”) This weakness of the not-own interiority inside one’s own mind contrasts with the subject’s recognition of the fact that, in principle, the contents of the two lines are at a similar level. I would suggest that this contradiction, inherent to the second mental line, between an actual lower-rank status and a theoretical equal-rank status, is one of the defining features of human beings. Neither interests of others nor the remote future are as intimately related to biological forces as are the contents of the first line. Cf. Albrecht et al. (2010): “Our results support the hypotheses that participants show less affective engagement (i) when they are making choices for themselves that only involve options in the future or (ii) when they are making choices for someone else.” (It is well known that humans discount the value of future rewards over time. This discount – although, in my view, it is a less interesting issue – can be more easily studied than the role of interests of others: Metcalfe & Mischel [1999].) Despite this fact, however, a human being knows objectively that both interiority of others and his own interiority (or both the remote future only envisaged by evocation and the immediate future connected to expectation) are similar. This problem, this contradictory duality of opposed estimations, appears alongside the ability for a ‘theory of the mind’. This contradictory state can only be alleviated by means of an effort aiming at focusing on the complete known reality (e. g., Boyer [2008] has suggested that the capacity for episodic future thought – also referred to as prospective mental time travel – may underlie the human ability to make choices with high long-term benefits). It is an issue which interests me deeply (Bejarano [2010b]), but one for which there is not space in this present book.

4.9.2 The different manners in which somebody else’s body may be informative

I would like at this point to refer to a position from philosophical anthropology that I find particularly sensible.10 This position has been frequently neglected and ignored throughout history, but reappears from time to time. The human body would be a mixture of opacity and transparency. On the one hand it hides its thoughts and, in general, its mind or personal interiority; in this sense, the body acts as a genuine barrier. On the other hand, however, the body is a bridge (the only possible bridge) toward that interiority, and is thus the condition that enables us to open up to others.

10. Hierro Sánchez-Pescador (1990) pointed out that, in order to overcome its disorientation, the philosophy of language should be connected to anthropology. That recommendation, which I considered to be so lucid back then, now seems to be completely indisputable.

Needless to say, it is not only when confronting a human gaze that the body is informative. Such a restriction would be obviously false. Communicative innate signals or, likewise, the circumstances allowing the ascription of a given visual field to another individual’s body, are also perceptible in the body itself. They are, obviously, only for perceptive abilities permitting it in each case. However, given those abilities, the conspecific’s body is informative. That informative function of the body can be found, I repeat, in many animals. But now I want to deal with its role as a bridge towards the perception of a radically not-own self, a perception which, according to my hypothesis, would be exclusively human. I think we should analyse further the relationship between another individual’s body and the perception of a radically not-own self. If we focus on the exact moment in which the individual who is pointing (after having looked at the object) turns his or her head to look at the recipient, we will realize that the only information that the producer’s body can provide at that very moment consists of signalling a readiness to interact with the recipient. Nothing else – nothing relevant for the comprehension of the pointing, I mean – can be extracted from the perception of the producer’s body. At that very moment the producer’s body goes on stage only to interact with the recipient. Now his or her body is neither looking at the object in question nor ready to establish any relation with that object: the producer’s body seems only ready to interact with me. In short: if we do not ascribe to the producer interiority in clear contradiction with what his or her body at that very moment is proclaiming, we will not be able to understand the pointing gesture. The genuine complexity involved in triadic communication derives from the fact that only one single relation can be directly perceived in a body at every moment. Either you perceive in that body the innate communicative signals (of readiness to interact with you), or you perceive the relationship between that body and a given object, but never both things at the same time. The only way to go beyond this is to conceive an interiority different from one’s own, that is, a second


mental centre in one’s own mind. That radically different interiority would have his/ her own accumulative sequential line, a line where the current moment is the heir of the previous moment. However, is this accumulative sequence not in fact the case of a chimpanzee’s ascriptions? Remember the third experiment mentioned in Tomasello, Call & Hare (2003). A lower-ranked chimpanzee, having seen that at the moment m the dominant individual was looking at the food, is able to foresee that at the moment m + 1 the higher-rank animal will go for that food (supra, 2.2.1). Is it not true that this chimpanzee’s ascription involves the same accumulative sequence that I have presented as typical of human pointing gestures? No, I think there is a crucial difference. Admittedly, the lower-ranked chimpanzee’s forecast relies on the visual perception that it has previously ascribed to the higher-ranked animal. At that very moment m, the ascription of a visual perception ipso facto generated the forecast of future behaviour directed at the food, and that forecast remains active at the moment m + 1. No objection to that, I insist. But at the moment m + 1 the higher-ranked animal’s body reveals nothing that contradicts the forecast. It is just that – i.e., to contradict the forecast – that the producer’s looking at the recipient provokes at the second moment of his/her pointing gesture. At the previous moment, the producer’s look and hand directed both towards the object might be remarkably clear and perceptible signals of the producer’s probable future actions on the object. However, immediately afterwards, the producer’s body interrupts his relationship with the object, puts an end to those apparently announced actions and reveals a completely different readiness, a readiness to interact with the recipient. (In other words, in my production of the pointing gesture, “for you signals”, as Tomasello [2008, p. 
96] calls them, cannot be simultaneous with the signals of my focus on the object.) Consequently, only by perceiving the self, or mental line, of the producer, can the recipient avoid being deceived by the apparent cancellation of the previous indications. Only by conceiving an interiority different from his, that is, a second mental centre in his own mind, can the recipient truly understand pointing gestures. The recipient must understand that the inner state at the second moment aims at communicating something about the object looked at at the first moment. Or, in other words, he must understand that the gazing at the object at the first moment was only preparatory to the second moment. Focussing again on the informative role which the perceived human body plays for a human perceiver, I would stress that the informative body is, first and foremost, the body in all its movements, i.e. its visible as well as its audible movements. It is in not-own movements that not-own interiority can be perceived. If we keep this in mind, the almost esoteric aroma of terms such as ‘mind reading’ disappears and gives way to fairly concrete explanations. We find comments on this term, as well as, of course, the hope that somehow adequate explanations may be found, in a number of authors: see, for example, Sabbagh & Baldwin (2005). Needless to say, therefore, that my view has absolutely nothing to do with the debate between ‘embodied cognition’ and ‘theory of mind’ (see, e. g., Spaulding [2010]). In




my view, this alleged opposition (‘embodied cognition’ versus ‘theory of mind’) rests on a very constrained understanding of the ‘theory of mind’, an understanding that identifies it with a single one of its subcurrents. The distinction I am interested in is not the one at the core of this debate, I insist, but that of the three modes of processing a conspecific’s eyes, or, more concretely, that between expectation and genuine simulation. But, let us continue with the relationship between the third mode and the movements of someone else’s body. Language, only perceptible in the results of its motor component, would be, in my opinion, the paradigmatic example of this opportunity to grasp mental contents of others. Analogously, some 20th-century conceptions about language are also typical of the persistent reluctance to accept the above mentioned philosophical position: from considering any mental reference as a taboo, research shifted to concentrating on artificial intelligence and computational cognitivism. Neither of these extreme approaches paid due attention to the peculiar bridging role played by the human body. However, it is not in language, but in other less complex abilities, that the origin of this role is to be found. I have dealt here with the human ability to point using a finger and the eyes. It is in these skills that we must look for the first showings of that bridging role towards a radically not-own interiority. In addition, as I have tried to suggest, pointing gestures would be similar to several features that distinguish primates, and at the same time subtly different from those features. Studying pointing gestures could thus bring us closer to two understandings at the same time. On the one hand, we would be close to understanding why in those gestures the informative role of the producer’s body is peculiar, that is, why it is possible only with the human observer. 
In my view, the peculiarity that we have just seen at its initial stage, that is, the ability to understand a radically not-own interiority, is the grain of truth enclosed in the classic concepts of spirit or soul (cf., e. g., Barresi, 2010, or also Barresi & Moore, 1996). On the other hand, if the ascription of visual perceptions to conspecifics were to be confirmed in chimpanzees but not in other animals (see above, 2.2.3 and more importantly 2.2.4), we could also be moving closer to an explanation of why it was precisely among primates that this special animal, the human being, emerged. But let us leave aside all these premature comments and continue with the task in hand. Thus far, we have brought together two abilities. The comprehension of the communicative use of the alternating gaze and the comprehension, likewise, of pointing gestures would share a single crucial requirement. Now, in the next chapter, we shall also add four-hand co-operative action.

chapter 5

Four-hand co-operative actions and children’s interpersonal co-ordination games

5.1 Co-operative actions

5.1.1 Four-hand action

Until very recently, the study of radically co-operative actions had attracted very little attention from researchers, although we might, admittedly, cite a few lines in Allport (1924) or Reynolds (1993).1 This latter writer has, in my opinion, the merit of speaking specifically about four-hand actions, and not only about co-operative actions. However, as I have noted, it is only since 2003 or 2004 that this research field has begun to take shape as such. Thus, Sebanz, Bekkering & Knoblich (2006) provide, in addition to other valuable comments, experimental results which show that feedback about the temporal display of not-own actions can be as effective for anticipatory control of action as internal signals about one’s own actions. Likewise, these authors draw the important conclusion that the action plan in joint actions is based ultimately on a prediction about the joint effects of one’s own action and the not-own action.2 Our aim here, however, is more specific. One of the five ‘Questions for future research’ which Sebanz, Bekkering & Knoblich (2006, p. 75) list at the end of their article is the following: to what extent does any joint action rely on a ‘theory of mind’? This question is closely related to the issue we shall address here. In short, my aim is to connect the ability for co-operative actions to what we have called the basic human ability.

1. Let me take the opportunity to say that I read an article by this author a long time ago which I really liked. In that first article he coined the term “polylith” as a composite of parts whose cohesion is not a mere effect of gravity. The construction of a polylith involves, Reynolds (1983) said, having arrived at considering a part – an independent thing in itself – as an incomplete part of the whole which is yet to be built. This idea is generally accepted nowadays. It can be seen in Wynn (1993), when he says that subtraction techniques are prior to addition techniques, in Ambrose (2001), and Whiten (2005). The point is that the concept of polylith has remained active in the back of my mind for years.

2. An attractive suggestion can be found in Nagel (1971, p. 409): “These cases (brain bisection) fall midway between ordinary persons with intact brains (between whose cerebral hemispheres there is a better cooperation), and pairs of individuals engaged in a performance requiring exact behavioral coordination, like using a two-handed saw, or playing a duet. In the latter type of case we have two minds which communicate by subtle peripheral cues; in the former we have a single mind.” (Cf. also Churchland [1981, p. 87])




Many basic, straightforward tasks cannot be carried out by one single individual. Think of a piece of ground with a lot of water under the surface. In order to get at this water, we would dig a hole and apply pressure to the walls to keep them from collapsing. As a result, some water gathers at the bottom of the hole. If the individual applying pressure on the walls tries to collect the water which is in the bottom himself, things will undoubtedly begin to go wrong: as soon as the pressure on the walls is released, the earth will slide to the bottom and reabsorb the water which had gathered. The solution is obvious: another individual is required at that moment. How would this co-operation be achieved? On the first occasion, there was probably no planning. One individual applies pressure to the walls and the water begins to gather at the bottom, another individual sees the long-awaited water and goes to collect it. However, for this manoeuvre to become one of the group’s repertoire of procedures, someone had to perceive the task as a joint plan. That perceiver might have been one of the people involved in the chance execution of the task, or perhaps a third person present as a mere observer. What does this comprehension consist of? The subject would have first perceived the kinaesthetic-postural not-own interiority: the not-own interiority applying pressure to the walls, for example. Such perception might have occurred, as we already know, through the simple resource of expectation (or, more specifically, through the a posteriori expectation). Later, however, the subject will necessarily have had to conceive of that interiority that is applying pressure to the walls as the individual with whom he must interact. The frequently-mentioned simultaneity of requirements is involved here. On the one hand, the seen body has to be perceived as an interiority, and simultaneously, on the other hand, as interacting with me. 
In no way will there be a joint plan without the simultaneous presence of both requirements. In four-hand actions, as you can see, we find the same requirement as in the comprehension of alternating gazes or pointing gestures. The self to whom I ascribe perceptions or kinaesthesias must be a radically not-own self, since it must see me, come towards me or interact with me. There is, also, a further point to be highlighted. Four-hand actions would typically require the two subjects to be positioned approximately face-to-face, and would, therefore, involve the reversal of the right/left axis between one’s own and the other’s body. We have already seen this relationship between axis reversal and true simulation in 3.3.3. Keeping our focus on the key point here, we should be aware that it would not be possible to develop this plan for the whole task using the old resource composed of expectation and a single mental line. With only this resource, the subject could obviously choose the appropriate behaviour when he comes up against a real hole full of water. Or he could, alternatively, perform the first movement on any piece of sufficiently wet ground. But for the plan to be a real plan, the succeeding steps must be linked. Indeed, the successive linking of means and goals has frequently been observed among animals (see, for example, Suddendorf & Whiten, 2001). However, what the animals have learned in these links of means and goals is to perform each movement


precisely when the respective suitable context has been finally achieved. This would not be sufficient for our plan for four-hand action. Here, the movements have to be shared out between two subjects. We should, perhaps, comment further on the objection and response at which we have just hinted. It is true that the movements forming the second step could be learned as context-related instrumental conditioning (with the goal, obviously, of getting water). Specifying that this conditioning would have to be linked to a context does not constitute any special requirement; any conditioned instrumental learning is context-related. As a result, learning the second step (i.e. the water-extracting movements) could be very simple. This is true; however, we should note that this simple learning is not of the kind required for planning. In a plan, the context associated with the water-extracting movements is a result that will be attained when the first individual performs his task. In the planner’s mind the suitable context for them is the goal that a different agent would have to achieve.

5.1.2 A comparison with co-operation among chimpanzees

Let us be a little more specific about the key element of this kind of co-operative action (which, for reasons of brevity, I have called ‘four-hand action’, despite it not necessarily involving the intervention of four hands3). In so doing, it will be useful to compare it with data on co-operation among chimpanzees. Of course, research on this issue has been going on for a long time, although it is still a matter of contention. Crawford attempted an experiment on this issue as early as 1937, although interpretations of this particular experiment have generally been exaggerated. Many years later, the reliability of the available data on the issue of co-operation is still highly controversial. There is no general agreement regarding the observations made of the hunting parties of groups of chimpanzees trying to catch a monkey (although Boesch [2005] tries to present these expeditions as a task planned by the group). But let us move on to encultured chimpanzees, and look specifically at the experiment carried out by Savage-Rumbaugh (more than twenty years ago) with Austin and Sherman (two chimpanzees that had learnt to use a lexigram board). “Sherman was placed in a room in which were a number of boxes baited with different kinds of food. Each box could only be opened by using a different kind of tool, and Austin was placed in another room with all the tools. Through a window Austin could see all the different foods in the boxes, and he would signal to Sherman which it was that he wanted. Sherman responded by using the lexigram board to tell Austin which tool he needed to open that box. Austin would select the appropriate tool (e. g., a key or a wrench) and pass this through a small hole to Sherman. Sherman would then open the right box and pass the food through to Austin (eating a small portion of it along the way)” (Rumbaugh et al. [2000, p. 121]).

3. In a similar way, the pointing gesture can be called ‘four-eye perception’.




As we can see, this old experiment focussed on the use of lexigrams. However, we are interested only in collaboration. Thus, from this point of view, the chimpanzees Austin and Sherman are very similar to Gómez’s gorillas who asked their carer to open the door. However, in the chimpanzee experiment an additional point is apparent. The animal not only knows to ask for an intermediate step in the task to be resolved, but also plays the role of recipient in this type of request. Another question is whether the abilities revealed in this example can be legitimately considered as merely a result of enculturation. On this matter, I tend to think that training and learning are simply making the most of chimpanzees’ natural abilities. In human beings, as we shall see in other chapters, interpersonal relations and cultural learning do indeed have an impact on mental abilities. However, this would be because exclusively human processes, both on the part of the teacher and of the learner, play a part when children learn human culture. The situation would, for this reason, be different with encultured apes. This, as I have said, is what I tend to think, although on this issue we cannot as yet categorically either prove or reject any opinion. Indeed, as I said in 4.6.3, Tomasello & Call, who were previously very keen on the influence of enculturation, have modified their view to some extent. At any rate, I consider my own opinion on encultured chimpanzees as a theoretical wager. After all these clarifications, let us consider the following question: is there any difference between this co-operation and our four-hand action? The ‘tool-opening-food’ plan could have been carried out by one single chimpanzee. The plan finally put into practice did not differ from the plan each individual had learnt by himself. If chimpanzee A were not locked in the cage, he himself would go for the tool, but he cannot do so in his current situation.
Likewise, if chimpanzee B had been able to, he would have carried out the subsequent behavioural steps (opening, food) by himself. Only the circumstance that the box was inside the cage and, consequently, inaccessible to him, forced him to leave the next steps in the hands of his companion. By contrast, one single individual could never have performed the planning involved in our four-hand water-extraction task. From the very moment the plan was formed, it required the involvement of two individuals. In short, whereas chimpanzees A and B each had one individual plan which was repeated in both of them, the planner of the water-extraction task has a plan requiring two individuals. There is an enormous difference between a plan that is repeated in two individuals and a plan that intrinsically involves two individuals. But the data on which this explanation relies are too comfortable. Let us make things slightly less comfortable, and imagine that only one of the chimpanzees knew how to operate the tool. Here, the least able chimpanzee would be like Gómez’s gorilla, and the other chimpanzee would play the role analogous to the gorilla’s carer. Does this change anything? Certainly, it could not now be said that it is only external circumstances which prevent the chimpanzee performing the entire task by himself. Collaboration is now intrinsically necessary for him. However, we would not be looking at a genuine four-hand action in this case either. The request for help is the instrumental


means for the sub-goal of opening the box. When this sub-goal is achieved, the individual will return to working alone (although, clearly, in order to ensure success when the task is repeated, he will have to compensate his companion). On the other hand, in four-hand tasks, the companion must maintain his collaboration while the subject is acting. Consequently, this simultaneity of the two actions has to be considered in the unitary plan. Summing up, what would be the crucial requirement of this four-hand plan? The crucial requirement, I stress once more, would be the same requirement we suggested for understanding alternating gazes or pointing gestures: I have to envisage a not-own centre within my own mind. In other words, what is needed is to perceive an interiority so radically different from me that it is able to address me. Let us open out this contrast once more. What happened when chimpanzee B handed the tool to the caged chimpanzee A, or when Gómez’s gorilla, in a kind of implicit request, put his carer’s hand on the closed door? In these cases, it is certainly possible that there was not only expectation of the desired results, but also even, moments later, an a posteriori expectation about the inner state (visual perception of the box or the door) corresponding to a not-own position. However, even accepting this, the interiority thus perceived may continue to lack the feature of being conceived of as radically not-own. In such situations there would be no need to perceive the not-own interiority as interacting with the subject. By contrast, when we think of the role of the not-own interiority in the full plan of a genuine four-hand action, that interiority must be defined as a radically not-own self. (Above, in 3.5, to explain why we did not study ‘self-conscious emotions’, I said that amongst the consequences of the duality of mental centres, what interests me here is the series of abilities out of which language could historically emerge. 
Therefore, I will be asked why I have studied four-hand tasks here. I reply that the ability for those tasks implies taking a first step towards pre-motor planning. Certainly, that step is still small. However, from there on, we will reach the latent imitation of motor sequences which, as we will see in Chapters 8 and 9, will be crucial for the emergence of language.) So far we have given only a single example of four-hand co-operative actions. However, with primitive technology this kind of co-operation was clearly a constant need. If we were looking for an explicit statement of the schema which would be so often repeated, it would be something like ‘Hold this steady while I do that’ (once again, I am quoting Reynolds [1993]).4 What interests me here is that the ability to do this kind of four-hand task became more and more essential. The development of that ability in a child was something worth ensuring. And thus we come to the final issue in this chapter: interpersonal motor co-ordination games.

4. This collaboration would not have to imply either ‘altruism towards genetically related individuals’ or even ‘reciprocal altruism’. Each collaborative task would bring immediate results for each of the collaborators.




5.2 The interpersonal motor co-ordination game

5.2.1 From the adaptive advantages of play in general to the interweaving of evolution and culture

The pleasure of game playing did not appear, of course, only with human beings. Among mammals, playing games is widespread. The chasing and fighting observed among young animals are, firstly, probably great fun for them, and secondly, very useful given their contribution to developing the skills adult animals need for survival. The novelty in human play is simply that the skills exercised and promoted by the child are the specific skills needed by human beings. This view of play, evidently, is not far from the ‘pleasure of function’ in early 20th century authors. But there is a difference: nowadays, we take evolution, adaptive advantages and the selection of favourable features very seriously (for a recent exposition of this issue, see Ellis & Bjorklund [2005]). In short, the pleasure of playing works like any other pleasure. Through it, the subject is not only taught some adaptively useful behaviours, but, in addition, often has them activated in him. However, in the particular case of game playing, the adaptive usefulness of pleasurable behaviours, although undoubtedly enormous, is indirect. It has to do with learning and updating skills that cannot be completely specified on the genetic level. A question arises, since, in the case of humans, some of the skills exercised and promoted by game playing will be requirements for cultural learning. On what level should we locate the usefulness of these games? On the level of evolution or of culture? My view is that the requirements of cultural learning became adaptively useful, from a given moment onwards. They became so adaptively useful that rather than saying that their presence provided the individual with advantages, it would be truer to say that their absence meant disaster.
For that reason the biological pleasure mechanism was able eventually to become associated with play that exercised and promoted those requirements. Human abilities that would be exercised in this way by game playing are of different types. At this point, we are interested in the ability to co-operate in four-hand actions. Specific games related to these actions almost certainly exist in all societies.

5.2.2 An interpersonal motor co-ordination game

In the part of Spain where I live there is a game called ‘Té, chocolate, café’ (‘tea, chocolate, coffee’): “Tea” (adult and child each slap their own knees with the palm of their hand), “chocolate” (adult and child each clap their own hands), “coffee” (adult and child clap each other’s hands). The question referred to above regarding the interweaving of evolution and culture arises here in an extreme form. The adaptively advantageous exercise of the potential for four-hand co-operation is embodied here in a cultural tradition taught by adults to children.


What was it that evolution supplied in order to establish such a game? Almost certainly, evolution merely needed to design that particular pleasure or innate consummatory pattern. The attentive and joyful reaction of children to any event involving something similar to four-hand co-ordination would have prompted adults to try to prolong that joyful moment; thus, each conventionalised game of this kind would eventually have come into being. However, we must also explain co-operation on the part of adults. What can we say about adults? On the one hand, adults would always experience pleasure whenever they catch a baby’s attention. This pleasure would have a role in adults’ general behaviour towards children. (This would be similar to what occurs in the case of lullabies and motherese. Fernald [1989] showed that the musical component or intonation pattern used in lullabies and motherese is ideal for attracting a baby’s attention.) On the other hand, and more particularly for this game, we might also point out that adults would still be sensitive to the innate pleasure which guided them during their own childhood. Although those games are adaptive especially during childhood, it was beneficial that the pleasure associated with them did not fully disappear in adult life. Indeed, it is true that continuing to find pleasure in games during adult life can lead adults to perform activities which would not be in the least adaptive; however, this risk is amply compensated for by the tremendous usefulness derived from games which adults play with children.

5.2.3 The ‘tea, chocolate and coffee’ game: The learning process

Let us analyse how a child learns to play the ‘tea, chocolate and coffee’ game. From a long time before, the child has known how to imitate the clapping hands movement, or the movement of slapping something with his hands. Hand movements are the epitome of self-visible movements and are, therefore, the easiest to imitate. What is of interest here is that the motor imitations involved in the first two steps of the game (the ‘tea’ and ‘chocolate’ phases) can be performed using the old resource of kinaesthetic-postural a posteriori expectations. The imitator could simply detect the kinaesthetic-postural expectations corresponding to the movements seen in the model and then fulfil them on his own person. Nothing else would be required to successfully imitate these movements. But it is obvious that the crux of this game lies in the third step (the ‘coffee’ phase): without it, the game would be reduced to imitations similar to the ones the child had been performing for many months. What does the third step involve? When, in the third step, the child observes the model’s movements he must also perceive that the model is addressing him both physically and communicatively. And here, we come to our constant refrain once more. Merely detecting a kinaesthetic-postural expectation is not enough in the third phase. The inner state detected in the model turns out to be addressing the child himself. Consequently, that state can never be encompassed in one of the child’s expectations. True motor simulation and the radically not-own centre would necessarily be involved in the third step or, in other words, in the step which forms the crux of this game.




Think back, at this point, to self-recognition in the image in a mirror. There is certainly some resemblance between the child who performs the third step of our game and this self-recognition. The synchrony of expert players can turn each of them into a sort of mirror image of the other, into, as it were, a mirror on the behavioural level. However, there is a crucial difference which, needless to say, has to do with the very essence of this game. The fact that chimpanzees can recognise themselves in a mirror but, by contrast, do not enjoy this sort of game is by no means coincidental, but fits perfectly well with everything else. In play, the presumed behavioural mirror possesses its own self who communicates with me. This is precisely what is missing in the self-recognition in the mirror. The self which I ascribe to the image in the mirror, if I am actually recognising myself in it, is my own self. There is a final point we must address. After long training, chimpanzees could probably learn to play these games. I have no data at all in relation to this, but that this learning could take place seems plausible. There is a reason why making use here of such a risky and unfounded supposition is entirely excusable: this unfounded and assumed learning, far from lending support to my hypothesis, gives rise to an objection to which I must respond. So, now, I shall do just that: how would apes, despite their lack of the basic human skill, have succeeded in learning to play such a game? Duly rewarded, chimpanzees could easily learn the first two steps. As I have already said, imitating simple movements, or, in other words, successfully performing the ‘Do what I do’ task, would be a useless side effect, for apes, of the advantageous ability of calculating a conspecific’s visual field. Apes can even imitate non-self-visible movements (that is, non-visible in their own bodies). Consequently, they would find these manual movements very easy to imitate.
Furthermore, the imitation of the complete sequence is easy here, since the movements of clapping or slapping each have a noisy result, which helps with the clear distinction of each sub-goal within the sequence (remember 2.1.2). What about the third step? The chimpanzee could learn to clap hands with someone else, but in this third step of the game that learning would rely on a different resource to motor imitation. Whereas the animal would have detected kinaesthetic-postural a posteriori expectations during the first two steps, now, in the third step, he simply knows that the goal is to clap his hands with the other individual’s hands. By the time he gets to the third step, the chimpanzee would not be conscious of doing the same thing as the model. This is my explanation of how a chimpanzee would learn this kind of game. However, this is much more than an explanation of chimpanzees’ presumed learning. As a result of this explanation, we are now better equipped to describe how children would gradually come to learn the game. At first, the child would learn the ‘Coffee!’ movement in a way which is still very similar to chimpanzees’ presumed learning, i.e., paying almost no attention yet to the not-own interiority. Precisely for this reason, however, the game is adaptively advantageous. The child would immediately feel that the closer he gets to the new and just glimpsed way of performing the third step, the


more pleasurable the experience will be for him. By means of the repeated exercising to which the pleasure pushes him, the child would be activating more and more the other way of executing the ‘Coffee!’ phase. Indeed, if the child can attain this way of executing the third phase, it is because his brain – his ability to maintain a double mental line – allows him to do so. However, he would also need the game in order to exercise and promote this ability. For this reason, I repeat, evolution would have perfected an innate feeling of pleasure that is associated with these games. This pleasure, as with any other pleasure or innate consummatory pattern, acts as a teaching mechanism, a mechanism which teaches the appropriate stimulus for its satisfaction (as Lorenz [1966] said) and, simultaneously, would push the organism toward that satisfaction.

5.3 Enjoyable communicative imitation

Asendorpf et al. (1996) and Nadel (2002) and (2004) have studied an example of child behaviour similar to the one observed in this game. It gives equal pleasure and also involves the crucial requirement of the perception of an interiority which is addressing the agent. I am referring to the mutual imitation game, which children so often play with each other. Nadel calls this ‘communicative imitation’. Taken literally, this label is ambiguous. Linguistic behaviour, which involves an imitation of the social code and is communicative, would also be perfectly covered by such a term. However, since there are ever more authors who use it, we will have to accept it. What does this kind of game consist of? A child, without at first saying anything, begins a step-by-step imitation of all the actions which another child is performing. When the model becomes aware of this, both children, imitator and model, burst out laughing. Although the motor imitation ability is obviously useful per se (it enables cultural learning, no less), it is not principally that usefulness which is active here. No new motor pattern is normally learned in these scenes. What is more, on some occasions, indeed on a large number of them, the silent imitator may be an adult, who obviously wants the child to realise that the other person, the adult in this case, is imitating him. Likewise, when the imitator is a child, the model of imitation is very frequently a child of the same age, and therefore has the same skill level as the imitator. But if these behaviours do not contribute to motor learning, we must ask ourselves why they take place. Let us begin by looking at when the child experiences the pleasure in each case. Look, first, at the case of the child who, without realising it, had been acting as a model. The enjoyment will only occur when the model becomes aware that he is being imitated. 
In order to become aware of this, the imitated child will have had to pay attention to the inner state of the other’s movements. It is possible for attention to be paid to the inner state of the other’s movements – we are repeating the same idea once more – using the procedure typical in chimpanzees, that is, the procedure which




detects kinaesthetic-postural a posteriori expectations in movements of others. However, although this procedure actually enables the internal analysis of not-own movements, it will never enable me to conceive that this self is looking at me, is addressing me or copying me. Consequently, understanding that the other is imitating me would involve the crucial requirement of a double line of awareness. By analysing the pleasure experienced by the child who had been acting as a model, we have also explained the variant in which the adult plays at imitating a child. Turning now to the child imitator, at what point does he experience enjoyment? Obviously, he has had to be copying the model’s successive movements. Even so, he is still waiting for the moment of pleasure. This is why he continues with his silent imitation. The imitator’s enjoyment will only occur when the model realises he is being copied and looks up with an amused expression at the imitator. As you can see, we find here the same cause as before. The pleasure only occurs when the interiority that I am perceiving in the other becomes, by directing itself at me, a radically not-own self for me. At this point, let us turn our attention to chimpanzees again. Nielsen et al. (2005) provide the first evidence that a chimpanzee is able to detect that he is being imitated when “the experimenter replicated all movements and body postures of the chimpanzee as he exhibited them”. This ability was entirely to be expected, bearing in mind the well-established fact that chimpanzees can imitate simple movements (see supra, 2.1.2). However, according to my suggestion, this detection on the part of chimpanzees would be different from the playful and pleasurable experience we find in children. Children would be able to perceive a not-own interiority directed at themselves and would, therefore, have to abandon the resource of expectation and turn to simulation.
It is this required establishment of simulation and of a second centre in their minds that would explain the enjoyable character of these children’s exercises. By contrast, nothing of the kind would occur in chimpanzees, who, after themselves having been repeatedly imitated, would already have established the prediction that the movement which they are about to perform will also be observable in the body which they are observing. However, as long as chimpanzees keep that kinaesthetic-postural interpretation of the not-own movements active, they will not be able to understand them as communicative signals. This is what would be deduced from the hypothesis set out here. In short, that this pleasure ever emerged in evolution was undoubtedly because there were adaptive advantages to splitting one’s own mind in two in that way. The advantages would come through the communicative use of alternating gazes and the finger pointing gesture, as well as through the ability to perform four-hand actions. This is what we have said so far. Evidently, however, the basic human characteristic suggested here will be truly significant only if we can show its connection with language. This will keep us busy in Sections III and IV. In the first of these Sections, we will deal principally with how what we will call the symbolic ability might have emerged. In Section IV, it will be full, syntactic language that we will attempt to derive.

Section three

Specifying some necessary requisites of language

This section deals with some necessary, but still not sufficient, conditions for language. We will give attention, first, to what has come to be known as Saussurean parity, or symmetry between the production and reception of linguistic meaning. The second requirement we will examine is symbolic ability. In both cases, we will try to show the connection with the basic human ability.

Chapter 6

Saussurean parity and the perception of a radically not-own self

We will now, for the first time in this book, begin to look directly at language. What will occupy us here is but a small part of language, an aspect which, despite already involving a socio-culturally advanced learning, looks like an almost immediate, scarcely-developed, derivative of the pointing gesture. But we are getting ahead of ourselves; we must proceed more slowly. When a recipient hears an order or a request, he understands that he is being ordered or requested. By contrast, when he utters exactly the same word with the same intonation, it is he who is ordering or requesting. In spite of this difference, the word in question is the same for him in both cases. This happens in all human languages. We have here an absolute linguistic universal which, despite its simplicity, may be of interest. This characteristic of signs has come to be known as ‘Saussurean parity’ or ‘producer/recipient symmetry’, or in Tomasello’s (2008, p. 103) words, ‘role reversal imitation in communicative conventions’.

6.1 Toward a formulation of the problem

6.1.1 Production and reception in animal communication

Does the same thing happen in animal communication? At first glance, the parity between the signal produced and the signal received may seem a natural and inevitable element of all communication. It is, however, a question we must tackle. There is an inquiry into parity, which has become a classic, in Hurford (1989). Likewise, Arbib (2005) or Rizzolatti & Arbib (1998) hypothesise that mirror neurons would evolve to support parity. However, the aspect focussed on there is different from the one occupying us here. For the time being, we can state that this parity (the awareness of the identity between the signal produced and the signal received) is not essential for animal communication to occur. In order to show an aggressive attitude, a fish raises itself up and exhibits the red spot on its belly. Another fish sees that red spot and reacts accordingly, whether by retreating or attacking. The recipient has understood perfectly the communicative value of the red spot. Even when confronted with a supranormal stimulus, a remarkably big red spot on a wall, the fish will react in the same way, although more




intensely. However, when it, in turn, becomes the producer of the same signal, it will not know that it is showing the red spot on its belly. The fish is simply in an aggressive state at that point and performs the corresponding innate motor pattern. In short, these fish have absolutely no need to equate the aggressive signal they produce with the aggressive signal they receive. When we humans observe how fish behave, we can easily identify both signals as equal, but fish themselves cannot. It is clear that fish can do without that identification. If we leave the world of fish and look at a cat’s swept back ears or its entire body in a demonstration of submission, for example, we will have to draw the same conclusion. A cat, unable to recognise itself in a mirror, does not know what it looks like when it adopts a submissive posture. However, when it takes on the dominant role and observes those same postures in another cat, it understands them perfectly. Evolution must have made the necessary arrangements for this co-ordination to occur. (“In the evolution of any communicative system, whenever change of any sort occurs, there must be a change in two respects: the signal and the receiver”: Alexander [1962, p. 465]1.) The individual animal does not need to consciously equate the signal received with the one produced.2 Thus far, we have limited our examples of animal communication to a particular type. We have only mentioned visual signals that are not perceptible in one’s own body. The fish cannot see its belly, nor the cat see its ears. However, there are also very frequent auditory signals in animal communication, signals which are perceptible both for the receiver and the producer.3 Consequently, the animal would have the possibility of identifying a signal she receives with the signal she produces. What we must now 1. 
The manner in which the change in the two plans will be executed (that is, in the production and also in the reception) is a more difficult question. Scott-Phillips (2010, p. 79) says: “Since mutually dependent behaviours are unlikely to emerge simultaneously, this gives rise to a prediction that communication will only emerge if cues or coercive behaviours do so first”. 2. This same consideration can also be applied to behaviours called imitative behaviour, such as birds grouping together in flocks or fish in schools. Certainly, here, contrary to what occurs in the cases we are dealing with, there is no inversion (threaten/be threatened, e. g.) of the producer and recipient of the relevant stimuli. Here, the result is that the behaviour of both is identical. However, despite this difference there is still a strong similarity between those communicative cases and these ‘imitative’ behaviours. In these birds (or fish) grouping together, it is again only the human observer that perceives that the produced and received stimuli are equal. No bird joining the flock perceives its behaviour to be identical to that of the other members of the flock. But the issue of imitation will have to wait until later chapters. 3. How is it that hearing his own cries does not induce the corresponding responses in the animal making those sounds? How is it that communication by means of self-perceptible signals does not have the same impact on the producer as on the recipient? Answering this question is straightforward: it has been well known for years that there is a kind of phylogenetically very old brain mechanism which ‘attenuates’ or suppresses some expected perceptual consequences of one’s own movements (see above 1.4.1). In addition, we do not hear our own vocalizations as we hear others. There is distortion that occurs from bone conduction when perceiving one’s own voice: Maurer and Landis [1990].

Chapter 6. Saussurean parity and the perception of a radically not-own self

ask is whether or not this possibility becomes reality; more specifically, would it be useful or actually, on the contrary, counterproductive for animal communication? What is the relationship in the dog’s brain between the aggressive barking he produces and the aggressive barking he receives? Given that, as we have already seen in visual signals, recognising the identity between the signal produced and the signal received is not essential for a communicative function to take place, that recognition will take place only if it is useful. So, is it really? In aggressive barking, the level of aggressiveness is obviously a piece of essential information. A ‘graduated continuum’ is at play here and, therefore, the differences between aggressive barks will be as informatively important as their similarities. Indeed, this would be compatible with recognising one’s own aggressive barking and not-own aggressive barking as belonging to the same type. However, the importance of the level of aggressiveness cannot exactly be seen as good news for the supporters of the theory that the brain interprets the barking produced as being identical to the received barking. There may be those who say that the recognition of any conspecific as a conspecific is one of the pieces of information transmitted by auditory animal communication, and one that is not precisely lacking in importance. I agree. However, this is a long way from being a conclusive argument in favour of the identification between produced and received sounds. We know that the initial recognition of a conspecific cannot rely on such identification. At first, young animals either emit no auditory signals or emit only a small fraction of the species’ repertoire. Nevertheless, from the very beginning animals must give preferential attention to the signals of others of their species. 
Consequently, procedures different from the identification between produced sounds and received sounds must be involved here, procedures that could hardly be supposed to disappear in adult life. But let us move on to a more general point. This ‘identity recognition’ about which we have been speaking: what exactly would it be? In my view, that recognition requires reception to involve a kinaesthetic-postural interpretation of the sign. Thus, the key question is whether kinaesthetic-postural interpretation of the shouts of others is present in primates. Certainly, this would be a possibility. Shouts are movements; what is more, as they are ‘self-perceptible’ (more precisely, ‘self-audible’), they belong to the type of movements whose kinaesthetic-postural interiority is easier for the observer to understand. Nevertheless, I would defend the opposite possibility. In other words, I am inclined to believe that the kinaesthetic-postural interpretation would have remained outside communicative signals in non-human primates. Or, more specifically, that belief is inferred from my hypothesis and, therefore, the hypothetico-deductive method requires me to evaluate it as reliably as I can. Some indications in favour of that belief are readily found. Remember what we saw in 1.5 above in relation to Kohler et al. (2002) and auditory mirroring. Applying mirror neurons from the seen hand to the ‘heard hand’, or, more concretely, to non-communicative sounds of a kind which are producible both by one’s own manual




grasping and by not-own manual grasping, occurs in monkeys. In contrast, however, none of this has been found for the reception of their communicative cries; more in general, in nonhuman primates, it seems that vocalizations cannot be elicited from the motor cortex (Ploog, 2002). But let’s consider the whole issue within a wider frame.

6.1.2 The problem involved in Saussurean parity

This kinaesthetic interpretation (or, more generally, interpretation in production-format) would occur in human speech: in the rest of this chapter (from 6.3 to 6.4.2) we shall set out the good reasons there are for believing this. Right now, however, I wish to present, without delaying any further, the question that forms the very heart of this chapter. If the reception of human speech occurs in a production-format, we will then have to explain how it is that the reception of a linguistic sign (of an imperative linguistic sign, for example4) can, on the one hand, involve the same process as production yet, on the other hand, induce states exactly opposite to those accompanying production.5 The tables have been turned. It is now the apparently natural parity that seems surprising. How is it that Saussurean parity can occur? If production and reception share a common core, how can the reactions associated with production and reception be differentiated and even opposed to one another? This, I stress, is the question we have arrived at. There is a first response which comes directly to hand: we should pay attention to intonation. In addition to the articulatory-phonetic ingredient, messages involve an intonational ingredient which is undoubtedly much more similar to animal communication. The reception of this second ingredient would involve no kinaesthetic interpretation at all, but would be a more analogical and immediate process. Faced with a threatening communicative signal, animals feel themselves immediately threatened. The intonational component of the linguistic message probably works, I insist, in the same way. Without doubt, this natural, unlearned component considerably alleviates the problem posed by Saussurean parity. Since it links with animal communicative patterns, intonation will produce a difference between the state associated with a message’s production and the state associated with its reception.
The received threat (Go away!, Get out!) will then be experienced as being very different from the threat produced by oneself with precisely the same words; the same thing happens for ‘Give me!, Water!’, for example. The problem remains unsolved, however, even after this alleviation. In the receptive processes for these articulatory-phonetic signs, the core is just the same as if the hearer himself were producing those same signs. Since the articulatory-phonetic patterns are interpreted in production-format, reception of them inevitably refers back to the production process. As a result, we will have to keep looking for a solution to this problem. Saussurean parity is paradoxical and requires explanation.

4. Imperative and also non-imperative. The contrast ‘spoken versus heard’ is parallel to the contrast ‘produced order versus received order’. (Cf. Janet [1936])
5. This problem has been pointed out in other contexts. For example, Chang & Vermeulen (2010), in their criticisms of the Simulation of Smiles (SIMS) model, argue: “According to that model, smiles elicited when viewing positive (i.e., enjoyment) smiles indicate mimicry. In this case, it is the current expression and feeling of the expresser that is simulated. According to that same model, by contrast, people are less likely to mimic negative (e.g., dominant) smiles, but instead, they simulate the feeling of being dominated. In this case, it is the previous experience of the perceiver that is being simulated. It is not clear why the different smiles should invoke simulations with reference to different people- and time- perspectives.”

6.2

Saussurean parity and the second mental line: Our suggestion for resolving the problem

Why would interpreting the received signal in production-format not be harmful in human communication? In my opinion, this question takes us directly to the central hypothesis of this book. We humans are able to perceive a radically not-own self, to be exact, a self that is currently looking at or addressing us. This not-own centre that emerges inside our own mind would be the centre of true simulation. In the third processing mode of the eyes of others we would ascribe visual perceptions to eyes that were looking at us. In four-hand tasks we would detect an internal aspect to movements being directed at us. In our reception of speech we would analyse (analyse in production-format) signals that turn out to be directed at us. In all these cases, expectation would not work; simulation located in our not-own centre would be necessary. For this reason (namely, that simulation would be linked to the not-own centre), what is irreconcilable in animal communication can now be reconciled in human language. A conscious identification of produced and received motor patterns would now take place. However, thanks to the human ability to conceive a not-own centre inside one’s own mind, that identification does not obstruct difference or opposition between states accompanying production and states accompanying reception. Let us sum this up. Why does parity between production and reception become a problem? Only because reception entails interpretation in production-format. Where is the solution? At the very same point which causes the problem. The kinaesthetic-postural interpretation of not-own movements, if applied to a communicative movement directed at oneself, must necessarily be performed in the second, or simulatory, mental centre. Thus, although the reception involves a production, it is a production assigned to this centre. We now need to face some data which seem to threaten the proposal which has just been presented.
According to recent investigations, something like mirror neurons has been found in songbirds (Prather et al. [2008]). Thus, there are animal communications whose reception involves their kinaesthetic interpretation. Does this force our proposal to attribute also to birds the duality of mental centres? I would




stress that bird singing does not involve dyadic communication. By singing, birds mark their territory, proclaim what species they belong to and probably send an honest signal of their good health. However, this hormone-driven singing does not involve communication deliberately directed at another individual. We know that birds reared experimentally in complete isolation begin singing at exactly the same age as their wild conspecifics (even if the former only sing the basic melodic pattern, without the dialectal embellishments of the latter: Marler [1991]). In this sense, it is convenient to remember that specialists in vocal behaviour of birds establish a strong difference between male singing and the calling coming from any individual within the flock. In other words, although songbirds can obviously be heard, there is no clear addressee’s role involved: no recipient understands the message as being produced for him or her. Consequently, in birds, although they carry out the kinaesthetic-postural interpretation of the singing heard, that is, even if their reception of the singing takes place in a production format, the problem detected in the Saussurean parity does not occur here. An argument in favour of these suggestions lies, of course, in the more comprehensive explanation which they provide.6 In this way we can bring together several human abilities, more precisely, abilities that originate in children at relatively similar ages, as is the case of early linguistic communication and pointing with a finger or indicating with the eyes. Saussurean parity would already be present – this is the key point – in the pointing gesture. The receiver of the pointing gesture would have to estimate the kinaesthetic-postural and visual interiority that he or she would have if he or she were in the body of the producer, but, at the same time, the recipient realizes that what he or she, the receiver, is doing is not pointing, but receiving a pointing gesture. 
Now, we must set out the reasons there are for believing that interpretation in production-format occurs in human speech. Where should we look for them? Two directions appear promising. First, Piaget’s theory of imitation of new and complex motor patterns, or, in other words, the theory of ‘motor reception during the learning stage’. Second, the question of the meaning of deictics, or of some deictics, to be more precise.

6. I have said that in this comprehensive explanation would lie an argument in favour of my hypothesis. We should, I think, at this point delay no further a question that has already raised itself on several occasions. What is it I am doing in this book? As I said in the introduction, the hypothesis being put forward in this book is a synthesis. I have collected data from here and there, and simply make use of them as arguments in favour of my theoretical framework or synthesis. In this way, the mutual support between the different arguments emerges. But, and this is the key question, should we be taking into account support like this which is based on the decision to put such data together? This will of course depend on the theoretical wager. I will return to this question in later chapters.


6.3

‘Motor reception during the learning stage’, the reliable core of the ‘motor theory of speech perception’

6.3.1 Liberman’s theory: The ‘motor theory of speech perception’

Alvin Liberman devoted the whole of his long career to developing the discovery he made in the 1950s while doing some work for telephone companies. His discovery was this: the confusions most frequently observed in recipients cannot be described as confusion between auditorily similar sounds (or, in other words, between sounds regarded as similar on a spectrogram), but rather as confusion between sounds which are articulatorily similar. The relevant data had to do with a double dissociation: some sounds which a spectrogram describes as close, but which are very distant from an articulatory-motor point of view, were never confused. On the other hand, some very distant sounds according to the spectrogram, but which are close from an articulatory-motor point of view, were very frequently confused.7 Liberman formulated a theory from these data: the ‘motor theory of speech perception’: see Liberman & Mattingly (1985) (or also Liberman & Whalen [2000], the last, almost posthumous, paper that Liberman wrote). Speech reception would involve latent articulation. Motor simulation, which many years later would ride the crest of the wave, along with mirror neurons and the simulationist trend of the theory of mind, was clearly visible in Liberman’s contributions of the 1960s. Liberman very soon moved on to hypothesise that the phonemic categorisation of sounds would depend on this articulatory reception. However, the discovery that many animals were able to perform some categorisation of sounds (Kuhl & Miller [1975], Kluender et al. [1987]) undermined this hypothesis and brought considerable discredit to the theory. Thus, Galantucci et al. (2006) conclude that “the claim about the phonemic categorisation of sounds is likely false”. See also Ohms et al. (2010, p.
1003): “Zebra finches (Taeniopygia guttata) can discriminate and categorize monosyllabic words that differ in their vowel and transfer this categorization to the same words spoken by novel speakers independent of the sex of the voices. The birds, like humans, use intrinsic and extrinsic speaker normalization to make the categorization. This finding shows that there is no need to invoke special mechanisms, evolved together with language, to explain this feature of speech perception”. In fact, categorisation, despite Liberman’s efforts to use it for his own purposes, is more closely related to general perceptual abstraction. What occurs repeatedly in several perceptions is differentially reinforced. Of course, phonemic codes differ depending on the particular language. (To illustrate, let us take the example that appears in 7. Let’s recall a point by Liberman et al. (1967) which is usually less quoted. In dichotic listening experiments (i.e., studies in which headphones were used to play one speech sound to the right ear and a different speech sound to the left ear), listeners showed an advantage for phonemic categorisation in speech played to the right ear, that is, to the ear that has stronger connections to the left hemisphere of the brain. Cf. Shankweiler & Studdert-Kennedy (1967).




Kuhl [2010, p. 717]: “Adult speakers of English and Japanese produce both English r- and l-like sounds, even though English speakers hear /r/ and /l/ as distinct and Japanese adults hear them as identical. Japanese infants are therefore exposed to both /r/ and /l/ sounds, even though they do not represent distinct categories in Japanese”.) The change in meaning determines the maximal limits acceptable up to which and only up to which a sound may vary without changing its category.8 Anyway, the perceptual abstraction process would work identically in all languages. The only variable element would be the exact profile of the arrangement from which the perceptual abstraction is later performed. Let’s go back to the nucleus of Liberman’s theory. Is there anything in this theory that would still stand today? Before answering, we must pay attention to some of Piaget’s proposals.

6.3.2 Piagetian premotor plan Piaget (1959) argued convincingly that, in children, latent motor imitation must always precede the first copy of all new and complex ‘motor patterns’, among which he included articulatory-phonetic patterns. From his data, he concluded that children made two achievements virtually simultaneously. At a certain age, the child, during her first reproduction of a learned pattern, manages to move beyond trial and error (she performs it without “tâtonnements”), and, virtually within days, also successfully performs a delayed ‘first reproduction’. These two simultaneous achievements can be explained – this is Piaget’s hypothesis – by means of a pre-motor plan or latent imitation that the child would have carried out while observing the model. Can Piagetian ‘latent imitation’ be extended to the learning of singing dialects by birds? We know the answer to this. Marler (1991) showed that young sparrows stored the adult singing in an auditory format during their mute and learning stage. To be precise, the animals stored the songs as an enrichment added to the innate auditory 8. Fraser (2004, p. 274) has revisited and stressed this indisputably structuralist issue: “Any sublexical conceptualization depends upon prior conceptualization of words” (or, in other words, knowing the phonemes requires knowing the words). Certainly, we must remember ‘statistical learning’ that makes infants segment word-like units from ongoing speech – Saffran et al. (1996). Statistical learning is computational in nature, and reflects implicit rather than explicit learning. It is based on the notion that the units (phonemes, syllables, chunks of whatever size) that make up a word remain in fixed positions relative to each other whenever the word occurs. Thus, it relies on the ability to automatically pick up and learn from the statistical regularities that exist in the stream of sensory information we process. See also Scott et al. (2007) and Mehler & Dupoux (1994). 
However, as Kuhl (2010, p. 715) proposes, “social factors ‘gate’ computational learning”. Social cues, such as eye gaze and pointing to an object of reference, would help infants segment word-like units from ongoing speech. Thus, Fraser’s structuralist point would be not completely wiped off the map. As Cutler (2008) has said, speech perception per se must be investigated in connection with speech perception in the service of spoken word cognition.


singing pattern, that is, to the foundation of the songs of each individual in the species. How was it discovered that in sparrows’ brains both the learned dialect and the simple foundation initially adopt an auditory format and not a motor format? Marler deprived some birds of hearing just when they were about to start singing, and thus proved that learner birds, in order to become singing adults (either with a dialect, or, if the birds had been reared in isolation, only with the simple foundation) must first hear their own singing attempts, detect to what extent these are similar to or dissimilar from the model and rectify accordingly.9 This, incidentally, could be extended to the abilities of parrots. Marler’s experiments reveal that dialectal learning has nothing to do with ‘motor reception during the learning stage’ in these birds. See also Gobes et al. (2010). It seems that the mirroring discovered by Prather et al. (2008) is confined to adult males who already know how to sing. Among humans there is probably something relatively equivalent to this learning in sparrows. This (relative) equivalent might be the stage when children start to make their first sounds. Children make one attempt after another until in one of those attempts they hear a sound similar to the one they have heard.10 However, once this stage of first sounds is over, the child succeeds in reproducing a word in at least a recognisable way the first time she attempts to reproduce it. Thus she gives up the Marlerian way. Piagetian ‘latent imitation’ can definitely be relabelled as ‘motor reception during the learning stage’. In other words, we have a convergence between Liberman and Piaget. This convergence is surprising given the enormous differences between these two authors: Piaget was never interested in adult speech reception and Liberman took only peripheral interest in child learning. However, both would suggest a motor format for the perception of learned motor patterns.

6.3.3 What happens once acquisition has come to an end? A proposal for a reformulation of the motor theory of speech reception The learning stage would be the most reliable core for the motor perception of speech. The hypothesis of ‘motor perception during acquisition’ has some supportive evidence. 9. All this clearly resembles the descriptions provided by connectionists: See a summary in Churchland (1988). Marler’s research probably offers a proof that these descriptions match with a real brain process. 10. But, in children, this is likely to be preceded by the process described by Vihman (2002, p. 310): “The infant practices canonical babbling and produces CV sequences at 6–8 months of age. This practice in production sensitizes the infant to similar input patterns, which are now easily recognized because they pop out of the acoustic stream”. Later, after having acknowledged in the acoustic stream ‘words similar to babbling’ (‘mamma’, ‘baba’..., which in all languages design one or other element in children’s world), children would feel more motivated to pay attention to speech and to the similarities and differences of the patterns that they hear regarding the ones they already know from their babbling.


It is known that the premotor area of the brain is activated during ‘observation for copying’ of motor patterns, but not during observations that do not have this purpose: Decety & Grèzes (1998). See also Mattar & Gribble (2005). In addition, Imada et al. (2006) reported synchronized activation in response to speech in auditory and motor areas at 6 and 12 months. Thus, I disagree with some statements against the motor theory that have focussed on the learning stage (See Massaro & Chen [2008, p. 456]: “We know that receptive language is acquired before productive language, so it is difficult to understand how motor behaviour would contribute to speech perception”). In my view, the learner’s motor reception, or, more specifically, the latent sequential imitation that analyses the articulatory-phonetic sequencing of the level of wording, provides the bridge between the merely receptive language of the beginning, which would lack motor reception, and productive language, which appears later. Now I must deal with the question of what happens once acquisition has come to an end. I would reply with a two-fold answer to this question. On the one hand, I suggest that the sequential, articulatory-phonetic arrangement peculiar to ‘latent imitation during the learning stage’ disappears in adult reception (or more exactly, in post-learning reception). In this way, in adults, speech production can be impaired in syndromes such as the so-called Broca’s aphasia while leaving comprehension relatively intact. This dissociation between production and understanding was highlighted in Hickok (2009, p. 1238): “According to Liberman’s theory (...), damage to the motor speech areas should produce deficits in speech recognition. 
However, damage to motor speech areas, evidenced in many cases by large left frontal lesions and severe speech production deficits, do not typically lead to speech recognition deficits.”11 It is also worth bearing in mind the expert reader, who not only reads inaudibly but also treats each word as an undivided unity. Likewise, most plausibly, the motor sequential arrangement often disappears in the production of inner speech. (Later, in 9.4.4 and 9.5, we will see all this in a broader context.)12 On the other hand, however, I think that all this has no effect whatsoever on the fact that speech reception always occurs in production-format, that is, in the same format in which it must have been learned. This format-fidelity would be the basis of Saussurean parity. For the recipient, the word that he or she hears corresponds to the same pattern that he or she on other occasions produces, i.e. it is actually a pattern in production-format. We have now reached the goal pursued in 6.3.

11. Hickok’s criticism really aims at the ‘mirror neuron theory of action understanding’ (See also Lotto, Hickok & Holt [2009]).
12. Certainly, the type of latent imitation which would take place in learning disappears once the learning process is completed. However, Devlin & Aydelott (2009), and also Samuels (in press) show some motoric contributions to adult speech perception.


But let me summarize the above suggestions with regard to the ‘motor theory of speech perception’. Instead of supporting the current trend which dismisses the ‘motor theory of speech perception’ completely, I prefer to reformulate it. Firstly, I think that any acceptance, rejection or clarification of the motor theory must attend to the different phases of development: babbling (no ‘reception in production-format’, or, in other words, no ‘latently imitative reception’), learning of words (latently imitative reception that imitates articulatory-phonetic movements), and adult language (latently imitative reception that imitates each word as an undivided unity). Secondly, my concrete proposal is that the way in which children always learn words has a significant impact on any subsequent, adult reception of them: it confers a production-format on that reception.

6.4

The comprehension of deictics which ‘cannot be repeated as an echo’: What about the egocentrism of deixis?

In our language there are a small number of meanings in which we can analyse in great detail the subject we are dealing with on a larger scale in this chapter. These words belong to the category of deictics. In fact, they are only a tiny fraction of the category of deictics. However, in order to properly focus our attention on them, we should first address deictics in general.

6.4.1 Deixis The number of words available in a language must always be limited. The limited resources of our memory require this to be so. However, we human beings may want to communicate about any particular object. As a result, the particular objects on which our attention and our linguistic communication can focus are much greater in number than the linguistic terms available in any language: we have the term ‘chair’, which is suitable for any chair, but we cannot have one term for each particular chair. Deixis is the device invented by language to deal with this problem (this chair, your chair, etc.). A deictic is any linguistic element whose particular referential anchorage depends on who is using the term in question, and when and where it is being used. As a category, deixis comprises many different classes of linguistic elements. Thus, the verbal morphemes of past, present or future are aligned with the adverb ‘near’, or the contrasts ‘go’/‘come’, ‘import’/‘export’, or even the vocative in the terms for family relationships. In fact, deixis is almost omnipresent. Note how difficult it is to avoid deixis in a linguistic message. We can avoid it if we resort to metalinguistic or encyclopaedic messages, such as “Horses are mammals”, with their ‘apodictic present’, i.e., with a non-deictic present. Another way (the opposite really) of not using deixis is to leave the level of the particular reference as implicit: ‘How sad!’ for example, leaves all the work to the context of the utterance and there is no need to specify that it is sad here and now – or, alternatively, in the remote


or fictional scene which is the object of attention. (I have written ‘there is no need to specify’. But it would be preferable to say that the speaker does not make an effort to specify anything. This lack of effort sometimes leads to misunderstandings, and it is likely that it is such misunderstandings that have originally given rise to the motivation so that language would develop the resource of deixis.) However, it would be impossible to avoid deixis in a message invoking a particular referential anchorage. Thus, the sentence “Julius Caesar crossed the Rubicon in 49 BC, or in the year 704 after Rome was founded”, despite using proper nouns to refer to the subject, place and time (the calendar, which is a system for producing designations, pivots around a proper noun), cannot completely avoid deixis. The temporal morpheme of the verb ‘crossed’ clearly indicates that the speech act happened after the year mentioned. The day before he crossed the Rubicon, the verb would have been in the future, and at the very moment he was crossing the speaker would have opted for ‘is crossing’. As can be seen, the two extreme types of speech (metalinguistic or implicit) that do not need deixis constitute a tiny portion of all linguistic messages. Deixis is, consequently, almost consubstantial with language.

6.4.2 The egocentrism of deixis, and deictics ‘which cannot be repeated as an echo’ When a speaker says ‘today’ or ‘yesterday’, the hearer can repeat these words immediately afterwards without the term’s specific anchorage being changed. The day they are talking is shared by the speaker and the hearer. However, not all deictics can be repeated as an echo with such impunity. Some deictics would acquire a completely different sense if the hearer immediately repeats them in a conversation. This, of course, is the case of deictics related to the pronouns “I” and “you”. This effect can be observed not only with “mine” and “yours”, but also, depending on the contexts, with “this” and “that”, or “here” and “there”, or with “in front” and “behind”. These ‘deictics that cannot be repeated as an echo’, as we might call them, are what will occupy our attention now. The meaning of all deictics depends on who is speaking, and on when and where. The zero point of the spatial and temporal co-ordinates that will determine the referential anchorage of a deictic is always the speaker acting as such. The egocentrism – the pivoting around the speaker’s ego – peculiar to deictics is indisputable. However, thus far we have paid attention only to the producer. What about the reception of deictics? Or, in order to restrict the question to what we are interested in, what about the reception of deictics which cannot be repeated as an echo? Obviously, the recipient cannot apply his egocentrism to the reception of the pronoun ‘I’. (Later, in Chapter 21, we will reformulate this in a broader context, but for the moment this comment will suffice). However, since any linguistic term is consciously recognised as identical both when produced and received, there must be a resource


where the identity of the “I” produced and heard can be reconciled to the completely different referential anchorage of the one and the other. What will this resource be? Let us address first the resource involved in metalinguistic definitions. In order to define what the term “I” means, it is said that it refers to the speaker in question in each case. As can be seen, that definition can be used both by the speaker and by the hearer, and so our problem disappears. The question now is whether or not this definition is really used in mental processes. It is true that that neutral and aseptic definition, from which all trace of egocentric feeling has been erased, can perfectly account for the case of ‘I’. However, this neutrality and asepsis will not work so well if we now consider other deictics belonging to the I-series. Imagine that a speaker says: “The thing (a cobra, for example) is right behind me”. The two speakers in the conversation will probably not be shoulder to shoulder; they will, instead, be face to face, although not necessarily precisely geometrically opposite each other but at an angle somewhere between shoulder-to-shoulder alignment and an exact geometrical face to face. The metalinguistic definition of ‘behind me’ will be ‘behind the speaker in question’. However, the definition makes it necessary to calculate what ‘behind someone’ means. The only way to perform that calculation is to perceive the ‘face/back’ body axis of the body of the speaker. Of course, in order to perceive that axis explicit instructions can be given. However, if we take into account that the ability to perceive this is a skill inherited from apes, the neutral linguistic definitions become more and more grotesque and unbelievable. Finally, we should remember that deictics such as ‘right here’, or ‘there’ are usually accompanied by gestures or glances. 
Thus the comprehension process involved in the reception of these deictics may be very similar to the comprehension process involved in the reception of pointing gestures. We can now put forward our suggestion regarding the comprehension of deictics which cannot be repeated as an echo. This comprehension does not ignore egocentrism. Egocentrism is our inherited biological procedure for conceiving any distality (near/far; up/down; in front/behind):13 consequently, any other procedure designed to replace it will not only be horribly complex but also, in the end, inefficient. However, in reception, the egocentrism involved in the meaning of ‘I’ is associated with a self that we have conceived inside our minds and that is different from our own self. A similar proposal to mine – although applied to the comprehension of scenes and, of course, without using my terminology of a second mental centre – can be found in Lozano, Hard & Tversky (2007): when observing an action, the observer tends to adopt the agent’s perspective. In a similar manner, we can also say that when listening to a linguistic message, the hearer tends to adopt the speaker’s perspective. Once again, the key point is this: when the hearer receives a linguistic sign, he does not see it as a received sign, but as a produced sign, although definitely produced by another individual, i.e. as a not-own production. With this explanation of deictics which cannot be repeated as an echo we have reached the second argument in favour of my explanation of Saussurean parity. But let me repeat what I have said above: the suggestion that this feature of the linguistic sign is possible only as a result of the perception of a radically not-own self would have to be seen within a more general framework in which, for the moment, pointing gestures (with the finger or with the eyes) and truly co-operative actions are included.

13. More generally, this ‘egocentric’ distality is the key to animal perception and to animal self.

chapter 7

About evocation

7.1 What is it we mean by “evocation”?

Linguistic signs have the ability to make us evoke objects not within reach of our perception at that particular moment. This is an ability they share with other kinds of symbols. This is extremely clear. What is not at all clear, on the other hand, is whether evocation is something animals can access. Can the term ‘evocation’ be replaced by ‘displacement’? This latter term, at least in the sense given to it by Hockett (1960), or also Bickerton (2009), refers to a feature of communication: human language, just like bee-dancing, is an example of displaced communication. Given that the term refers to communication, it is clearly unsuitable to formulate two of the questions I am interested in -i. whether the goal pursued by an animal is evoked or not by that animal; ii. whether children are able or not to evoke by means unrelated to both language and communication. But let us define more exactly what evocation is. Can we talk about evocation in dreams? Certainly, when a horse chases me in a nightmare, the horse is not actually present. In the midst of my nightmare, however, there is no way I believe that the horse is not real. But, during those moments, what has no reality whatsoever for me are, in fact, my bed and pillow (Cf. above, 1.4.2). This leads us to the following clarification: evocation is a question of having the image of an object while at the same time being aware that the object is not present. Note that, by denying that the contents of dreams are evocations, we have blocked one route to answering the question of whether animals possess symbolic capability. Nowadays there is a method available to verify “from outside” if a sleeping person is dreaming. Rapid eye movements and so-called delta waves appear to offer reliable evidence. When a person exhibiting these patterns is woken up, he will say he has been dreaming. 
As a result, we have some grounds for ascribing dreams to animals (specifically mammals, with the exception, it seems, of the platypus) exhibiting these patterns. However, as I have said, all this cannot help us to answer our question. On occasion, evocation has also been defined as voluntary, that is, produced in us at will, in contrast with involuntary perception.1 Nevertheless, I have my objections to this ‘at will’. Remember that a hearer often inevitably evokes the meaning of a received message. In my opinion, the grain of truth which makes this formula so appealing is that the occurrence of an evocation, even the inevitable and apparently passive evocation of a hearer, needs some type of simulation to be carried out by the evoker. However, before we go any further, we must first address a more fundamental question. Animal behaviour may be described as goal-driven. These goals, while they are acting as goals, have not, by definition, been achieved. A question thus arises: do animals evoke the object they hope to obtain? After hearing the bell or pressing the lever, do conditioned animals experience the image of food? At first glance, of course, our generation may feel it cannot tackle this question. It is pointless to try to pose such questions – we might think – as long as we do not know how consciousness emerges in the brain. In my opinion, we lose nothing by trying. Giving in to pessimism and resignation is what might be damaging.

1. For example, Myin & O’Regan (2009, p. 196): “A perceiver will have more a feeling of control, less a feeling of imposition, when he or she is thinking or remembering than when he or she is engaging in sensory interactions”.

7.2 Do animals have the ability to evoke absent objects as such?

7.2.1 Some potentially relevant data The only thing we know for sure is that goal-driven animal behaviour does not always necessarily require the evocation of the desired absent object. It is true that the most reliable data refer only to a very restricted field. Nevertheless, we must examine that field. In order to do so, let us look at some lines from Lorenz (1966), who took inspiration from an old experiment carried out by Craig. A pigeon that has never seen a nest has absolutely no image of a nest. Proof of this is that a pigeon will perform the ‘settling into the nest’ innate motor pattern on a typewriter or a biscuit box. However, once it has behaved in this manner, the pigeon becomes aware that these objects are not correct: it never returns to them and keeps on searching. When confronted with a nest, the pigeon tests it as it has done with all the previous objects. But now the pigeon’s “teaching mechanism” (as Lorenz expresses it) apparently emits an OK signal. The pigeon settles into the nest and does not look any further. Let me comment on this. For centuries this kind of behaviour has been attributed to instinct. Thus, the pigeon would presumably have the instinct of looking for a nest, and young mammal animals would have the instinct of looking for their mother’s milk, even though, as new-born animals, they could have had no previous experience of milk. It was Lorenz who convinced us that we had to go on looking into this: the term “instinct” is nothing but a pseudo-solution along the lines of the ‘virtus dormitiva’ (I believe the criticism in Scholz, 2002, e. g., does not in any way prevent us adopting this part of Lorenz’s theory). Animals’ innate baggage cannot be an image of searching or a piece of knowledge: if this were the case, they would not look for their goal in inadequate and outlandish stimuli. But neither can it simply consist of pure ignorance: if this were the case, they would be unable to opt for the right stimulus. 
This had already been highlighted two and a half thousand years ago. Looking for support for his theory of Ideas, Plato formulated what is known as Meno’s paradox: “If you already know, you will not ask; but if you do not know, you cannot ask either”. Of course, Plato was only referring to human questions or intellectual seeking. I will focus on this level of the dilemma in Chapter 20, and will highlight the differences between the respective solutions for each level. However, Plato’s basic problem coincides exactly with the one illustrated by Lorenz’s inexperienced pigeon. What is required in both cases, the only route off the two horns of the dilemma, is an empty but also well-defined profile.2 Needless to say, the concept of this empty profile coincides with the concept of expectation, which we used above, in 1.4.1. Such a profile, being as it is well defined, can emit the instructive report (i.e. the ‘OK signal’) in the presence of appropriate stimuli, or the ‘NOK signal’ in the presence of inappropriate ones. Likewise, since the profile is an empty profile, there is no contradiction in possessing it and, at the same time, testing inappropriate objects. As Lorenz formulates it, innate consummatory patterns would be a ‘teaching mechanism’. If we accept this, we will also have to accept that the inexperienced pigeon’s behaviour involved no image of a nest and that, consequently, there are at least some cases of goal-driven animal behaviour which can take place without the ability to evoke absent objects.3 This is the point that was of interest to us. The above-mentioned cases involve the search for and recognition of appropriate stimuli that had never previously been experienced by the subjects. The question now is this: do we have the same situation when the appropriate stimulus has been previously experienced by the subject? Let us consider Pavlov’s dog after his conditioning. When, after hearing the sound, the dog slavers expectantly, is he evoking an absent piece of meat? 
Is he “prefiguring” the piece of meat? (Hollis [1982]). Likewise, in more instrumental conditioning: should we assume that the animals are evoking the reward they had obtained so many times before? Lorenz has made clear that some goal-driven behaviour can function without evoking the absent external object at which it is aimed. What we are asking now is whether Lorenz’s statement can be generalised to all animal behaviour.

7.2.2 Outlining one possibility Let’s start by making some remarks. The ultimate goal of animals’ behaviour is always the satisfaction of an innate consummatory pattern, or a particular combination and dosage of a number of innate consummatory patterns. I want to stress that in the

2. “Docta ignorantia” are the words used by St. Augustine of Hippo when he reformulates this Platonic topic.
3. This would be a paradigmatic example of how useful it is for cognitive science to pay attention to animals. Cf. Clancey (2009, p. 25): “Perhaps the oddest disconnection in this science is the study of cognition by early AI and cognitive scientists without reference to animal research”.


innate consummatory patterns I also include patterns linked to social behaviours such as, for example, those concerning dominance or suitable group integration. The external objects that are the goals of the behaviour of experienced animals (that is, of those animals which have become experts either in a natural environment or through the conditioning procedures carried out in a laboratory) are never the ultimate goal, but only sub-goals, or sub-sub-sub-... goals. The expectation of these subgoals allows the animal to recognize in the environment some of the clues it previously learned to associate with the innate consummatory pattern. These (second, third or fourth order) sub-goals need to have been experienced previously by the animal. However, they do not inevitably involve evocation. Such evocation may not exist in animals: This is precisely the possibility about which we must ask.4 If Lorenz’s hypothesis allowed us to imagine a well-defined goal which, nevertheless, was empty of any content and was therefore entirely different to the ability to evoke absent contents, we must now try to resist as best we can explanations based on the evocation of absent objects. Although the expectation of a particular absent object has sometimes been called ‘search image’, it does not necessarily entail an image of the object as such, but only the expectation of a subjective effect. Of course, the absent object may be either the appropriate stimulus for the consummatory pattern (or particular combination of consummatory patterns) activated at that moment, or a stimulus highlighted during the animal’s training as a means for reaching, step by step, the ultimate goal. In any case, however, the expectation of an absent external object – this is the possibility we are focussing on – would only define a location in the subjective landscape (or gradient) that the animal’s learning would have been shaping around that particular consummatory pattern. 
It is true that, by granting expectations, goals and some type of consciousness to animals – by granting this even in their conditioning – we are moving away from what we might call a behaviourist framework. However, behaviourism’s theoretical askesis should not be completely dismissed. We should not overlook the possibility that the evocations of the absent goal may be an unnecessary and incorrect anthropomorphisation. Let me summarise what I am suggesting. Animals would have only a defining subjective profile of the external absent objects they are looking for, or, in other words, they would have only a defining profile that includes no objective feature of those objects.5 Is there any evidence to support this generalisation of Lorenz’s discovery? I think that it may be reasonable to interpret some recent experimental results in that way.

“Trained rats do not press more vigorously when hungry than when non-deprived unless they have had experience with the food when hungry”: Dickinson & Balleine (2000, p. 194). Thus, we have to assume that the instrumentally conditioned responses less immediately related to the consummatory goal could not be re-evaluated as such. This is why in the original experiment the rats did not press the lever more vigorously when hungry. How are we to interpret these experimental results? A possible interpretation is that animals would be unable to evoke the food they are actually looking for. In this case, the rats would only be experiencing the expectation of the satisfactions they had previously learned to associate with their pressing action. Consequently, since the satisfactions had never been intense (rats had never had experience with the food when hungry), the rats could reveal no excessive longing. This interpretation is also compatible with the experiments of Parkinson et al. (2005). These authors have observed that “whilst U(nconditioned)S(timulus)-directed behaviour was abolished following devaluation, the conditioned stimulus acting as a conditioned reinforcer supported the acquisition of instrumental responding” (p. 19). We would make just the same comment on another experiment by Dickinson & Balleine (2000).6 For some weeks some rats had been trained to eat a certain type of food in place A and a different type of food in place B. In one container there was a very nourishing water solution and in the other some dry pellets that were also very nourishing. Given that during that period the rats were very well hydrated, it did not matter which container they ate from. The experiment as such begins when the rats are thirsty for the first time. Which container will they go to? Will they perhaps go en masse to the container with the water solution? The result was a disappointment for those who believed in the intelligence of rats. The rats went to both containers without distinction. The authors conclude that “shifts in motivational state do not necessarily have a direct impact on instrumental performance. Rather prior experience with the reward in the shifted state is required if the current motivational state is to control performance”. Thus, we might suggest that it is only the rats’ ability to evoke the food available to them in each place which would fail. Since they are unable to evoke such an image, their only guide will be the expectation of repeating the pleasure they had previously experienced in each place (a guide which generally works very well in the wild, although it may become bankrupt given the complicated situations of the experiments). In conclusion: Admittedly, rats’ inability to evoke is possible, but it has not been proven.

4. According to this possibility, animals would possess recognition memory, but they would lack the ability to evoke. Note that even the allegedly sophisticated ‘planning in the context of tool use’ (see, for example, Osvath [2009]) can be explained in this way.
5. It would not only be about the animals’ goals. The expectation without evocation would also take place in many processes with human beings. In this sense, the research carried out by Bargh and his associates works as an argument in favor of our suggestion (see, e.g., Fitzsimons & Bargh [2005]). In addition, Zedelius et al. (in press) argue that (although rewards obviously influence the way we maintain and act on information relevant to attain our goals, and as such are central to successful goal-directed behavior) “conscious reflection on valuable rewards can be detrimental to performance when it interrupts an ongoing active maintenance process.” Or in more concrete words, “once one is busy with making money, valuable rewards are best taken unconsciously.”
6. This experiment was extensively studied and commented on by Papineau (2001), although with a purpose different from mine here.


7.2.3 Is a clear answer to be found in research with chimpanzees? Now we must examine data relating to chimpanzees. It has been shown that chimpanzees recognise photos of known objects. We humans can, of course, interpret a photograph or a painting as a symbol corresponding to a real object. On one hand, we – perceptually – address the photograph and, on the other, we – imaginatively – address the reality photographed, without ever ceasing to consider it as absent. Do chimpanzees recognise photographed objects in the same way? Do the data available allow us to state that chimpanzees evoke an absent object as such? Or, in other words, do chimpanzees interpret these photos as symbols? I think this is not the only possible interpretation. The virtuosity, so to speak, which animal perception has attained can cope with high levels of stimulus degradation. The contrast with old-style computationalism has highlighted this clearly. The brain, both of humans and higher animals, continues to recognise an object even though it is only partially perceived or lacks some of the features typical of its class. That ability was clearly advantageous and for this reason it was continually reinforced in evolution. Compared with the achievements of today’s robotics, animal perceptions are amazing in this regard. Thus, it is possible that chimpanzees see the photo of a dog as some sort of dog. Certainly, a two-dimensional and smooth-to-the-touch dog is quite abnormal. However, if it turned out that animals were incapable of evocation, then the symbolic interpretation of photos would be a resource unavailable to them. In this case, their abilities of perceptual recognition would carry out their task even though the final results so obtained were abnormal. Since photos are not found in the wild, why should there be a device preventing chimpanzees from achieving that result? 
The only thing they would do is to keep on observing and verifying until they realised that this strange dog did not fulfil their usual expectations.7 The experiment carried out by Premack (1971) with the famous Sarah merits special mention. This chimpanzee had learnt to perfection that a blue triangle meant apple. Likewise, she had also learnt to place a triangle next to something triangular in shape and a circle next to things that were more or less spherical, or pieces of blue plastic next to blue things, and yellow pieces next to yellow things. When Sarah had learnt to do this, Premack asked Sarah to place something next to a blue triangle, giving her the choice between a circle and a triangle, and also between a piece of a blue or yellow plastic. Understandably, Premack had been careful to ensure that all the apples given to Sarah were yellow. The widely-known result was that Sarah chose the circle

7. In addition we must pay attention to the fragility of children’s recognition of iconic symbolic movements prior to the age of 2 years. Cf. Namy (2008, p. 845): “The evidence fails to support the traditional perspective that iconicity facilitates the onset of symbolic development. The ability to decipher the similarity between a representation and that which it represents does not appear to come ‘for free’.”


and the yellow colour to place next to the blue triangle. Does this prove that Sarah evoked the apple?8 My view is that it is not proof. The chimpanzee’s expectation of an apple was clearly activated, as she had seen the blue triangle. Let us suppose that this was an empty expectation, and did not involve any evoked image. This assumption is in no way incompatible with Sarah’s response. Animal expectations are used to recognise any possible stimulus that is appropriate. Consequently, Sarah recognised the features of ‘yellow’ and ‘round in shape’ when she found them. Mere recognition memory might be enough. Retrieval did not necessarily have to occur. Nothing guarantees that the evocation of an absent apple had to have taken place. For the time being, the matter has reached stalemate. This judgement must be applied not only to the specific question about Sarah, but also to our question about the possible generalisation of the lessons learnt from the inexperienced pigeon to the rest of animal behaviours, or in other words, whether the ability to evoke absent objects would occur in non-humans.9 (Many philosophical works have focused on this issue, for example, Langer [1941]) In short, it is possible, but only possible, that the ability to evoke absent objects as such is an exclusively human one. In this case, we must ask ourselves how this symbolic ability came about. Admittedly, the question itself would be hypothetical, not only its possible answers. Nevertheless, this should not lead us to avoid asking it. If we are aware of the level we are working at each moment, then we can work at any level. Thus, I wonder if the ability to evoke has anything to do with the ability we have been describing in this book, that is, the ability to perceive a radically not-own self.

7.3 How then would evocation have originated?

We shall set out in this subsection from the assumption (which is, I repeat, an entirely hypothetical one) that the conditioned responses of animals in no way require the evocation of a goal. Thus, we shall ask how the symbolic ability or ability of evocation would have originated in human beings.10 An initial suggestion might be that the ability to evoke absent objects will not take long to appear once a second, or not-own, centre has emerged inside one’s own mind and contents impossible to identify with one’s own expectations have begun to be simulated. However, this formulation is very far from even beginning to hypothesise about the origin and development of the ability to evoke. We have merely said that what is required for the pointing gesture offers a platform, so to speak, for the evocation of absent objects. But we must move beyond this empty statement. How did this ability to evoke or symbolic ability originate? A preliminary point we must highlight now is that the symbol goes beyond the pointing gestures designed to direct and guide others’ attention. We must not confuse these two phenomena. Certainly, this confusion is universally rejected. However, it seems at times to be involved in some explicit formulations. If a study investigates language in very young children, then its author will tend to focus again and again on the fact that with linguistic symbols the speaker tries to share a perception with another individual and direct the other individual’s mental state towards something in the environment. As we can see, this does not do justice to the obvious differences between pointing and linguistic symbols. With symbols, but not the pointing gesture, attention can be directed toward absent elements, past scenes or fictional objects. There are indeed pointing gestures ad phantasma, but these gestures cannot take place without the appropriate verbal context. Consequently, the primary pointing gestures can by no means have an objective disassociated from the space that physically surrounds the producer. Symbols, in contrast, can. Why have I used the term “linguistic symbol”, and not the term ‘word’? Firstly, because not all words are symbols. Secondly, because the symbolic ingredient never forms the whole meaning of a word. The words of our language are always elements for syntax, and their meaning can neither be conceived of nor learned apart from syntax. I will deal with all these issues further in later chapters. We must, I repeat, ask ourselves what possible relation there may be between the ability to evoke absent objects (which can occur perfectly well in a person who is completely alone) and the radically interpersonal character of the ability involved in pointing gestures. How does the ability to evoke develop in children? It is now appropriate to take a look at children’s symbolic play. (In fact, symbolic play is not only able to provide clues about such a development, but it also reveals that the ability to evoke is a complex ability that is difficult to perfect: the different kinds of play, as we saw in 5.2, are always related to abilities which need to be developed and promoted through exercise. Certainly, symbolic play does not prove in itself the human exclusivity of being able to evoke. Certainly, some behaviours of apes have been interpreted as symbolic play. However, I think it can be stated that the thesis that the evoking ability is present in animals is weakened by the fact that children devote several years to symbolic play.) All this leads us on to the subject of the next chapter.

8. A similar question, although not necessarily with the same answer, has been asked in relation to human ontogeny: see Saylor (2004).
9. Liszkowski et al. (2009) have shown that pre-linguistic infants will point to non-existent entities (e.g. the plate where the biscuits used to be) while chimpanzees will not. Is this conclusive evidence in favour of the human exclusivity of evocation? Unfortunately (for my purposes) I cannot give a clear affirmative answer. We do not know for sure if the key element of the difference is just the power to evoke or if other factors are playing a role too.
10. The terms ‘symbol’, ‘symbolic ability’ or ‘symbolic resource’ have been used in many different ways by different authors (amongst them, we are obliged, of course, to mention that very important book by Deacon [1997]). Thus, I must say that what I am focusing on is the very ability to evoke absent objects.

Chapter 8

Symbolic play

Developments in the simulatory centre

We need to address how exactly symbolic play operates. (‘Symbolic play’ is the term that Piaget uses; other researchers use ‘pretence’.) Or, in order to further define our topic, we have to see if symbolic play is related in any way to the postulated basic human ability to perceive a radically not-own self. One difficulty that appears to prevent this relationship derives from the fact that symbolic play lies outside communication and interpersonal processes. In children, symbolic play may at times be shared with the mother or with another child; however, on many other occasions symbolic play is individual. It seems unlikely, therefore, to be related to that basic ability. However, let us proceed with the task in hand, and leave behind such pessimistic predictions.

8.1

Describing symbolic play

Children’s symbolic play consists of the performance of motor patterns outside their appropriate context. This will do as a first description of symbolic play. The child acts as though she were galloping, but she has no horse. She may do so using the broom from the well-known example, but, equally, she may also do so with no prop at all. The key factor is the absence of the appropriate object. Another ever-present feature is the following (on which Sánchez de Zavala [1997] places particular emphasis): the motor patterns performed by the child always correspond to a type of conduct that is appropriate only to the symbolised object. A horse, or, to use a different example on this occasion, a telephone, can in reality be the object of a number of very different actions: it can be cleaned, stored in a drawer, be moved toward the end of a table... However, there will be only one action related to telephones in the child’s symbolic play (leaving aside the much later narrative dramatisation games), namely, pretending that she is talking on the phone. Symbolic play occurs in all children. Furthermore, its occurrence is triggered by a biological clock, so to speak. Consequently, we can state that it is a universal trait of the human species. The pleasure the child gets from those games and which pushes her toward them would have been selected by evolution. The immediate question is, of course, whether symbolic play is useful. What advantages could it bring? In 5.2 above, we looked at this question in relation to playing in general and also, more specifically,

to interpersonal motor co-ordination games. Now we must answer the same questions, but in relation to a different type of play.1 As Piaget (1945) highlighted, no new motor patterns are learned in symbolic play, nor is the pattern adapted to new circumstances in any way. Symbolic play is, thus, clearly not useful for motor learning. What is more, in one respect, symbolic play presents danger and, consequently, certain disadvantages. The climax of symbolic play is a stage when the child is learning the uses and names of things around her. During that phase, using an object as though it were something else, or calling it by the name of something else, could have a confusing and disturbing influence (see Leslie [1988]; see also a dossier by several authors in Developmental Science, 2002, 397–426). However, despite all this, the reality facing us is that symbolic play occurs in children in all societies and cultures. Symbolic play must, then, provide great benefits: this conclusion is unavoidable. We must now establish what these benefits are. In symbolic play, the child recreates a scene which, either because it has been repeated often, or because she finds it interesting, has left a footprint, we might say, in her mind. She recreates it when that scene has long disappeared, when that particular agent and environment are no longer perceptible. However, the child does not stay still in order to create this evocation. On the contrary, she performs an approximate copy of the movements the agent had made on certain objects. Since those objects are no longer available, the child’s movements are performed in vacuo or on inappropriate objects. These movements can be as acultural as those involved in raising food to one’s mouth and taking a bite. 
While there is no doubt that raising food to one’s mouth is a primary and animal behaviour, in symbolic play (let us think, for example, of ‘pretend eating’) those movements acquire a new quality and are radically transformed – they are transformed into ‘pantomimes’.2 How are pantomimes carried out? Or, in other words, how are movements in symbolic play controlled? The key to these movements – I shall suggest – lies in them being the reproduction of a previously perceived model. However, let us first analyse this change from a wider perspective.

1. Here I will answer in a way (or with an emphasis) slightly different from Bejarano (1995).

2. Let us remember that repetitions in vacuo of innate motor patterns have actually been observed in animals. For example, according to Lorenz, when the hormonal drive to build the nest has reached its height, and the caged bird finds no material on which to perform its innate motor pattern for nest building, this motor pattern will be performed in vacuo. If we remove from this description anything that may be influenced by Lorenz’s generally less accepted views (specifically, by his ‘hydraulic model’ on animal drives), we are still faced with a performance in vacuo. However, the differences between this ‘in vacuo repetition of movements’ and the repetition that occurs in pantomimes are very clear: in the animal’s in vacuo performance, motor patterns are, point one, linked to a strong hormonal impulse, and, point two, there has been a long and unsuccessful period of searching for the correct stimulus.

8.2

How movements have adapted throughout evolution

Animal behaviour gained in flexibility throughout evolution. Replacing fixed and stereotyped movements, evolution successfully produced an extremely subtle ability to adapt to the specific circumstances of each occasion. Compare, for example, a horse and a car. For all that a car may be more powerful, there is one respect in which it cannot even come close to the achievements of evolution. A horse adapts spontaneously to the terrain, and changes its movements depending on whether the ground is wet or not, whether it is on a slope, or whether there are pebbles underfoot... How does animal conduct achieve this wonderfully flexible adaptation? The movements are not set in advance (Thelen & Smith [1993], for example, insist on this point). The only thing rigidly fixed is the expectation of the objective. If, on the other hand, there were a pre-set motor plan, no adaptation could then occur. The specific circumstances that might arise could not then have an influence over the movements. The transition from pre-set motor stereotypes to flexible adaptation to circumstances occurred gradually in evolution. There are movements in rats where this transition is still not complete. We should remember Eibl-Eibesfeldt’s (1975) observation: the motor pattern of beating the materials during nest-building – the ‘beating-to-plump-up’ pattern – can only be performed at a specific height. Such motor setting is adaptive under normal conditions. It is precisely the height most suitable for the materials to reach. However, beyond these set patterns in rats, evolution – which is what interests us – continued to improve this flexible adaptation to circumstances. In mammals, this ability would be already well developed. And, of course, it is preserved in human beings. Nevertheless, we human beings have periods during which our movements temporarily reject that wonderful flexible adaptation to circumstances.
Those periods constitute the change to which we referred above, the change which occurs in the ‘pantomimes’ of symbolic play. Piaget (1945) saw clearly that this adaptation is absent from symbolic play, but the way he interprets this fact is disappointing. Invoking the Freudian dichotomy of the pleasure and reality principles, he interprets symbolic play as an escape valve for the pleasure principle. At that stage, he tells us, children are constantly learning and constantly adapting to reality, and use symbolic play as an occasional escape from this pressure. Nowadays, such Freudian speculations have been left behind, and we know that a place has to be found for pleasure in the framework of evolution and adaptive advantages. Lorenz made clear that pleasure is originally a teaching mechanism which allows the organism to recognise the appropriate stimuli for each innate consummatory pattern. Consequently, we roundly reject Piaget’s interpretation of symbolic play. However, this raises once more the question of what the absence of adaptation means in symbolic play. What can a game with these characteristics be useful for? Movements in symbolic play no longer adapt to the environment. However, they have shaken free of compliance with their environment only in order to comply with a model. Control has now been handed to the previously perceived scene. Now, children

follow a pre-set plan – the motor plan they forged in imitation of the movements of the agent in the past scene. One might say, therefore, that the adaptation has changed, rather than having been lost. The difference between the old and new adaptation is very clear. During the period of symbolic play, the child’s movements, since they are performed without an appropriate context, and since they do not, therefore, need to adapt to circumstances, would adapt to the movements of the model.

8.3

Is simulation linked to the real movements of symbolic play?

Adaptation to the model and not to one’s own current environment is also a characteristic of what we are calling here (in the chapters about pointing gestures, or four-hand cooperative actions, or Saussurean parity) true motor simulation. However, we must not forget that the differences between motor simulation and symbolic play are highly marked. Motor simulation, as we have been proposing in previous chapters, would be assigned to a second mental centre. It is to a radically not-own self that the simulated movements are attributed. The connection between my conduct and that second centre of my mind would be broken. For this reason, we had said, it is not necessary to inhibit motor simulation. The latter differs from real movement from its very origin. In contrast, the essential feature of symbolic play is real and manifest movement. Consequently, symbolic play is very different from motor simulation. Certainly, in symbolic play, the child – I repeat – would not be simulating the movements of the model. It is obvious that they are not simulated, but actually performed. However, two characteristics related to the simulatory centre appear – I wish to suggest – in those real movements of symbolic play. Firstly, the kinaesthetic interpretation, made during the child’s perception of the model’s actions, would be controlling her. Secondly, she would be simulating or evoking visually the absent model scene. This double involvement of the simulatory centre is the possibility that we shall explore here. To this end, we shall begin by addressing the difficulties, and by acknowledging that, at the moment, this all appears to contradict what we have said up to now about the simulatory centre. We shall now analyse those two features in order to detail the presumed contradiction or difficulty involved. Only later will we ask if such difficulties are insurmountable. (This presentation could possibly be carried out without having to start it off with its difficulties.
If I have chosen this path, it has been – I admit it – because it seemed easier. Or, to put it otherwise, the struggle – real and long – against those difficulties is so close to me, that it is extremely difficult for me to break that link in the presentation. This should work, if not as a justification, at least as an excuse if the reader thinks that the way objections are raised and then solved is rather contrived.) Let us begin with the first of those two characteristics. Before performing symbolic play, the child observed actions in a model. The visual perception of that model scene is accompanied in the child by a kinaesthetic interpretation of the movements of

the model. The presence of this prior kinaesthetic interpretation is more than merely assumed. As I said in 6.3.2, it is known that the premotor area of the brain is activated during ‘observation for copying’ of motor patterns, but not during observations that do not have this purpose (Decety & Grèzes [1998]). All symbolic play would thus have, I insist, a prior phase in which the actions of the model are interpreted kinaesthetically by children. Thus far, everything appears to be going well. We should note, however, that the model was not interacting with the child at all: this is the difficulty we had announced. In accordance with what we said in the first chapters, the simulatory centre becomes necessary only at the very moment the observer is kinaesthetically interpreting a model which is actually interacting with the observer. Why, then, would the simulatory centre have to be involved in that prior phase? The difficulty is the same or worse with regard to the second feature. The visual content that would have to be simulated in this case is by no means an intrinsically not-own perception that is impossible for the simulator. Far from this, the content of the evocation involved in symbolic play would consist of the past scene that the child itself had observed. The contradiction with what we have said about the simulatory centre in earlier chapters has reappeared. We shall attempt to overcome these presumed obstacles to the link we have suggested between symbolic play and simulatory centre. But first let me compare that suggestion with a recent hypothesis. This will make it easier to shape our attempt to connect motor simulation and visual evocation.

8.4

What is repeated in vacuo is a previously perceived model, not one’s own behaviour

Let us now comment on the theory put forward in Thomas (1999) about the origin of the imagination or the ability to evoke absent objects. For this author, imagination comes from the ‘repetition in vacuo’ of the activity linked to visual perception. Certainly, it is indisputable today that all visual perceptions, including those where the perceiving subject appears to be most at rest, involve some activity, even if only as a result of saccadic eye movements. Nevertheless, the question of interest to us is whether the evocations would actually be produced by the repetition in vacuo of such movements. The terms that it is useful to compare are, on one hand, the in-vacuo repetition of saccadic movements or of any other activity linked to perception, and, on the other hand, the in-vacuo repetition involved in symbolic play. What is the difference between the two? Or, in other words, why do I believe, against Thomas, that absent objects would be successfully evoked as such in symbolic play, but not in the other type of in-vacuo repetition? The movements repeated by the child in symbolic play are the ones he observed in the model scene, and interpreted kinaesthetically during that observation. It is in this prior kinaesthetic interpretation that, contrary to Thomas, I believe the key lies. The

special character of the motor patterns in symbolic play would not occur only in the out-of-context or in-vacuo muscular display in the game. That special character would already have occurred before, during the ‘model perception and kinaesthetic interpretation’ phase. The patterns extracted on that occasion by the child were controlled by the model, and not by the real environment. This is the key for my disagreement with Thomas’s hypothesis. In real perception, saccadic movements are controlled by the environment. In contrast, in the phase prior to symbolic play, control of the patterns that the observer is extracting has been left to the model. However, why should the kinaesthetic interpretation of a model be so important? Why should the evocation of absent images as absent be linked to such a kinaesthetic interpretation of a model? To start with, it should be noted that with this suggestion we obtain a result that is beyond the reach of Thomas’s hypothesis. In our suggestion, symbolic play, with its continual and obvious imitation of previously observed actions, connects with the ability to evoke. Such a connection is not only intuitively comfortable, but, above all, successfully ascribes an adaptive advantage to the child’s symbolic play – no less than the advantage of being able to exercise the ability to evoke. Perhaps, if Thomas had explored the origins of this ability in children, rather than focussing only on adult imagination, he would then have come to a very different type of hypothesis.3 But let us return to the question. Why should the kinaesthetic interpretation of a model make all the difference? A first response might recall the similarity between the imagining, or evocative simulation, of absent objects and motor simulation. Certainly that idea may be very attractive. 
However, the following two questions would have to be resolved before proposing such similarity: firstly, are the kinaesthetic interpretations of the prior observation phase really true simulations; and, secondly, why should the muscular and displayed reproduction of the movements of the model lead to the visual evocation of the now past model scene? In short, we will have to explain how our proposal might, without contradicting itself, include that interpretation and evocation in the second mental, or simulatory, centre. This is precisely the goal we were setting ourselves at the end of the previous subsection. Let us begin this task without delaying any further.

3. At least, it would have been more difficult for him to overlook the imitation of not-own actions that is involved in symbolic play. When is it particularly easy to overlook this? When we fail to bear in mind linguistic mediation and inner speech, both of which would be participating in adult imagination. But this is not the right moment to address this issue.

8.5

The big extension of the simulatory centre (i.e., the new function that got to be performed by this centre): How would a truly simulatory interpretation of a non-interacting model have been achieved?

We will focus for the moment on the first question, where we have already discovered the heart of the problem. Interpersonality in the strong sense – the interpersonality that we saw in the basic ability – is absent in symbolic play. Neither the rider nor the horse communicated with the child or interacted with him at all. How is it, then, that kinaesthetic-postural a posteriori expectation is not sufficient for the child to pay attention to the internal aspect of those movements of the model? Why would the second or simulatory mental centre have to intervene while real models were being observed? On the basis of what we have hypothesised thus far, it is only when the other person is looking at me or communicating with me that kinaesthetic-postural expectation is insufficient to attend to the internal aspect of the not-own movements. That is what we have said in previous chapters. Why then are we now changing, and suggesting that observing the movements of a model is sufficient for the simulatory centre to intervene? (I could be told that the answer is right at hand. It could be enough – an objector could argue – to remember what was said in 1.5, namely, that once a resource has been created, it is possible for this resource to be employed in some functions which, independently, would absolutely not have had sufficient strength to create it. Or, even more simply, it could be enough to distinguish between ontogenesis and phylogenesis. But in my view these answers would not be enough. The distinctions suggested by the objector, although perfect to make us expect the appearance of new uses for the simulatory centre, do not explain why the new function we are proposing did in fact appear. Therefore, my question still stands.) I will present some evidence later in favour of one particular answer to that first question. But for now, I shall summarise that particular answer without arguments or details. A bird’s-eye view will be useful, as this is an important question within the framework of suggestions I am making.
The simulation involved in the ‘basic human ability’ is extended and becomes more powerful when it changes in two inter-related aspects. This is the first big milestone in the derivation from the basic ability. The key for both changes is the imitative learning of complex motor patterns. With the arrival of this learning, what then happens – the first change – is that my kinaesthetic interpretation of the movements of a model which is not interacting with me at all will, nonetheless, pass through the simulatory centre, and also – the second change – the simulatory centre, which was until that point disconnected from real and non-simulated movements, begins to control those real movements which adapt to a model and not to the environment. In the initial appearances of the basic ability, simulation never reached the point of being muscularly displayed. What would have been the use in simply repeating the posture and movements of the person communicating with us by means of a pointing gesture? Understanding this gesture requires me to perceive which object I am being asked to look at, and actually to look at it. The movements I use to look at it are completely irrelevant. In short, the simulation of such not-own actions of which one is recipient is neither a muscularly-displayed imitation, nor does it even have to lead to such imitation.

When does this change? When would the simulatory centre become involved with real movements? This would occur with imitative motor learning: A learned motor pattern will only be useful, obviously, if the pattern learned is reproduced in a displayed fashion. Which mental centre will manage that muscularly displayed reproduction? What I suggest is that the primary or non-simulatory mental centre, that is, the one we share with animals, is unable to manage movements whose control depends less on the real world than on the model. As a result, with the performance of movements adapted to the model and not to the environment, simulation would have taken that enormous step of successfully managing real movements. Since then, real movements in the whole body can occur in two different ways: either by adapting to the environment, or by adapting to the model. Since then, therefore, real movements throughout the body will have to be spread between these two modes or levels.4 However, all this – simulation managing real movements – is the consequence of the first change. All this really began when the kinaesthetic interpretation of a not-own body began to be made from a complete motor sequence and not from a simple movement. (It should be borne in mind that it is only on complex motor patterns, or, in other words, on motor sequences, that genuine motor learning can occur.) That kinaesthetic interpretation – or ‘latent imitation’– would complete the sequence before the displayed imitation could even begin. It is precisely this latent sequential imitation that is beyond the reach of mere expectation.5 It was then that the simulatory centre was required to interpret kinaesthetically for the first time the movements of an individual who had absolutely no interaction with the interpreter. Let us see why. Expectations could not in any way enable an entire complex motor pattern to be detected or interpreted. 
Expectations of any type (a priori or a posteriori expectations) can in no way take another unsatisfied expectation as their starting point. (Otherwise, they would be unable to fulfil their function as guide and selector of conduct. It is only from real postures that the expectation of results will be able to choose the appropriate movement that will successfully give life to these results.) As a result, if the interpretations or detections of a complex motor pattern were based only on expectation, they would get stuck by the second step and could not detect the entire motor sequence. The interpretation of the motor sequences of a model would be an essential prior phase for the imitation of technical actions, for the imitation of articulatory-phonetic patterns and also for symbolic play. In the first two cases, there would be genuine learning of new patterns, and also a clear benefit in employing these new patterns. In symbolic play, by contrast, there would be neither learning nor this type of benefit. However, there would certainly be adaptation to the motor sequence of a model. Here ends our bird’s-eye view. We must now return to the hard slog. And we must try to bring in some evidence in favour of the previous suggestions.

4. There are some clues to suggest that this distribution has to do with specific human hemispheric specialisation. I have come back here to the question in 3.3.3 above. There I observed that, in order to detect an interiority which approaches me, or which is staring at me, two different implementations would become necessary for the lateral axis, one spatial and behavioural and the other for motor simulation. Then I conjectured that this might have quite a lot to do with the hemispheric specialisation that is peculiar to the human brain. How would these two different implementations link with the learning of new motor patterns? This was my question in 3.3.3. Here I would ask the following question: Might the suggested ‘new function that got to be performed by the simulatory centre’ have to do with the link that we looked for?

5. An ability that would have been prior – but also close – to the latent sequential imitation is that of four-hand tasks. In the planning of the four-hand task, the result of at least one foreign movement must be imagined. Certainly, these tasks are interpersonal; likewise, in the initial four-hand tasks there would be one single imagined step. Certainly, these are two strong differences regarding the latent sequential imitation. However, we can see here a germ which would have facilitated the appearance of the true latent sequential imitation. (This interpersonal precedent would be framed – please note – by the Vygotskian ‘General Principle’.)

8.6

An indication in favour: Comparing symbolic play and adult-feeding game

8.6.1 Why might it be helpful to pay attention to this type of game now?

We were discussing symbolic play; specifically, we were discussing the first of our two questions. Why would the simulatory centre have to be involved in the preparatory phase (that is, during the observation by the child of the model’s movements), despite the model not interacting with the child? We have just suggested that this is because the movements observed by the child form a full complex motor pattern. In order to put together an argument for this suggestion, it will be useful at this point to examine a negative, a perfect inversion of the situation we have suggested for symbolic play. This perfect negative constitutes a small piece of evidence in support of our suggestion. We have hypothesised that two characteristics of the prior phase of symbolic play (namely, a model not interacting with the subject, and a multiple-step motor model) are in relationship with one another, or more specifically, that the second would compensate and dissolve an effect of the first. As a result, it may be interesting, in this light, to address a situation which is similar to symbolic play, and earlier than it, but which nevertheless lacks both of these characteristics. That double lack is just what we will find in the adult-feeding game. At this point, then, we temporarily leave symbolic play, and concentrate on another type of game, a game which appears at a slightly earlier age. Around their first birthday, children take a little food and raise it to the adult’s mouth. Bräten (1998, p. 109) is, to the best of my knowledge, the author who has taken most interest in this adult feeding. I found his description in terms of a ‘virtual other’ extremely interesting. (A more Piagetian study of levels of ‘representational play’ can be found in McCune [1995].) However, it is a more specific point that interests me right now.

As you will see, what we are now dealing with is very clearly not symbolic play. Here it is real food that is raised to the adult’s mouth. This is very different to the particular type of symbolic play that is ‘pretend eating’. Nevertheless, there is, I will suggest, still some relationship between this activity and that of symbolic play. More specifically, the adult-feeding game would be interpreted as a kind of intermediate link in the chain between the two abilities we are interested in connecting, namely, the basic human ability demonstrated in the ‘third mode of processing the not-own eye’ or in the fingerpointing gesture, on one hand, and the ability of evocation and the child’s symbolic play, on the other. With the emergence of the feeding game, the prior basic human ability would mix for the first time with a pleasurable and playful ‘delayed imitation’, but even so we would still be very far from symbolic or evocation-provoking imitation. Delayed imitation exercised without pragmatic justification is a characteristic in children that has been shown by experiment, in addition to having been observed in symbolic play and in feeding games. Even several days after they have seen an adult perform an action only once, children imitate the precise motor routes used by the adult (unless they had seen that the adult was choosing this route because she could choose no other in the circumstances, for example, her hands were full or tied). For each action several even easier means of performance were possible. However, the child faithfully reproduced the one the adult had chosen. The child visibly enjoyed performing that exact copy. See Meltzoff (1988) and Gergely et al. (2002). We might, thus, think that the adult-feeding game has a specific role as a bridge between the basic human ability and symbolic play. But first we need to compare symbolic play and the adult-feeding game, both in their respective prior phases and in their muscular displays themselves. 
We will thus understand why the adult-feeding game emerges at an earlier age.

8.6.2 Similarities and differences between symbolic play and the adult-feeding game

Receiving food from the adults without being breastfed is one of the most repeated routines of a child’s life. Months before she starts to feed the adult, a child has learned to be attentive to this adult behaviour which is so important for her. This is evident in today’s developed societies; however, we must ask ourselves if it occurs in all types of human society in the same way. We call childhood the period following on from weaning, but where the child is still dependent on the adult for food.6 How adults look forward to this stage has effects on the previous stage, that is, on lactation, and these effects are what is of interest to us here. Adults begin to offer the child little bites before she is properly weaned. In all likelihood, this begins to occur from a very early age for children in all human societies, independently of the strong differences observed by ethnographers in the age at which children are weaned. For this reason, I believe that the statement we made earlier may be generalised beyond today’s developed societies.

6. Thus defined, childhood is a stage that does not appear in non-human primates: see Locke & Bogin (2006). These authors deal in general with the lengthening and internal differentiation of the pre-adult period in humans. While in the rest of the primates, lactation and youth are the only pre-adult stages, in humans there is also childhood and adolescence. Adolescence would be useful, these authors suggest, for all kinds of learning, principally linguistic. But let us focus on childhood, during which the child receives food from the adults without being breastfed. Childhood, as we have already said, does not occur in apes. We do not know when this ontogenetic stage appeared in evolution. The ‘life history’ descriptions that Locke & Bogin offer for different hominid species are still insufficient. But let us return to the point we wished to highlight. Childhood, during which the child receives food from adults without being breastfed, would promote the child’s attention to the adult movements addressed to her. This may have been a very important factor in the appearance in evolution of the second mental line.

Let us return now to our argument. From before her first birthday, the child would have accomplished something significant in relation to the movements of the adult who offers her bites to eat. This would be a much more difficult achievement than the conditioned salivation response: the child would have succeeded in interpreting those movements kinaesthetically. This interpretation, as we already know, can in no way be made using mere postural a posteriori expectation. It has to be simulated in a second mental centre, i.e., in a radically not-own interiority, since the child could not otherwise simultaneously understand that the adult’s movements are directed at her, the child. Consequently, that kinaesthetic interpretation would be in line with the basic human ability we have detected in the third processing mode of the not-own eye or in pointing gestures.

What takes place in symbolic play in this regard? On one hand, prior observation – or, in other words, kinaesthetic interpretation – of the conduct of others would occur in symbolic play also. The child would have been watching the horse with enormous interest, before using a broom (or, indeed, nothing) to ride on. This is a similarity shared by both types of game. On the other hand, however, there is a very important difference between one case and the other. In symbolic play, the model agent was not interacting with the child at all. By contrast, when the child observes and kinaesthetically interprets the adult who gives her a piece of food, the adult behaviour observed was, I stress, a behaviour directed at the child. The attention paid by the child to these adult movements is an absolutely privileged place for the emergence of genuine simulation.

Now, comparing the second phases, or displayed action phases, of both games, we find a new mix of similarity and difference. The similarity lies in that, in both cases, the game, strictly speaking, consists of a muscular and displayed behaviour which is inspired by the behaviour of the model, and which is, therefore, a delayed imitation. But let us see how that muscular performance differs in each of the games. In the feeding game, the performance is on real food and in that sense is, therefore, adapted to the environment. The child, I stress, wants to copy the adult behaviour that she has witnessed so often before. However, what is being imitated is only the type of behaviour, not the movements with which this behaviour will be implemented. The child’s movements are still being controlled by the environment; in other words, these specific movements are still determined here, that is, in the adult-feeding game, by the type of adaptation which, we said, had been improving throughout animal evolution. This is very different, therefore, from pantomime or symbolic play. Once again, one can clearly see why the feeding game emerges at an earlier age.

8.6.3 The simulatory centre in the feeding game and in symbolic play

Up to now, we have alluded several times to the possibility that symbolic play derives from, and necessarily requires, the second, or simulatory, mental centre. We are already aware of the obstacles that appear to oppose this possibility. Limiting ourselves for the moment to the motor level, there are two. The first of these arises in relation to the observation phase: if the model does not interact, how is the expectation resource then insufficient for that model to be interpreted kinaesthetically by the child? As far as the game itself (or, in other words, the displayed phase) is concerned, the obstacle would be formulated as follows: how is it that the simulatory centre will take charge of real behaviour? In order to deal more explicitly with this latter obstacle, it will help to address the second difference described above, between the feeding game and symbolic play. It is only in the action of symbolic play that there would be full motor adaptation to the model. In the feeding game, we said, the movements follow the old type of adaptation, that is, they continue to be under the control of the environment; in symbolic play, by contrast, the movements themselves, by being performed in a vacuum or on an inappropriate object, come under the absolute control of the model. That contrast is almost certainly responsible for another contrast between both games which, although more subtle, is also observable. Only in symbolic play or pantomime do we see this phenomenon which so many authors have highlighted, and which we might call the emphatic transformation of the motor pattern.7 However, let us concentrate on motor adaptation to the model.

7. Recently, it has been observed that this same emphatic transformation also occurs in the routines that the adult wishes to show to the child (Brand et al. [2002]). ‘Motionese’ is the term that, inspired by the linguistic ‘motherese’, has been coined to designate those motor patterns provided for the child’s observation. In those movements, the adult would be focusing above all on the motor model which she is offering the child, and as a result, even though the movements may apply to the real environment and may not, therefore, totally break away from the primary adaptation, there is a considerable amount of pantomime or, in other words, of the new type of adaptation. That would be high-level scaffolding. (Brand & Shallcross [2008, p. 859]: “Infants show evidence of preferring motionese to adult-directed action”). It is significant that multimodal cues from motionese have been studied in order to reduce learning complexity in robots: see Rohlfing et al. (2006), who observe in motionese a greater proportion of pauses relative to actions and a less smooth path through space than in adult-directed actions. These characteristics may serve to highlight unit boundaries within the flow of motion, as Brand et al. (2002, p. 80) suggested (“If our suspicions here are correct, then infants exposed to motionese for a given action (or class of actions) should show greater sensitivity to the structure within such action than infants who have only encountered the action in adult-directed form”). Even more recently, it has been observed that the communicative movement of taking an object to show it to somebody, even when that movement is targeted to an adult, can have a certain motor particularity which can be distinguished from the grasping which does not have a communicative function (Sartori et al. [2009]).

Let us summarise what happens in each of the games. Is there motor adaptation to the model? Not in the feeding game. But in symbolic play, there is. Is there interaction between the model and the child? There is in the feeding game. But not in symbolic play. In this description we find evidence – yes, evidence, slight as it may be – to support our suggestion about the first big extension of the simulatory centre (or, in other words, about the new function that came to be performed by the simulatory centre). According to our general hypothesis, the preparatory observation phase of the feeding game is – thanks, as we already know, to the interacting model – a situation typical of the origins of the simulatory centre.8 Given that symbolic play emerges later, it seems advisable at least to ask about the possibility that its preparatory phase also operates through the simulatory centre. But here we find no trace of that factor – the interacting model – which was responsible in the preparatory phase of the feeding game for the involvement of the simulatory centre. What might make up for the absence of that factor here? In symbolic play, the child’s actions would be controlled by a motor plan that would have been extracted from a model. That sequential latent plan would be the key to the involvement of the simulatory centre in both phases of symbolic play.

8. The simulatory centre would also be involved in the displayed phase of the feeding game, and for the very same reason, that is, due to interaction. The child, at this point, is very probably detecting the interiority of the adult (the kinaesthesia of the adult opening his mouth) with whom she is interacting.

8.7 When might the kinaesthetic interpretation of a non-interacting model become dependent on the basic human ability? Insisting on the ideas presented in 8.5

8.7.1 The core of my hypothesis: Fictionalisation of postures is required by latent sequential imitation

If we insert this point – the point about a possible link between the premotor plan extracted from a model and the basic human ability – into our framework of hypotheses, we shall find a piece of evidence which supports it. The motor imitation that occurs in chimpanzees is restricted to a single step, as we saw in 2.1.2 above. Chimpanzees are certainly able to imitate even movements that they cannot see on their own bodies. However, they are entirely unable to perform the only useful motor imitation or, in other words, they cannot successfully extract from a model a complex motor pattern which is new to them. Their ability to copy conduct does not imply any imitation of new complex motor patterns, only an “emulation” of results achieved in a specific context (Whiten [2000]; Tomasello et al. [2006]; see above, 2.1.2). This fact, I insist, is explained if we accept two suggestions: first, the absence of true simulation in chimpanzees, and, second, the restriction of expectation to a single step. But in this case, this topic is of interest beyond the specific aspect which is the reason we are looking at it. We now see that the same reason that makes simulatory processes necessary for symbolic play makes them equally necessary for true motor learning. This extension of our topic, clearly, is good news: no-one doubts the links between motor learning and culture.

But let us return to our core question: why would the kinaesthetic interpretation of a whole motor sequence be so demanding? Let us begin by looking at the ‘imitation of complex motor patterns which are new to the subject’ in Piaget. He maintains that, prior to any displayed reproduction, this imitation or motor learning requires there to have been, during observation of the model, a ‘latent imitation’ of the entire pattern. It is difficult to dispute this point in Piaget, I think. A complex motor pattern has to be sustained by a unitary plan; it has to constitute, to use a term much used by Luria, a kinetic melody. Consequently, it is impossible to learn it by muscularly imitating each of its movements as it goes along, that is, by imitating them before the model has finished the entire sequence.
We will need, therefore, a phase in which, without there yet being any muscular performance, the kinaesthetic interiority corresponding to the entire succession of movements would be detected. On this there is almost general agreement (cf. 8.3: the premotor area is activated during ‘observation for copying’ of motor patterns). But let us go on. This requirement, that is, the requirement to produce the entire latent plan before performing any steps, would constitute, according to the suggestion presented here, a true achievement. That plan, or latent imitation, would require a fictionalisation of the postures which came from each step. Nowadays we know for certain the following regarding real (that is, displayed) action: “Any body representation which is used for action must continuously track the positions of our body parts as we move” (Haggard & Wolpert [2005]). Why, then, would the same thing not occur in a motor sequence that occurs only latently? This is the point I am proposing. If we accept this, then motor step n can only be selected from the posture that comes from performing motor step n-1.9 But no motor step has been performed in an imitation that is only latent. It is this necessary fictionalisation of postures which the ‘second mental centre’, or simulatory centre, would carry out.

9. Note the similarity with the requirement necessary to understand the pointing gesture. The only way to understand the second moment of the pointing gesture is to conceive an interiority different from mine, that is, a second mental centre in my own mind. That radically different interiority would have his/her own accumulative sequential line, a line where the current moment is the heir of the previous moment (supra, 4.9). See also supra, Chapter 8, note 5, p. 120.

8.7.2 Can forward models refute the previous proposal?

At this moment, we must pay attention to the forward models found in ‘animal motor systems’. These models predict future sensory states. Can these models refute the previous proposal? Let us start by expanding somewhat on what we saw above, in 1.4.1. Modern theories of motor control incorporate the forward model, which would serve to circumvent unavoidable neural delays associated with on-line feedback control. “The question is whether the Central Nervous System makes use of forward models. Although finding applications in fields such as robotics, until recently the evidence for internal models in physiological motor systems is still indirect.” (Miall & Wolpert [1996, pp. 1265, 1276]). These authors listed the criteria for neurophysiologically identifying a forward sensory model. Dimitriou & Edin (2010) have carried out a detailed study based on those criteria, and concluded that muscle spindles can represent future kinematic states, i.e., can act as forward sensory models. That would then be a forward model actually found in a biological motor system. It is therefore worth paying attention to its characteristics.

Up to now, it was believed that muscle spindles only represented current muscle-state variables. However, Dimitriou & Edin (2010) have found, firstly, that spindle discharges predict future kinematic states (“their estimates are sufficiently advanced to match the minimum delay required for trajectory corrections during reaching movements”), and, secondly, that “the fusimotor drive to muscle spindles is not strictly coupled to the skeletomotor drive of their parent muscle”. The functional reason for this, according to the authors, lies in the fact that “the mechanical consequences of activating a limb muscle are not only a function of its own biomechanical state but also the state of synergists and antagonists as well as other limbs (for instance, even with the same efferent drive, the future mechanical state of an agonist muscle will depend on whether its antagonist is relaxed or contracting).” Lastly they indicate that their finding fits into what was expected: “there is a good reason that evolution would place forward sensory models within effectors: current sensory states can be directly incorporated into predictions rather than estimated by means of a probably less accurate forward dynamic model.”

As we can see, this forward model is located in the effectors themselves and it operates during real movement. So, it would be quite far from the requirement that we have proposed for the latent imitation of motor sequences. Let us go back to our proposal then: Fictionalisation of postures is required by latent sequential imitation.
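As a purely illustrative aid, the contrast between displayed action and latent imitation drawn in 8.7.1 and 8.7.2 can be sketched in a few lines of Python. The sketch is not a model of any neural mechanism and is not taken from the studies cited. The one-dimensional ‘posture’, the step-selection rule and all the function names (`select_step`, `displayed_action`, `latent_rehearsal`) are assumptions introduced only for the example. It shows the single point at issue: step n is always chosen from the posture left by step n-1, and in displayed action that posture is supplied by the body, whereas in a purely latent sequence it has to be supplied fictitiously.

```python
from typing import Callable, List

# Toy assumption: a posture is a single joint angle in degrees.
Posture = float


def select_step(posture: Posture, target: Posture) -> float:
    """Choose motor step n from the posture left by step n-1.

    Hypothetical rule: move towards the target, by at most 10 degrees.
    """
    remaining = target - posture
    return max(-10.0, min(10.0, remaining))


def displayed_action(
    start: Posture,
    targets: List[Posture],
    execute: Callable[[Posture, float], Posture],
) -> List[Posture]:
    """Real performance: after every step the posture is read off the body."""
    posture = start
    trace = []
    for target in targets:
        step = select_step(posture, target)
        posture = execute(posture, step)  # the body supplies the new posture
        trace.append(posture)
    return trace


def latent_rehearsal(start: Posture, targets: List[Posture]) -> List[Posture]:
    """Latent plan: nothing is performed, so each posture is fictional."""
    posture = start
    trace = []
    for target in targets:
        step = select_step(posture, target)
        posture = posture + step  # assumed outcome, never actually produced
        trace.append(posture)
    return trace


if __name__ == "__main__":
    plan = [5.0, 20.0, 15.0]
    # Hypothetical body that carries out only half of each commanded step.
    sluggish_body = lambda posture, step: posture + 0.5 * step
    print(displayed_action(0.0, plan, sluggish_body))
    print(latent_rehearsal(0.0, plan))
```

With a ‘body’ that carries out only half of each commanded movement, the two traces diverge: the displayed sequence tracks postures that really occurred, while the latent one tracks postures that were never produced. This is only a way of picturing the requirement stated in the text, under the assumptions listed.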


8.8 Motor learning and symbolic play: A convenient comparison

We have just seen the similarity between symbolic play and motor learning. However, alongside this similarity there are numerous differences. One difference that we need to highlight at this point is that symbolic play does not imply the performance of new motor patterns by the subject. We should remember here the type of symbolic play where the child pretends to eat (without an adult and without food, I stress). Even more specifically, we should address those cases where the child does this without at all involving cultural movements such as eating with a spoon. Using one’s hand to raise food to one’s mouth is clearly a spontaneous movement. The difference as regards true motor learning is here indisputable. Nevertheless, alongside that difference, the background similarity remains. Movements of symbolic play or pantomime can adopt a specific form only (I stress, only) because they take their inspiration from a model or premotor plan. Movements where food is really grabbed are always constrained by the real environment, namely, by the food’s position, size, shape and weight. There is a wide degree of freedom in the grasping movement, a freedom which concerns varied parameters, such as the angle of the joints or the force applied to the muscles. Consequently, in the absence of food, that movement can adopt no particular form, and cannot become real, unless a complete prior plan takes control. All pantomimes require the same thing. It makes no difference if the behaviour is more or less natural. Any pantomime, any form of symbolic play, would depend on a prior plan supplied by the simulatory centre. But at the very heart of this similarity on which we have just commented, we find another new difference between symbolic play and some (only some, as we shall see) of the forms of motor learning. Let us think of the imitative learning of a technical skill. When the learner later reproduces this muscularly she will do so on real objects and with real tools. 
It will not be possible, therefore, for this displayed reproduction to depend exclusively on the motor adaptation to the model; it will also have to include the phylogenetically old type of motor adaptation.10 In other words, motor adaptation to the model would fade significantly in the displayed reproduction of the technical skills learned. But then the problem re-emerges of what the advantages of symbolic play are (or, as we were saying in 8.1, the advantages of the pleasure prepared by evolution which drives the child to practise symbolic play). Why is pantomime or symbolic play useful? Certainly, symbolic play shares the prior observation phase with technical

10. We should remember that real food was also present in the adult-feeding game. Thus, in this respect, that game is similar to the learning of technical skills. However, it is almost certain that no complex motor model is extracted in the preparatory phase of this primitive game (only the general type of behaviour is copied, not the movements). Instead, in technical imitative learning the observation phase will almost always need a latent motor plan to be produced.


imitative learning. However, what is the reason for making the displayed pantomime phase, which is precisely the essence of symbolic play, follow on from the prior phase? At this point, we clearly have to address the symbolic ability involved in symbolic play. First of all, we should address an obvious point. If the muscular reproduction of the imitated motor pattern takes place in an appropriate environment, or, in other words, takes place with the appropriate objects present, then there will be no evocation of absent objects. Consequently, if it is indeed the case that symbolic play successfully evokes the absent past scene, then the difference between technical learning and symbolic play is just what we would expect. But we still have not broached the question of how the symbolic ability or the ability to evoke absent objects as absent would emerge. It is time to move on to this. The time has come to complete our description of what we previously called the ‘big extension’ of the simulatory centre (or, in other words, the new function that got to be performed by the simulatory centre).

8.9 How does symbolic play come to be symbolic?

8.9.1 From motor simulation to the evocation of absent objects: Completing the description of ‘the big extension’

The sequential kinaesthetic interpretation that has to occur during the prior phase of symbolic play would have succeeded in situating that phase in the mental simulatory centre. What happens then, when we move on to the muscular reproduction of the plan, i.e., during play, strictly speaking? As you will have seen above, I have proposed that, given the pantomime character of that phase, the simulatory centre would begin there (and this is a significant new development) to be involved in muscularly displayed movements. But that new development would bring another with it. It is precisely because the simulatory centre is now transferring the motor model into the muscular system that the fictionalisation of postures ceases to be necessary. Consequently, the ability to simulate fictitious content would be free to perform a new task, and would thus begin to simulate an environment different from the current real one.11

11. Cf. Anderson (2010, p. 245): “According to neural reuse (versus mere plasticity), circuits can continue to acquire new uses after an initial or original function is established, while retaining their functional role for older uses”.

Let us explore this possibility. During the prior observation of the scene, the child would be performing a simulation of the model in her simulatory centre. Later, during her symbolic play, she goes on to imitate the movements in a displayed or muscular fashion. This imitation, although it differs from prior simulation because it is displayed and delayed, would still be controlled by the plan provided by the simulatory centre. This sums up the previous subsections. Now, if we return to our point about displayed and delayed imitation, it should be noted that the child, since she is, in fact, really displaying the motor pattern, might at each step make use of the simple expectation of the following step. The simulatory centre would have been activated to provide the motor plan that will govern the pantomime, but with the real performance the simulatory centre would be free from the task of simulating postural and kinaesthetic content. This, I suggest, would be the key for the visual evocation of the model scene. Old cooperation with the postural-kinaesthetic area would be replaced by cooperation with the visual area.12 The simulatory resources that have been freed from the old obligation – the obligation to fictionalise at each step the posture resulting from the previous one – could now take charge of simulating the corresponding visual content. Real movement would be accompanied by the corresponding visual or, more generally, exteroceptive evocation. It is along these lines that I would prefer to look for the response to the question (a hypothetical question, we should remember) about the origins of evocation.

This link between imitation and evocation would be in accordance with ideas that have been known for many years. Numerous authors have formulated, in one way or another, the hypothesis that it would be possible to facilitate long-term retention by allowing infants to imitate event sequences immediately after their presentation. A recent empirical demonstration of this facilitation can be found in Lukowski et al. (2005). In my terminology, the displayed motor imitation of which these authors speak would have caused an evocation of the model scene. It is, therefore, no surprise to us that the retention of the model scene in long-term memory should improve, since it has been doubly experienced, i.e., it has been evoked and not just perceived.
(See also Roediger & Butler [in press], who propose other, more profound, explanations: “One idea is that retrieval of information from memory leads to elaboration of the memory trace and/or the creation of additional retrieval routes, which makes it more likely that the information will be successfully retrieved again in the future. Other idea is that memory performance is enhanced to the extent that the cognitive processes during learning match those required during retrieval.”) We might also cite Glenberg (1997). This author belongs to one of the fields that characterise second-generation cognitivism, namely, embodied cognition. But what specifically interests us at this point is the fundamental role he ascribes to motor memory within retrieval memory. This is in accordance with our link between motor imitation and evocation. Obviously this “big extension” relates to the “mimesis” of Donald (1991) or the “complex imitation” of Arbib (2005). What my hypothesis adds is the derivation of the ‘big extension’ from the second mental centre, that is, from the basic ability which also explains pointing or four-hand-actions (or even ‘self-conscious emotions’: see above 3.5). In addition, as we shall see in Section Four, this very ‘big

12. Cf. Anderson (2008, p. 240): “Differences in domain functions will be accounted for primarily by differences in the way brain areas cooperate with one another.”


extension’ will bring about, by means of protodeclaratives, other exclusively human capabilities. But there is a question that we can delay no longer. Is there a basic flaw in what we are proposing? Are we falling into a vicious circle by viewing symbolic play as an instrument the child uses to self-induce specific evocations?

8.9.2 Addressing a seemingly vicious circle: What is in the mind of the producer when he or she decides to bring about a certain specific evocation? How would the subject go about selecting the evocation she wishes to provoke in her own mind? It would appear that she would have to have first evoked the object in question, in order to then select it. However, if this is the case, it is not then possible to talk about the subject being able to self-induce an evocation. This criticism, I repeat, appears solid, but we must now ask if it really is. Before we address this question, let us clarify that the paradox may of course dissolve in the communicative use of symbols. The producer of the symbol may have the image already evoked in advance, and would be looking only to induce it in the recipient. Nevertheless, as far as symbolic play is concerned, I would say that it is clearly impossible to exclude the possibility of an individual performing symbolic play alone (we have already seen this in the opening paragraph of this chapter). Must we then give up the idea of seeing symbolic play as an adaptive practising of the ability to evoke?13 Piaget (1959), proposed that muscularly displayed reproduction of the motor patterns shaped by imitation would be the cause, and not the effect, of the evocation or image of the no-longer-perceptible model scene. He states (although without adding any further explanation as to why this causal relationship should occur) that this is how a truly ‘embryological’ derivation of the ability to evoke would be obtained. Despite, in my view, being right on this point, Piaget’s account does not, however, even consider the question of the possible paradox. In general, I would say that the paradoxical appearance of any symbol used for oneself has been highlighted less than might have been useful. This question may, perhaps, be of some interest even beyond the study of symbols and evocations. 
We might think, for example, of an ability for regulating one’s own attention and, in this way, also succeeding in regulating the course of one’s own thoughts

13. The argument put forward by Fodor (1978) in favour of the innatism of his ‘mentalese’ certainly shares common ground with this. Nevertheless, our presumed paradox does not relate to how concepts are learned, but to how specific instances of evocation are possible. Therefore, neither the impossibility of learning nor innatism would be the apparent conclusion. Here, as we have seen, what we would presumably conclude is the impossibility of the subject self-inducing evocations. Needless to say, I disagree with the two conclusions (this impossibility and Fodor’s innate mentalese).


or behaviour.14 It is clear that thinking about this question (a crucial question for anthropology, if ever there was one, it scarcely needs to be said) we meet the presumed paradox once more. But let us, finally, face our question. Does this vicious circle appear here or not? I think the concept of expectation might free us from the dreaded vicious circle. Expectation has been mentioned at length in the previous chapters. We turned to it in relation to mirror neurons, and also when we asked ourselves if goal-oriented behaviour could occur without the goal being evoked. In the second question, expectations (and, more specifically and originally, those which form Lorenz’s innate consummatory patterns) were of use to us precisely to dissolve another paradox. In that case, the question was how exactly animals can look for an object (for example a nest) when they have no nest image, as their behaviour clearly demonstrates. These two points are compatible, even though the opposite may appear to be the case. Expectation may be (and this is the key) at once defining yet empty, since it corresponds only to the subjective aspect of the goal sought for. This may now be of use to us again. Just as the expectation of the desired results generates and selects behaviour in the animal, the expectation of a specific stimulus may select and activate in human beings the symbolic resource that will enable the evocation of the image in question. In comparison with animal behaviour, the symbolic resources would certainly be completely new and sophisticated elements. However, what is relevant here is that empty expectation has been able to activate and select the corresponding evoking symbol prior to every self-induced evoked image.15 Compare the evocation to what we saw in the child’s feeding game. There, the child had succeeded in forging a visual expectation of what the adult will do when it receives the food. 
However, during that game the expectation is satisfied by a real perception. The child actually sees the adult open her mouth and chew. By contrast, the expectation to which we are referring in this subsection would be satisfied by an evocation, and not by a perception. In other words, rather than provoking and orienting a behaviour which would develop in the environment, this new expectation would provoke and select the symbolic behaviour.

14. See above, Chapter 4, note 9, p. 75.

15. If this is the case, then brain memory would have ‘secondary intentionality’, similar to that of a written text. It is necessary to act on what the memory provides if we want to obtain an evocation memory or retrieval. This comes into contact with the so-called ‘external mind’ as per Andy Clark (e.g., 2001). The idea that the prostheses (whether written or of any other kind of external character) may be considered to be mental is very often rejected. This rejection, or, at least, different posture, which we see in Searle (1983), but not only in him, occurs precisely because ‘mental intentionality’ is considered to be primary, and not secondary as in written texts. (In Searle, ‘intentionality’, unlike my expectation, is always representational and semantic: see Hutto’s [in press] criticisms of the excesses of representationalism.) However, if, as we have just suggested, the process of self-producing a memory involves the performance of a symbolic behaviour, this rejection would have no raison d’être. (Cf. Vosgerau [2010])


8.10 From the basic ability to symbolic play: A proposal on ontogenesis and a question about evolutionary-historical origins

Let us return to the suggestion. Evoking absent objects, which we might say is the culmination of ‘the big extension’ of the basic ability, would involve a functional restructuring of the simulatory centre. As with all transformations or restructurings, here too we can emphasise two opposing directions. In other words, we can comment on the closeness and also the difference between the kinaesthetic simulations of the prior phase of symbolic play (the observation phase, in other words), and the evocation of images that is caused by symbolic play itself.

Let us start with the difference. It is clear that any interpersonal feature has already disappeared when absent objects are evoked as absent. This is not now a question of the model lacking any interaction with the child. That interaction was already absent in the prior phase. What is happening now, that is, in symbolic play itself, is that the model is simply not present at all. It is in precisely these conditions that evocation must occur, by definition. Given this difference, we might say that evocation has moved completely away from the prior phase of symbolic play.

However, as I said, we can also stress the closeness. The prior simulation phase has handled motor and postural content. Those movements and postures, although fictional, are not the simple empty profiles which operate in a posteriori expectation (in all expectation, in reality). They are full contents, however different they may be from real ones. Therefore when we talk about them we must talk of evocation of kinaesthetic-postural contents. In this way, the jump to the evocation of visual or, more generally, exteroceptive contents begins to look like a logical and plausible event.

We can trace a gradual trajectory toward symbolic play. Let the basic human ability be the starting point. 
The second step would then be the feeding game. Next would have to come the prior phase of symbolic play, where the model, although not actually interacting, is still present. Finally, we would come to the complete absence of interpersonality seen in the evocation of absent objects. The reader will be wondering if, beyond my focusing on the different games, the proposed ‘big extension of the simulatory centre’ might perhaps allude to something more than an ontogenetic trajectory. The only thing I can do is list what the possible responses would be. The first: the trajectory would correspond to evolution through different species that are now extinct. Or, more concretely, different biological species were involved, with one of them reaching the basic ability, and another, ‘the big extension’. The second: both the basic ability and the ability for the ‘big extension’ would be not only exclusive to our species but also present in it from the beginning, such that all progress in our species would be merely cultural and not biological. The third: this response, which we might call intermediate, and overlapping substantially with the second, would allow some universal and, however, intraspecific evolutionary retouching for our species. (Here I am only dealing with the evolutionary pathway of the


hypothesised basic ability. Needless to say, many other anatomical, physiological and behavioural changes could be likewise dealt with.) As regards the second, today we might say that it is completely out of the question. (Let us leave the famous FOXP2 gene aside.16) The remarks we have been making above about the pleasure of the symbolic game (and also the motor co-ordination game) already took for granted some evolutionary retouching after the basic ability appeared. Those types of pleasure (we said in 5.2.1) would have been consolidated in evolution because with them children practice and thus reinforce human abilities. Consequently, the selection of this pleasure had to come after the beginning of such abilities. We certainly do not know to what extent the pleasure of each of those games was specifically constructed for that game or if, on the contrary, evolution simply mobilised a prior generic pleasure. However, even if it were only the new application of an old pleasure, some universal evolutionary retouching would have to be involved after the basic human ability. We have already seen in 3.5 what lies in wait for us today if we wish to decide between the first and third. If it were discovered that Neanderthals lacked the ‘white in their eye’, then the first would be considerably weakened (although the possibility that the basic ability emerged in a species which was a direct ancestor of Sapiens Sapiens would remain). If the opposite were discovered, then the third would have to be ruled out completely. I acknowledge that my indecision in response to these alternatives limits the space for my hypothesis to be refuted, or, to use the Popperian term, makes it less falsifiable. However, what can I do if I have no answer? My hypothesis states that the basic human ability, namely, the second mental centre, is crucial for exclusively human characteristics, and also states that chimpanzees do not possess this ability. However, where did the development occur? 
Were the various steps spread through different biological species, with one of them reaching the basic ability, and another, ‘the big extension’? Or, in contrast, was a mixture of historic derivation and intraspecific biological retouching sufficient for the entire derivation from the basic ability to be sustained? My framework of suggestions says nothing about this.

16. It was said that the FOXP2 gene, alterations to which give rise to a specific type of linguistic deficit, would have emerged in its specifically human version around 100,000 years ago, in other words, in organisms which had already been ‘anatomically modern’ humans for a long time. (In that scenario, of course, the gene in question would not bring about a change of species. At least, not a literal change of species. However, we should remember that the deficit appears with even one defective allele. And we should also note that this deficit would be very harmful to the success of the individual. Consequently, once the mutation occurred, the ‘old-style’ individuals would be extinguished within a few generations. The mutation would therefore have drawn a dividing line, irrespective of how we might describe that line.) However, according to later news, it appears that the human version of FOXP2 has been discovered in the genome extracted from some Neanderthal bones.


In summary, the task we set ourselves of finding the connection between the basic ability and the ability to self-induce evocations has been a long one. We have seen that several modifications to the starting point, namely the basic ability or simulatory centre, would be required. Firstly, we have had to add one feature (namely, the kinaesthetic interpretation of entire not-own motor sequences) and remove another (namely, personal interaction). Luckily (luckily for our framework of suggestions), there seems to be some evidence, however weak and indirect, that these necessary modifications would actually occur in children’s development. We have seen how the adult-feeding game constitutes one of those pieces of evidence. Secondly, we have suggested how the ability to evoke absent scenes could have come from the simulatory centre. As regards this, we must move on to focus on another source of evidence, namely, the similarities and differences between the movements of symbolic play and the articulatory-phonetic patterns learned by imitation.

chapter 9

From symbolic play to linguistic symbol

This chapter explores the relationship between symbolic play and an important component of language. Our aim will be to show that the characteristics found in symbolic play, namely, both the symbolic ability and the imitation of complex motor patterns, would have a slightly more sophisticated analogue. We shall call this analogue ‘linguistic symbol’ (not ‘word’, because we would not fully have reached words yet). The ultimate goal we are pursuing in this chapter is, of course, to show that the linguistic symbol would also be connected to the basic human capacity via the intermediary of the ‘big extension’. But this comparison between symbolic play and linguistic symbol must proceed slowly.

9.1

Adaptation to the model, a feature shared by the movements of symbolic play and the articulatory-phonetic movements of language

9.1.1 Evidence that encourages us to search for a similarity

In addressing symbolic play, we have stressed how the movements would have ceased adapting to the environment and would have adapted to a model instead. These movements would constitute a pause in the behaviour we humans usually share with higher animals. Contrary to what occurs in that usual behaviour, movements in symbolic play or in pantomimes would be controlled by a rigid plan hatched by the child in imitation of the agent in the model scene. It might be useful to bring in a piece of information on this point regarding sufferers of certain types of aphasia, who, in addition to their linguistic impairment, are unable to carry out the instruction to perform specific movements in vacuo. For example, while the sufferer is perfectly able to drink from a glass or clean her teeth, she is unable to perform those same movements without a glass or toothbrush (see Kempler [1993] for an overview of this subject). This suggests the possibility of a link between linguistic symbols and the pantomime ability.1

1. Let us pay attention to Pettenati et al. (in press), who have explored the form of representational gestures produced by children between two and three years of age asked to label pictures in words. Children produced words and also representational gestures. “These gestures were analyzed. The percentage of action gestures was much higher than the percentage of


Note that in no way am I suggesting a manual-gestural origin of language.2 The only thing I want to stress is the idea that the power to evoke, either by linguistic or non-linguistic means, would always be linked to the ‘adaptation to the motor sequence of a model’, regardless of the parts of the body involved in the motor pattern performance. Remember that children’s symbolic games are by no means restricted to hand or mouth muscles. That is why the fact (which is inevitably a serious concern for the thesis of the manual origin of language) that “current evidence supports completely independent limb praxis and speech/language systems” (Barrett et al. [2005]) is completely irrelevant for my current proposal. It is known that the left hemisphere is responsible for these in vacuo activities, or manual pantomimes. Frey, Funnell, Gerry & Gazzaniga (2005), who have researched this question in split-brain patients (i.e. patients who lack the corpus callosum connecting the hemispheres of the brain), put forward an even more significant piece of information: the left brain predominates for manual pantomime even in left-handed people. In addition, it has been proven that “this effect is not attributable to differences at the conceptual level, as the left and right hemispheres are equally and highly competent at associating tools with observed pantomimes”. We might conclude, therefore, that the fact that a movement is part of a pantomime is sufficient to situate it in the hemisphere responsible for language (or, to be more precise, for language except in regard to intonation: Peretz & Hyde [2003] confirm this classic distinction). It is true that there are brain lesions that preserve pantomime yet cause aphasia of sign language (Corina et al. [1992], Marshall et al. [2004]). But that does not affect the previous statement. Sign language is much more than the mere linguistic symbol. Sign language has syntax and, consequently, true words. 
At this point, we are obliged to formulate a new question. Can we relate language to the movements that are under the control of the model? I suggest that this relationship would be able to explain one undoubted ingredient of language (the ‘linguistic symbol’), even though it may by no means explain language in its entirety. As we stated earlier, we will relate the two aspects of symbolic play, that is, the motor and the symbolic aspects, to that ingredient of language. Initially, however, it will be useful to look at each aspect separately.

size-shape” (my emphasis). In conclusion, pantomime and symbolic play moulded these representational gestures.

2. Arbib (2005) or Corballis (2003) (or also Gentilucci & Corballis [2006]) suggest that language emerged in direct relation to manual gestural communication. I do not agree with this opinion (in due course I will be more precise, when dealing with intonation). My emphasis on the hand (in Chapter 1) was not intended to support the manual origin of language, but to underscore self-visible movements and the possible consequence of these movements, namely, the ‘perception of the homology between one’s own body and someone else’s body’ and the ‘second mode of processing the gaze of others’.


9.1.2 The similarity in the respective motor aspects

Let us focus first on the motor aspect. What is the main similarity between language and the movements involved in symbolic play? Articulatory-phonetic patterns are undoubtedly motor patterns. (However, little attention has been devoted to exploring the relationship between movement and language. As Iverson (2010, p. 230) says, “One reason for the lack of attention to the relationship between motor development and language development may be the general neglect within psychology of ‘movement’ as a topic of study”. But see Sheets-Johnstone [1999].) The range of movements that can be chosen and performed is by no means restricted to visible movements of the whole body or the limbs. On the contrary, vocal movements may well be the epitome of motor skills in human beings. We know for certain nowadays that any controlled production of a given phonetic pattern requires extraordinarily precise handling of both the particular small muscles to be triggered and the exact moment of that triggering. Only hand and finger motor sequences that have reached a high skill level (such as those of expert typists or musicians) require motor articulation abilities at a level similar to that required by speech. We need more than the qualifier ‘motor patterns’, however. The articulatory-phonetic patterns of language belong to that special category of complex motor patterns which (at some point) are new to the subject and which must, therefore, be learned by imitation. Language motor patterns, there is no doubt at all, have to be learned. Consequently, everything we have hypothesised about the ‘motor adaptation to a model’ can be applied to motor patterns acting as linguistic signifiers.

9.1.3 Back to Saussurean parity once more

This view of linguistic signifiers allows me to expand and extend the treatment given to Saussurean parity in Chapter 6. There we explained Saussurean parity as follows: if a receiver is going to kinaesthetically interpret communicative signals directed at him, then (according to the hypothesis) his kinaesthetic interpretation must inevitably be channelled through the simulatory centre. This mechanism would account for the Saussurean parity both of a linguistic request and also of a pointing gesture. On the one hand, the recipient would receive those signals in production-format, and, consequently, would identify them as being identical to the requesting or pointing signals that he would also be able to produce. On the other hand, the recipient, being aware that such signals are being directed at him, would simultaneously perceive them as radically somebody else’s. What can we add here? Saussurean parity can be achieved even when the perceived motor patterns of others are not being directed at oneself. From 8.5 and, more precisely, from 8.7 on, we have been stressing the idea that sequential kinaesthetic interpretation


cannot be made on the basis of the old resource of expectations. In other words, if the kinaesthetic interpretation is of a complex pattern, then the simulatory centre will have to be involved, even if the patterns are not being addressed to the subject. How are we to understand these two suggested ways of enabling Saussurean parity? My view is that we are in no way obliged to choose between the two requirements. It seems more reasonable to believe that they occur at different moments in the ontogenesis. Saussurean parity would be achieved first for pointing gestures. At that stage, it would coincide with what we have called the basic human ability. This would be the period when only the first enabling requirement is active. Only later, once the child has come through the babbling stage (more precisely, that stage of babblings such as dad, mum or nana, which adults have incorporated into language), would she be able to perform a true imitation of complex articulatory-phonetic patterns. Furthermore, learning of the first articulatory-phonetic patterns ‘adapted to a model’ is consolidated especially in protodeclarative holophrases and in the reception of adult messages similar to these holophrases, all of them productions where the articulatory-phonetic pattern is accompanied by finger- or eye-pointing gestures. As a result, we might thus think that the second route towards Saussurean parity would initially need to be prompted and accompanied by the first route: Indeed, young children follow pointing over words in interpreting acts of reference (Grassmann & Tomasello [2009]). We have focused up to now only on motor similarity between ludic pantomime and the articulatory-phonetic pattern imitated. We need, therefore, to look at the differences between these two ‘motor adaptations to the model’. 
Or, in more general terms, we will address the contrast between the three types of motor imitation: technical imitative learning, symbolic play, and learning of complex articulatory-phonetic patterns.

9.2

The special character of articulatory-phonetic imitations: Arbitrary and with no adaptation to the environment

9.2.1 Articulatory-phonetic patterns and imitation of movements: The social or phonemic model

The relationship between the articulatory-phonetic movements of language and meaning is arbitrary. Articulatory patterns are given form only by the social code of each group. Linguistic signifiers would represent, therefore, the example par excellence of the new kind of motor adaptation; in these, ‘motor adaptation to a model’ is necessarily the only adaptation in operation. Compare this with the performance of technical actions. In this case, the adaptation to the real environment can never be completely ignored, no matter how much these actions have been copied from a model and depend on cultural learning. In other words, three elements are inevitably present when copying technical actions:


emulation of results, adaptation to the environment, and strictly motor imitation or, in other words, adaptation to a model. Gattis et al. (1998) are relevant here: accuracy in motor reproduction increases when there is no object, or, stated in the reverse, the temptation to turn to mere emulation grows stronger when the movement is acting on objects. See also Armstrong (2003), who points out that the preference for the right hand appears earlier in sign language production than in object handling. This would appear to indicate that movements which are adapted only to the model are controlled by the left hemisphere before movements with hybrid adaptation. (We must also compare the imitation of articulatory-phonetic patterns with the symbolic play of riding on a broomstick, for example. But we will address this comparison in 9.2.3, below.3) In contrast, imitation of the movements themselves reigns supreme in articulatory-phonetic patterns. It is the model’s motor patterns themselves which are copied. Clearly, what is being copied here is not a subject on a particular occasion; in spite of this, however, there is still a model (a social model, as we might call it) being meticulously followed. Let us examine this issue in more detail. Every occurrence of a given articulatory-phonetic pattern is part of the social model. Since each occurrence will differ considerably from the others in respect of voice pitch, sound volume and intonation, all these aspects will have to disappear in the true model in operation here. Similarly, the ‘irrelevant variations’ within what a given language views as a single phoneme must also disappear. We must distinguish between the ‘social model’ or ‘group treasure’, on one hand, and specific occurrences, on the other. The ‘phonemic’* and ‘phonetic’* dichotomy was conceived a long time ago by linguistics precisely for this purpose. Only the phonemic level is the object of learning here. 
Only the phonemic level acts as the model that learners really need to imitate. It will be useful here to refer to Byrne & Russon (1998), who try to reduce articulatory-phonetic imitation to what they call ‘program-level imitation’ (see above, p. 38).4 They argue that there is no true motor imitation here – or, as they say, no true ‘motor

3. Perhaps I should already mention a difference. Perhaps I should say that in symbolic play the child must somehow adjust her movements to the characteristics of the broomstick. As we have already seen above, symbolic play, although at times performed in vacuo, is on other occasions performed on unsuitable objects. But this is of little help for my purpose. Not only is this difference restricted to symbolic play involving an object, but even there it would be very weak, since on those occasions the ‘emphatic transformation of the motor pattern’ (which we spoke about in 8.7, above) is accentuated, and as a result the apparent difference (the apparent greater ‘adaptation to the environment’ of these occasions compared to in vacuo games) is cancelled out.

4. See also Byrne (2009, p. 114): “Program-level imitation is nicely illustrated when a child copies a word she has not heard before. The child’s sound pattern is typically quite different to that of the adult model, with much higher-pitched vowels and often systematic simplification of consonant clusters. This shows that what is copied is the program-level gist of the word, a new way to assemble the motor programs for producing vowels and consonants – which are already in the child’s repertoire.”


mimic’ – because neither the volume, nor the intonation, nor any other phonetic aspect is copied. Of course, it is perfectly true that none of these aspects is copied. However, it does not follow from this that motor imitation of a model is not present. Those aspects do not – and absolutely should not – belong to the linguistic model.5 See Singh (2008, p. 833): “Results point to positive consequences of affective variation, both in creating generalizable memory representations for words, but also in establishing phonologically precise memories for words”. See also Richtsmeier et al. (2009, p. 376): “Talker variability may be just one type of variability that facilitates learning, establishing a robust representation of a word”. This social, linguistic model, I insist, is copied by means of true ‘motor imitation’, or, as Byrne & Russon (1998) say, ‘motor mimic’. (On the successive definitions of ‘true imitation’, see Byrne [2005].) The imitation of movements would, therefore, be absolutely in command in articulatory-phonetic patterns. The link between language and this kind of imitation would be a very close one. More concretely, in articulatory-phonetic patterns, this kind of imitation has no competitors, since there is neither adaptation to the environment nor emulation of results.

9.2.2 The fuller the control exercised by the ‘motor adaptation to a model’, the more stable the model

We have just seen that ‘motor adaptation to the model’ exercises control with neither interference nor competitors in articulatory-phonetic patterns. At this stage, we wish to suggest that this is the situation that had to occur, the one that was most useful. Useful for what? So that a model could remain stable throughout successive transmissions. First, I wish to emphasise that it is not possible for the real environment ever to exert any effect on the motor pattern in articulatory-phonetic signifiers, those very special motor patterns. Unlike other motor patterns, animal cries were never modelled by a given environment or object. Among animals, cries express emotions and obey only those emotions. Such emotional modulation is carried out by much more primitive mechanisms than those which control the astonishing ‘motor adaptation to the environment’ in higher animals. Later on, motor adaptation did indeed give rise to vocal productions, with humans and articulatory-phonetic patterns; this characteristic of cries survived these changes, however. The ‘motor adaptation to a model’ peculiar to articulatory-phonetic patterns did not have to combine with any motor adaptation to the environment or to real objects. Let me underscore my key point. It is not only that the old kind of motor adaptation is absent in the imitator of articulatory-phonetic patterns, but also that it would not occur even in the model. For comparative purposes, let us look at the models

5. As almost every textbook says, when faced with non-speech sounds, we do not know what is to be imitated. By contrast, when faced with speech sounds, we know that, as Liberman & Whalen (2000, p. 190) put it, “/ba/ counts, but a sniff does not”.


imitated in pantomime or in symbolic play. Those models did have to adapt to real objects and circumstances, to the glass of water, to the telephone, or to the horse. That is why the copies in vacuo (or also the copies made on an inappropriate object) of those models are playful movements or pantomime. By contrast, it would be impossible to make a pantomime out of an articulatory-phonetic type of motor pattern.6 The model itself of these patterns was already as completely alien to the environment as any pantomime can be. It is for this reason that articulatory-phonetic patterns are ideally suited to guaranteeing the fixed character and stability of the model. Let us examine this. Imagine that pantomimes had come to act as a model for learning a code. In this case, the risk of the model being altered would have been higher than the risk involved in articulatory-phonetic models. At the end of the day, no matter how much the pantomime had taken on the role of model, the real actions underlying the pantomime would continue to be performed for real, and to be observable by learners. As a result, adaptations to the real environment could filter into the learning and prevent the learned model from assuming a stable, crystallised shape. Nothing like this occurs, I repeat, when learning articulatory-phonetic patterns. (Or, more in general, when learning arbitrary signs. At least in present sign languages, signers are typically unaware of any iconic component in their signs.) In these, the model, perfectly stabilised and crystallised, would assume absolute control over the movements. Why should the stabilised shape of the model be so important? The two requirements for the development of a culture are preservation and change. Indeed, human cultures have been characterised throughout history more by their ability to be transmitted than by their speed of change. 
However, the stabilised shape to which I am referring would have to do with successive interpersonal communication, and not with the freezing of the model in history, which would, in the end, be impossible. In short, what matters is the creation of the ‘social code’ (see above, 9.2.1).7

6. Note that the articulatory-phonetic pattern and the (communicatively shaped, as we saw in 4.4) pointing gesture coincide in this same impossibility. This is no surprise, because both behaviours also coincide in that their form has been moulded from the beginning (that is, in the model itself) with the sole purpose of being perceived by an addressee. Here we are tempted to connect with Cochet & Vauclair (2010) (“The right-sided bias for pointing gestures, and specially for declarative pointing, was reported to be stronger than for manipulative actions”) and to conclude that a connection exists between the human left hemisphere and the imitation of that type of pattern (i.e., ‘patterns whose model itself is already completely alien to the environment’). But we must remember Frey, Funnell, Gerry & Gazzaniga (2005) about pantomimes (see supra 9.1.1). Thus, if we want to define the requirement for the human left hemisphere to have control, we will have to say that reproduction by the subject (not necessarily the production by the model) is what has to be performed with adaptation exclusive to the model.

7. One prevalent feature of natural languages is that they avoid unpredictable variation. This feature has recently been studied in an experimental way. Certainly, such studies have mainly


Before we move on to another point, it will be useful to stress the feature we had detected which the performance of symbolic play and the reproduction of learned articulatory-phonetic patterns would have in common. In both cases, the motor adaptation to the model is absolute, not only in the observation stage, but also during the displayed reproduction. Therefore – according to what we suggested in the preceding chapter – even the muscular and displayed reproduction operates through the simulatory centre in both cases too.

9.3

Articulatory-phonetic pattern and evocation: The analogy with symbolic play

In the preceding subsection, the aspect of the articulatory-phonetic pattern that interested us was the motor pattern itself and the kind of adaptation it involves. But it is now time to address a further possible aspect of the relationship between symbolic play and articulatory-phonetic patterns; that is, to compare the symbolic aspect in play and in language. As far as symbolic play is concerned, we differentiated between, on one hand, the preliminary phase involving observation and sequential latent imitation and, on the other hand, the later phase involving the muscularly displayed game. According to my hypothesis, the evocation of successive movements and postures would have been required in latent imitation, but in muscular repetition, by contrast, that kind of evocation would no longer be necessary, thus freeing up the resources previously mobilised for that purpose. These resources could then be used (as we suggested in the preceding chapter) for a new task, namely, the exteroceptive evocation – the visual evocation, for example – of a past scene. What happens if we try to apply this argument to linguistic symbols? On the one hand, we will find it straightforward to transfer almost all the elements examined in symbolic play. On the other hand, we will come up against a great obstacle. Is our project of viewing linguistic evocation through the lens of symbolic play doomed to failure? Responding to this question will be our purpose here. But we should stop talking about what we plan to do and actually get down to the task. Let us begin, specifically, not with this great obstacle, but with what we have referred to as the easiest part of the transfer. What relationships might we find between these two types of evocation? Here we are now leaving behind the similarities between the motor aspect in symbolic play and

focused on the trend to regularisation of grammatical resources. (See, amongst other possible examples, Smith & Wonnacott [2010].) 
However, it is obvious that unpredictable variation is also avoided in articulatory-phonetic patterns. According to my suggestion, this preventive role would be performed by this peculiar feature which appears in the articulatory-phonetic model (and not in a pantomime which works as a model).

Chapter 9.â•‡ From symbolic play to linguistic symbol 

the motor aspect in linguistic symbols that we established at the beginning of this chapter. We must now examine what kind of content is evoked in each case. The content evoked in symbolic play would be the model previously observed in the preliminary phase. The motor pattern performed by the child when she rides on the broomstick was drawn from a real horse and rider she had seen during the preliminary phase. It is this now absent scene that she evokes in her game. Consequently, the visual image generated in symbolic play would also have to include the riding movements that had already been seen. In short, the visual aspect of the movement (i.e., the aspectfor-the-spectator) and the kinaesthetic aspect correspond, respectively, to the evocation achieved and to the instrument by which it was achieved. This is the case of symbolic play. We might formulate this idea by saying that in symbolic play the evoking symbol or instrument is analogic and non-arbitrary as regards the evoked content. Linguistic symbols, by contrast, are arbitrary. The difference is indeed obvious. However, in my opinion, the distance between these two evoking processes is relatively small in this respect and does not prevent us establishing the anticipated similarity. Let us examine this more closely. Let us assume the parallel was an absolutely strict one. In this case, the evocation achieved by the imitator through the displayed reproduction of an articulatory-phonetic pattern would have to be rooted in the external aspect (i.e. the aspect-for-thespectator in the broad sense) of the movements involved. But such content cannot, by definition, be evoked. The key point is obviously this: as far as vocal movements are concerned, the external aspect is accessible not only for the spectator, but also for the producer. The person reproducing an articulatory-phonetic pattern in a displayed fashion hears himself. 
Incidentally, the language of deaf people is based on another modality of ‘self-perceptible’ movements: one’s own hands are visible just as one’s own voice is audible. Consequently, the evoked content can no longer be centred on the external aspect of the movements. Then, since living beings always tend to find new uses for any abilities already acquired, another route would emerge where the evoking or symbolic ability could continue to be performed. It will be useful at this point to remember inner or inhibited speech. In inner speech, that possibility (that the evoked content focuses on the external aspect of the movements) is no longer automatically ruled out. Voices might be evoked. This evocation may give rise to serious pathological cases when attention is not paid to the sequential order of the process, that is, when it is not clear whether the voices were present at the beginning or at the end of the simulatory centre’s activities (There has been increasingly clear evidence since the late 80s that the core symptoms of schizophrenia are mostly caused by the inability of patients to determine which actions are self-generated and which not. Thus, “the attenuation of self-produced touch is not shown in individuals with schizophrenia who have hallucinations”: Blagrove et al. [2006, p. 285]). In other cases, it would have the disadvantage only of slowing and weakening the processing of meanings – e.g., I suspect that the evocation of the auditory aspects of the premises may be detrimental in complex reasoning.


Leaving aside inner speech, let me now comment on a more relevant point. Self-perceptiveness is, in my view, an important issue. Both oral language and deaf people’s language, or even – not forgetting the remaining possibility – the language used by Helen Keller, are all equally based on self-perceptible movements. (The producer can hear his own voice, see his own hands or feel the finger spelling which he is performing on his own skin. These perceptions are essential for learning and only after language acquisition will the producer be able to dispense with them.) In my opinion, there are two reasons for this. On the one hand, self-perceptible movements make imitation easier. On the other hand, as we have just said, by ruling out the evocation of the external aspect of the producer’s movement, this ‘self-perceptiveness’ would establish a great difference between the linguistic symbol and symbolic play; this would allow the evocation of contents arbitrarily related to their motor signifiers to begin.8 The kind of content that would be used for the new kind of symbol was found very near to the former content. The scene that will be evoked by the imitator during the displayed reproduction of the articulatory-phonetic pattern is the one that acted as the background for the model being observed. There is no longer such an obviously close correspondence between the evoked content and the evoking instrument as there was between the external and internal aspects of a motor pattern. However, in the absence of such an inherent correspondence, a spatio-temporal contiguity would be involved in the linguistic symbol. The perceptions contiguous to the imitator’s perception of the model are the perceptions which provide the content for the evocation which the imitator will self-induce by means of displayed reproduction. Let us think for a moment about how children learn their first words. 
The meaning of these first words (more precisely, pre-words or linguistic symbols: see above, 7.3) is always an element which is present in the learning scene. What the child learns initially is certainly not adult ‘displaced speech’ (i.e. speech referring to things which are absent). In addition, that element is not only perceptible, but will likewise have been determined by communicative interaction based on pointing gestures with the eyes or fingers. What interests us here is that when the child is a little older, this referential correlate (even if it is absent by then) would be evoked whenever the learned motor pattern is repeated.

Here we must make space for the phenomenon we saw above in the motor aspect of linguistic symbols. The true motor model for linguistic signifiers is, we said, the social model or phonemic abstraction. The repeated specific occurrences of the articulatory-phonetic pattern each contribute to the generation of the model. For that reason the individual peculiarities of each occurrence disappear in the model. If we apply this same process of abstraction to the symbolic aspect, then the suggested analogy between playful evocation and linguistic evocation becomes less striking. It is true that the successive occurrences in which the child succeeds in learning any one of his earliest words, or pre-words, undoubtedly take place in different scenes. However, each of the occurrences supporting that initial learning undoubtedly shares some element or feature. It is this element or feature that will be evoked at a later age when the learned motor pattern is repeated. Consequently, the abstraction from a particular dog to all dogs is automatically given by the social character of the linguistic symbol ‘dog’ and its being used on different occasions. This variable use would cause the meaning of the linguistic symbol eventually to become similar to the abstract category peculiar to a prelinguistic or animal perception. Certainly, in my view, the meaning of symbols and animal perceptual categories are indeed very different. (Only when the category is attached to the symbol can the former be independently attended to. In other words, in perception the category is by no means an element which can be thought of as isolated from the other elements surrounding it: I will return to this issue in later chapters). However, the fact that the category has been working in perception since earlier times (since much earlier times, both ontogenetically and phylogenetically) makes the transition of the symbol, from the particular referent to the abstract category, easier.

This analogy between these two kinds of symbols (i.e., symbolic play and linguistic symbol) is little more than vague and imprecise. However, things would seem to have gone quite well thus far if we remember that we are not talking about genuine words but about what we have been calling linguistic symbols. Up to this point, my attempt to establish an analogy has found no serious objection and remains intact. But we should now address the difference that establishes a genuine distance between symbolic play and linguistic symbol.

8. See Stamatopoulou, in press, about scribbles: “Scribbling, unlike pantomime tasks, leaves perceivable consequences that gradually act as a stimulus to the child’s own scribbling actions”.

9.4 Is the great difference between playful symbol and linguistic symbol a genuine obstacle to the connection between them?

9.4.1 What does this difference consist of?

Let us examine the major obstacle to my projected analogy. Evocation would have to be achieved by the receiver of the linguistic symbol, but the receiver does not have to display the motor pattern. The receiver would be receiving in production-format, or, in other words, would be imitating the received speech, that is true; I accept this, as I said in 6.3. However, this imitation will normally be latent, not displayed. By contrast, the very essence of symbolic play is displayed action. The child really gallops; he does not stand still when he wants to evoke the horse that previously caught his eye. But let us focus on what really interests us: According to the hypothesis, this muscularly displayed action allows simulatory resources to be used for the exteroceptive evocation. (Here, we must remember the fragility of children’s recognition of iconic symbolic movements prior to the age of 2 years: cf. Namy [2008].) How then can the process by which a silent and motionless receiver evokes meanings be similar to this?


The difference becomes more prominent when we consider that in language the receiver is precisely the person who would have to be most involved in the task of evocation. Certainly it is not my view that this task is incompatible with the speaker. As we have seen in 8.9.2, I do not consider the idea of a subject inducing an evocation for herself to be contradictory in the slightest. However, accepting such self-induction of evocations does not make our problem easier to solve. Quite the contrary. In fact, it becomes more complicated, since we must then make room for the evocation that can be self-induced by means of inner speech. As a result, there would be not only one but two kinds of evocation without muscular display. What then? Should we abandon the project? I have no intention of giving up, but will instead try to justify the difference and build bridges between these two modes (displayed and latent) of the evocation process. Firstly, I will suggest that evocation is made easier for the recipient by repeating the message heard.

9.4.2 Children and the comprehension of displaced speech: The possible role of echolalic repetitions

The strategy guiding this first argument is obvious. If I were right in envisaging a similarity between symbolic play and the articulatory-phonetic symbol, then we would have to conclude that displayed repetition could make it easier for ‘displaced speech’ to be understood. We should, therefore, ask ourselves when, under what circumstances or at what age so-called echolalic repetitions (repetitions of the message by the receiver) usually occur. Let us spell out this hypothetical deduction more specifically. In order to more easily evoke an absent object corresponding to a received linguistic message, the child would display the repetition of the word in question. When we adults receive a ‘displaced’ message, we can evoke a visual image without resorting to displayed motor reproduction of the word; this is beyond question. This would, initially, be no easy task for children, however. Their displayed repetition of the heard word would be almost analogous to symbolic play, much in the same way as any displayed production of linguistic symbols is, as I have already said. This is one possibility that would be deduced from the hypothesis. But do things really happen this way? I think it would be important to systematically research children’s reactions when they initially receive displaced-speech messages. We should investigate, in particular, whether echolalic repetition tends to be more frequent with displaced messages than with non-displaced messages. In this respect, my impression as a mother is that children initially display in their repetition at least the key element of the displaced-speech message that they hear. However, these impressions would need to be backed by real experimental data. Likewise, the late occurrence of the ability for silent reading could also be related to this. 
This kind of reading appears at a late stage not only in children, but also in history (in this respect, the long-established tradition is to cite a paragraph from the Confessions of Saint Augustine). Of course, there is an additional factor involved in reading that is absent from displaced-speech comprehension. Undoubtedly, human adults do not need to repeat the displaced speech they hear (it is not essential, at least), not even in societies without writing systems, not to mention societies with only the old kind of reading. This contrast between oral reception and reading is due to another difficulty (mainly, the lack of intonation) being added to the difficulties already caused by ‘displacement’ in the reading of real communicative texts. However, according to this suggestion, the difficulty involved in silent reading could, to some extent, be related to the role of the muscular display in evocations. Furthermore, moving beyond children, echolalia is a very common symptom in many different kinds of brain disorders. In addition, echolalic repetitions tend to be more frequent as the patient’s interest or motivation in understanding a particular message increases. All this suggests that echolalic repetition may facilitate the comprehension process.

9.4.3 Latent, displayed, latent: Is my hypothesis on simulation overly complex?

In the previous subsection, we put forward a very poor argument that was both incomplete and weak. In addition to this, and to make matters worse, we must also acknowledge that we have not described our great obstacle as we should have. We shall now properly describe this problem, and see the shape our general hypothesis on simulation has taken. What have we been saying about simulation thus far? In the first chapters of this book, I rejected the idea that simulation requires inhibition. Simulation was then ascribed to the second mental centre and, consequently, dissociated from the subject’s behaviour. Later, in 8.5, I accepted the involvement of this centre in real motor sequences. Finally, as the icing on the cake, we are now suggesting that evocations (which we had made dependent on that muscularly displayed simulation, when we examined symbolic play) might also be achieved with merely latent simulation. All this seems unacceptably complex. Simulation would be first latent, then displayed, and then latent once more. How can we accept so many twists and turns?

I think it is better not to mix our specific question with the general methodological or epistemological question. Later on we shall see what happens to our specific question on simulation and the analogy between symbolic play and linguistic symbol. At this point, however, our question must be: how much of a negative impact should we allow this excessive complexity to have? I would say that we should at least avoid automatically rejecting it. Certainly, limpid linearity is very in tune (it is in tune, by definition) with what our reasoning might lead us to expect. However, we must be prudent as regards these comfortable impressions of harmony. It is not only that sometimes we have to accept other more tortuous and clumsy possibilities; also, and above all, our vision of things might be excessively partial and inadequate. 
What would appear linear and what would be clumsy if we could take the whole of reality into account, with all its levels? Has anyone an answer to this?9 But let us now leave these dizzy heights. Is there a way to rescue our project of establishing an analogy between symbolic play and the linguistic symbol? Let us look again at the small differences we described above between symbolic play and the linguistic symbol. Perhaps one of them can help us to overcome the obstacle in the way of our project.

9.4.4 Why might the latent route to evocation occur more readily in language?

The arbitrariness of the linguistic symbol could be one of the reasons why motor display is less important in language than it is in symbolic play. We have already seen that in play the child’s movements correspond strictly to the core of the visual content evoked. This correspondence, which has its roots in the matching achieved by chimpanzees between one’s own body and the body of others, undoubtedly facilitates the process that leads from the child’s displayed movements to the visual evocation of the model’s movements. However, when this inherent correspondence gives way to the arbitrary connection in the linguistic symbol, this facilitation would no longer occur. Consequently, the muscularly-displayed route toward evocation would lose its privileged condition, at least in part.

But there is another factor which also contributes to the weakening of that privileged condition and which is probably much more decisive than the arbitrariness. I am referring to the social or phonemic model. Let us remind ourselves of the essence of the phonemic model, i.e. what distinguishes the phonemic level from the phonetic level. Phonemes can never be pronounced nor can they be heard. As soon as a phoneme is pronounced or heard, it ceases to be a phoneme and is incorporated into the phonetic level. What are the consequences of this? The activation of a linguistic symbol becomes independent from the motor level. A merely latent activation can be as efficient as displayed activation. The phonemic level is inherently alien to muscular reproduction.

Compare this to the kinaesthetic simulation corresponding to the prior phase of symbolic play. Certainly, the child is performing only latent imitation at that moment. However, with that latent imitation the child is adding the relevant kinaesthesia to his visual perception. Likewise, during the game phase itself, the child adds the appropriate visual evocation aspect to his real movements: thus, from that motor core, he will be able to evoke the no-longer-perceptible model scene. In other words, what happens in the two stages of symbolic play is this: when a subject experiences one aspect – kinaesthetic or visual – of a real movement, the other aspect, the aspect which the subject’s current experience lacks, is then assigned to the first. By contrast, things differ slightly for linguistic symbols. Here, there is a revolutionarily new intermediate stop between the observation of a movement and the reproduction of that movement, and this new stop is the only really important one. The true model does not coincide with any real activity. The social and phonemic model is disassociated from all the displayed movements, both those of the learner and those of the ‘teachers’.10

Now we are getting to the point. The great obstacle, the enormous difference between a playful symbol and a linguistic symbol, is beginning to dissolve. We need only to accept that the model followed in language is truly phonemic and social. If we accept this, there would then be no reason why the merely latent imitation of a linguistic symbol should be insurmountably different from its displayed reproduction. Of course, the latent activation of this model may occur either spontaneously or after perceptions are analysed. These two modalities are very different. Speech reception and inhibited or inner speech need to be kept well apart. The inability to distinguish one from the other is a serious pathology.11 However, we are interested here not in comparing both modalities of latency, but in comparing latent activation with displayed activation. We have seen that the phonemic model – the truly important point, and the one to which the semantic content is connected – is not identified with muscularly displayed reproduction. This solves the objection raised against the analogy between the two modes of evocation.

9. History has often revealed how myopic the procedures initially considered as more direct and efficient really were. In evolution, for example, I admire the tortuous and opportunist paths or bricolages which end up being so extremely efficient. In other words, real evolution is starting to look much more beautiful and impressive than anything we had ever suspected. Let us now look back at the two dramatically different aprioristic models made on this issue: on the one hand, the ‘spirit’ advocated by dualist philosophy; on the other, the ‘planning without blunders’ invoked nowadays by the philosophy of biology to proclaim the inferiority of real evolution. What do we see? That these assumed perfect models are applicable only to unjustifiably restricted fields (this makes me think of the old, paradigmatic case: the pre-Keplerian ‘perfect circles’), and also that, on the other hand, the presumed blunders in fact form a procedure which is impressively capable of general efficiency (or, more precisely, it is satisficing, in the sense that Herbert Simon [1957] gave to this term).

10. Nevertheless the social and phonemic model is always and exclusively connected – I am stressing the other side of the coin now – to the production format.

11. We return here to the issue already dealt with in 9.3. The latent activation in the recipient and the latent activation peculiar to inner speech must be differentiated by the subject herself. This is not only a very important task, but one which is also probably difficult and vulnerable. Clearly, in my opinion, the current trend that considers schizophrenia as a kind of quintessence of the human cognitive peculiarity is exaggerated. However, we can almost certainly assume that the normal processes which have been altered in this pathology are exclusively human.

9.5 The phonemic-social model, the expansion of working memory and inner speech

The phonemic model would have an extremely important consequence (although this probably would not arise immediately, but only some time after the phonemic model had taken root). I am referring to the ease and speed it would give to speech reception, and, even more so, to inner speech. Every previous case of latent imitation (i.e. the sequential latent imitation already seen in the learning of new motor patterns and in the observation phase, or preliminary phase, of symbolic play) would require the successive motor steps to be aligned. Consequently, the limits of the latently-imitated sequence are determined by the number of motor steps involved. Any reinforcement of the working memory would require, therefore, a dramatic change. (Coolidge & Wynn [2005] focus on the reinforcement of the working memory.) That change came about as a result of the social or phonemic model peculiar to linguistic symbols. This model is per se impossible to perform. A pattern is performed because one of the many possibilities that make up the model has been selected, or in other words, because what is being performed is not really the model. As a result, in motor speech reception, the abstract unit that is the social model will be preferable to the simulation of each of its motor steps. If it is indeed possible to work with the model itself in the reception, why would this possibility be rejected? Motor speech reception can, therefore, dispense with attending to the motor steps involved in each learned pattern. This is a (completely new) kind of simulation that would have been beyond the reach of a language learner. But an expert recipient no longer needs to activate each of the motor steps in his simulatory centre, only the coded unit. The codified, abstract unit acts as one single item for the working memory.12

It was not until I read Lieberman (1991) that I fully realised the importance of this question. Lieberman insists that the working memory involved in speech reception cannot be compared with the working memory involved in any other kind of perception. Certainly the working memory involved in speech reception must include not only words but also the emotional tone or the sound volume: cf., e.g., Port (2007).13 However, it is only words that (through the unperformable social-phonemic model) can stop being sequences and become units. So it is from them that some relief can be got for the working memory involved in speech reception. In my view, the fact that these unperformable social models are supplied by cultural learning would be crucial.14 It is as a result of this alone that motor reception, in the case of speech, can be dramatically more efficient than it is in other fields. In other words, only as a result of that unperformability can the ‘double linguistic articulation’, with its different-order units, really affect the brain’s reality and provide a dramatic increase in working memory.15

We are saying that, when the phonemic-social model becomes fully rooted in an individual, the reception of the motor patterns constituting language can then begin to treat each pattern constituting a linguistic meaning as a unit, and no longer as a sequence of motor steps. (See supra, 6.3.3.) Can we place this moment within the trajectory of a child’s development? I would suggest that inner speech can indicate that the change has already taken place. Or rather, the way towards inner speech would begin after the phonemic-social model had fully taken root.16 Firstly, because, as Oppenheim & Dell (2008, p. 534) found, “inner speech slips exhibit lexical bias” (where the unity of the word is the key), “but not the phonemic similarity effect” (where the sequence of motor steps is the key). Secondly, because, if (even when we are on our own and do not, therefore, need to defend the privacy of our thoughts) we adults prefer thinking in inner speech and not out loud, this might be due to the advantages provided by the phonemic-social model, namely, greater speed, reduced effort, and the consequent expansion of working memory. Thirdly, because, if instead of being based on the phonemic-social model, inner speech respected the sequence of motor steps within a word, then (as I have said in 9.3 above) it might more easily provoke the unwanted evocation of auditory aspects. In addition to inner speech, there would be another indicator of the change, namely the predominance of semantic associations over articulatory-phonetic associations (a predominance which in small children and also in some pathologies is weak or is inverted: Luria [orig. 1979], and Huang & Snedeker [in press]).

12. Here, our hypothesis – the ‘second mental centre’ – is focussing on the working memory. Consequently, here it seems similar to the ‘second mind’ in the ‘two minds hypothesis’ (Evans [2003] and [2009]). However, this is a very superficial similarity: the interpersonal origin of the ‘second mental centre’ is not shared by Evans’ construct.

13. See also Kappes et al. (2009, p. 148): “Phonologically irrelevant acoustic or phonetic information contained in a spoken model may survive the translation from perception into action, even though the instruction to repeat a word or nonword is only meant to induce a reproduction of its linguistic information”.

14. Phonemic model and social group are linked. “If a single talker’s voice was used for training, the learning only generalized to new tokens, not to other talkers.” (Samuel, in press, p. 64).
What exactly would be failing in such cases as small children or pathologies, where reception cannot entirely do without each word’s motor sequence? In cases like this, the word would no longer be used as an instantaneously-produced unit. This would therefore cancel out not only the expansion of working memory, but also (we are addressing now the relationship between this inability and inner speech) the possibility of a not very costly instrument to regulate one’s own attention.17 (See Chapter 4, note 9, p. 75 and also 8.9.2).

The unperformable social models would have led to the increase in the working memory. But was it enough to set up this model (and not each single motor step) as the new unit for latent imitation? Or would this substitution have required strengthening in the brain? In other words, was a mutation required in order to create space for this increase, to enable us to cope with it? Perhaps the FOXP2 gene? I am not overly tempted by this because I find the very idea of brain reinforcement unconvincing. We should remember that chunking may be indefinitely progressive. One label would be sufficient to transform a former list of three or four elements into one single attentional unit. In an outline I can, for example, include the three issues addressed in a lecture and consider this triad as a single attentional unit. However, each point in the outline will include several sub-points. Would we have to postulate a reconfiguration of the brain for each similar extension of these skills? Graphics, even words such as ‘exposition’, ‘climax’ and ‘resolution’: would each of these have required a modification of our brain?18 But let us leave this question and pick up our main discussion once more.

15. Immaturity of working memory is probably responsible for immature or impaired language. (See Tuller et al. [in press], who focus on impaired production of third-person accusative clitics in French.)

16. Guillaume (1949, p. 52): “Three elements in the language – phonemes, the signified and inner speech – are psychisés” (i.e., turned into psychic elements, if we want to translate this old-fashioned French term).

17. “Working memory, in particular, lies closest to the intersection between external and internal attention”: Chun et al. (in press).

9.6 Language and the adaptive advantages of symbolic play

As a result, therefore, of the social or phonemic model, or, more precisely, as a result of the ‘impossible-to-perform’ model, our great obstacle would have been cancelled out. It is immaterial that, contrary to what happens in symbolic play, there is no displayed movement in linguistic reception. Despite this difference, we can establish the analogy between symbolic play and the linguistic symbol. But let us set out clearly what this analogy would be like. By no means am I placing the pretence (or ludic symbol) and the linguistic symbol on the same footing. The ludic approach to evocation is earlier in children (earlier than ‘displaced speech’). But this is exactly what we might have expected. If this were not the case, the adaptive advantages of symbolic play would simply not exist. In my view, symbolic play is a preparatory exercise in which some of the abilities associated with language production and reception could be developed in order to eventually reach the high level later required by language.19 (Scarcity of symbolic play and linguistic impairment are both typical features found in the majority of autistic children. “There is good evidence for an impairment in the spontaneous symbolic play of autistic children”: Jarrold et al. [2005, p. 281]. The exact relationship between these two features is not yet clear. However, the correlation is most suggestive.)

In light of the above, we can go over the differences between symbolic play and language once more. In this way, the obvious difference between the muscles involved in each will be highlighted. Hands and body, on the one hand; vocal tract, on the other. We might bring in at this point the fact that the motor abilities of limbs develop earlier than those of the vocal tract. It is possible to state, without any recapitulationist bias, that this is the case in the evolution of primates as well as in ontogeny. Remember that children who, for example, are already able to imitate the hand clapping or ‘good-bye’ gestures and even children playing at feeding an adult, are still at the initial, not-controlled, babbling stage. (It is worth noting that manual gestures dominate the use of words in communication into the second year: see Capirci & Volterra [2008].) Kinaesthetic interpretation of any type and, more specifically, motor simulation, are therefore very likely to follow this same order in their development. Thus, the preparatory function that we are ascribing to symbolic play would be fulfilled in this respect.

The analogy between the movement involved in symbolic play and the content evoked by it is, needless to say, another issue worth raising. We have already seen that there is a clear contrast between symbolic play and the arbitrariness of language in this respect. We can now add that this analogy probably facilitates the task of evoking.

But we need to address the core question here. Why can we speak of an exercising role for symbolic play? Why do I think it is plausible that symbolic play is subordinate to language? Certainly, we could see in symbolic play some sort of relic of previous stages – e.g. the presumed ‘pantomime communication stage’ (Arbib [2005, p. 110]). Provided these stages are interpreted as historical, intraspecific developments, this possibility could not, in principle, be dispensed with.20 It would certainly be possible, I repeat, to explain current symbolic play as the now almost useless relic of a behaviour that had adaptive advantages in previous periods of our species. This trend is, of course, widespread in Evolutionary Psychology. But let us suppose that certain adaptive advantages for today’s children are discovered in symbolic play. There is no doubt that this would be good news. Should we start looking, then, for candidates for such possible advantages? The choice is obvious. Symbolic play would be adaptive because it exercises and strengthens some of the abilities required by language. In addition, of course, we must emphasise that pantomime and mimesis retain their communicative utility even after the emergence of language: Certainly, without these abilities we could not communicate with foreigners or deaf people. But, and I stress this idea, the true usefulness of symbolic play derives from that exercise and strengthening.

18. Of course, in perception it is always possible to change the zoom: a person perceiving a wide scene can become occupied with a small area immediately afterwards. Is there any difference between this possibility and the phenomenon we have been looking at? I would say that an important difference lies in the fact that the words which respectively label the different sections of the lecture have this possibility (the possibility of a more detailed action) built into them in a more objective and normative way. The same occurs with anaphora that summarise an earlier part of the text. In perception, on the other hand, a given continuation is as valid as any other. However, this is not the moment to address this question, which clearly involves both semantic and episodic links of words (that is, not only the ‘typical’ links, but also those exclusive to a single text).

19. However, this argument implies the usefulness of language and this usefulness is too controversial for us to let it go without any further comment. Language would be advantageous for the individual (adaptively advantageous, not psychologically rewarding, I mean). (“For the individual?”, someone might object, invoking ‘group selection’. However, I prefer the opinion that “group selection can never apply in situations when kin selection, or, in Hamilton’s [1964] words, ‘indirect fitness benefits for the focal subject’, cannot explain cooperation”: West et al. [in press]). Why would sharing information be advantageous for the individual? (Lies could certainly be useful for the individual. However, lies, obviously, must be a small subset of the larger set of messages, otherwise the very possibility of lying would self-destruct. Consequently, the objector’s question still stands.) A first reply might be this: in a group able to handle language, the inability to use language would be absolutely lethal for an individual’s reproductive success. It is not that individuals with language are at a greater advantage; instead, something much more dramatic occurs, namely, that individuals unable to use language are completely marginalised. Is the objection raised that I have not explained the origin of language within the group? Does someone reply that with this apparent solution I have pushed a mosquito aside just to swallow an elephant? I would state that in my hypothesis, as we shall see shortly, full language would have emerged as a historical and not biological derivation. But beyond this, we come face to face with the hard nucleus of the question. The basic human ability, which, without any shadow of doubt, I consider a biological change that would have been selected by evolution, also serves to transmit useful information to other individuals – a behaviour which (with the exception of ‘kin altruism’) would allegedly be scandalous for ‘individual selection’. What is happening here? A first response is sketched out above, p. 83, in Chapter 5, note 4: collaborative tasks would bring immediate results for each of the collaborators. Likewise, ‘reciprocal altruism’ and ‘reputation building’ will bring important, although non-immediate, advantages. In addition to these two resources, a slightly different type of advantage has been emphasised by Dessalles (2007). But there might be something more. Let us think about animal signals which express pain. It is highly unlikely that the howl from an animal hit by a predator or by falling rocks provides any advantage to the animal. Certainly it is possible that the howl (even if it originally had useful functions for the individual producing it – “either of startling a predator, inducing mobbing by cospecifics, or attracting a competing, larger, predator to attack the initial threat source”, Owings & Morton [1998, p. 115], or also human help obtained through dogs’ vocalisation, according to Miklósi & Topál [2005]) is later maintained – in the situation of fallen rocks, for example – as a mere relic, with no function left. Certainly, that is a possibility. However, there may be another explanation: When the animal’s howl is heard, its peers (not only its close family), who possess the corresponding innate response to receiving the howl, will move away from that mortal danger. Do we need to call on something different from individual selection? Perhaps it would help to focus on the first individual who produced howling as a response to pain or fear, and possessed the (merely potential at that moment) corresponding, fearful, reception and who transmitted this feature to its offspring. This individual’s reproductive success would receive a boost each time the howl of one of its offspring saves another’s life. The advantage for the total of the offspring means an individual advantage for their ancestor, no matter how distantly related that ancestor is. (Cf. West et al. [in press]: “It is genetic similarity, not close kinship per se, which drives indirect fitness benefits (...). It is wrong to think that relatedness is only high between members of the nuclear family”.)

20. It would, in contrast, be reasonable to dismiss any attempt to explain a feature by virtue of the adaptive advantages obtained in this way for a different species. In 1.1, we referred to an explanation of this kind where the adaptive advantages were invoked for a subsequent species. In our present case, the dismissible explanation would be focusing on adaptive advantages that would only have been advantageous for a preceding species. Although this second scenario would not be quite so easy to reject, the explanatory strategy involved is also rather unhealthy. Symbolic play is an absolute universal ruled by a biological clock, so to speak. Can this be explained only by the advantages it would have obtained for a species different to our own?


9.7 Taking a step towards the next section: ‘Linguistic symbol’ versus linguistic meaning

What we have so far been calling linguistic symbol is one of the elements of human language. There is, however, considerable distance between this element and genuine linguistic meaning. We have already referred to this issue but it will help to open it out a little further here. What happens with genuine linguistic meanings? On the one hand, they can often dispense with all evocative content, as prepositions or conjunctions do. On the other hand, linguistic meanings must inevitably belong to a specific syntactic category or part of the sentence. However, what happens in the case of what I have called the linguistic symbol? In this case the evocative content is indispensable. It constitutes the very essence of any symbol. By contrast, the appearance of the syntactic ingredient is not so clear; at least it does not appear as indisputably as in language. An evoked content certainly includes a subject, an action, a place, qualities...; but they all appear as a whole at the same time, without the need to add them one by one. Can we call this a syntactic combination? Is the richness of detail enough to transform this unitary magma into syntax? (See above, at the end of 9.3.) A clear answer, I repeat, is not easy to find. To better understand the difference between linguistic symbol and word, let us focus on a word – a linguistic meaning – that might seem, at first sight, more closely related to a symbol, i.e. the meaning of a non-abstract noun phrase. Are they the same? Certainly, in this rough formulation, the question can be interpreted as relating to several aspects. But I would ask for the reader’s indulgence and ask that it be understood as limited to the requirements of our argument. Are the content of that linguistic symbol and that word the same thing? The evocation of an object is complete in itself. By contrast, a noun phrase requires links with a verb or adjective. The semantics of words cannot be dissociated from syntax. 
If we want to define the categories of verb, noun, adjective and so on, we will find no other way to do so than to consider them as parts of a sentence, that is, to consider them just as elements designed for syntax. Why am I repeating these familiar, timeworn ideas? My aim is to prepare the transition to the next Section. In plain terms: my point here is that if we wish to investigate the emergence of linguistic meaning, we will inevitably have to address syntax. The reader will probably have guessed my aim for the following chapters. I will try to investigate if syntax will also ultimately be derivable from ‘the basic human ability’. What is the relationship of syntax with the perception of the self of others (i.e., with the second mental centre)? This is our next step.

section four

The origin of predication and syntax

chapter 10

From the general exposition to the crucial requisite achieved by the protodeclarative

10.1 The origin of syntax: Overview of the hypothesis

In this section, we will seek to link the origin of syntax to the perception of beliefs of others. In so doing, we will also move forward in our review of the studies of the ‘theory (or rather grasping) of the (one’s own and another’s) mind’. The ‘basic human capacity’ that we have been suggesting certainly belongs in an area addressed by those studies. For some time, topics such as symbolic play or pointing with a finger have been included within the ‘Theory of Mind’. Nevertheless, it cannot be denied that the perception of false beliefs of others was, at first, the nucleus of those studies. It is now time, therefore, for the present study to tackle this. As I have said, I shall focus particularly on the relationships (bidirectional relationships, as we shall see) between beliefs of others and language. However, all this will have to wait.

What is it I understand by the ‘origin of syntax’? In order to immediately give the reader the key to what will follow, I shall begin by setting out a non-argued overview of my suggestion. All perception provides a (relatively1) rich repertoire of features and details, but none of them can be attended to separately. In other words, perception lacks genuine compositionality. This compositionality would have emerged taking as a starting point communication by means of (at least, two) words or pre-words. What I will be calling syntax is that compositionality, either pregrammatical or already grammaticalized. Syntax must have arisen for some communicative function for which it was an essential requirement. There is no essential need for syntax in the conative-appellative function, i.e., the communicative function that takes the form of calls, requests or commands. Certainly, there are conatives that use very complex syntax. But my only point is that syntax would not be indispensable for simple conatives. Likewise, syntax is totally superfluous for the emotional release usually labelled as ‘expressive function’. Certainly, this latter is less relevant, given that the emotional release would move beyond communication intentionally directed to a recipient, and would be found, therefore, more distant from language. Nevertheless, it is useful for our summary. Syntax would not have originated either for the conative-appellative or for the expressive function. As a result, this would lead us to think that syntax appeared for the communicative function of predication. As we will see later, questions also fundamentally need syntax, but they would have been secondary and derived from predication.

Thus, we are returned to the question of how the communicative function of predication originated. This communicative function would be linked, rather than to the ‘noun phrase or subject’ and ‘verbal phrase or predicate’ hinge, to that other hinge that has been labelled as theme (or topic), and rheme (or comment). The ‘theme/rheme’ structure would have been the original form. Only after a long history of progressive ‘grammaticalisation’ would that original form have given rise to the syntactic forms of different languages.

At this point, then, for me, the question about the origin of syntax has become the question about the theme/rheme structure. How well defined are these concepts? Do we understand fully what the theme is? This question is where my hypothesis about the origin of syntax will begin. What we shall propose here is that the theme, instead of being, as it is normally considered to be, an object known to the hearer (‘known to the hearer in the opinion of the speaker’), would be, in contrast, an insufficient, incomplete or erroneous piece of knowledge that the hearer has about the object in question. We thus connect with the perception of beliefs of others. In the mouth of the speaker, the theme is really the belief of the hearer, that is, a second-order mental state. The origin of syntax would immediately follow the origin of the perception of false beliefs of others. How would those beliefs have been originally perceived? A core point of the hypothesis consists in specifying the route that such perception would have taken.

Further on, we will pose other questions – we shall pose them, at least, if not always answer them. Are those beliefs which are completed or corrected in predication always of others? How do proper syntactic rules come to be constituted? However, these matters will remain outside the bird’s-eye view we are laying out here. The preceding sequence will be enough for now. Arguing for it will keep us busy for several chapters.

1. How rich is perception? This is a controversial issue. Although many theories of consciousness posit a dissociation between ‘phenomenal’ consciousness (which is rich) and ‘access’ consciousness (which is limited), Kouider et al. (in press) persuasively argue that an impression of phenomenally rich experiences can arise from illusory contents, i.e., they argue in favour of ‘the partial awareness hypothesis’. Anyway, my core claim (non-compositional perception versus compositional syntax) is not influenced by this controversy.

10.2 Some initial clarifications

10.2.1 Biology and history

Before we move on to this, we will have to clarify one or two things. What origin have I been talking about? Am I locating myself only on the level of ontogenesis? No, I would reply, I am situating myself not only on the ontogenetic level, but on the historical level too. History, I should make clear, not phylogenesis. The reader will recall my indecision surrounding the ‘big extension (or new function) of the simulatory centre’ (in 8.10). Did this extension occur in the same biological species where the ‘basic human capacity’ first appeared? Or in another different species? That was left hanging. In contrast, my bet is clear on this point about the genesis of syntax: History and not evolution; interpersonal genesis and not biological change. Certainly this may come as a shock, since this book has devoted several chapters to macaques and chimpanzees. Thus, this is a good moment to focus on the approach I have taken to the relationship between biology and history.

Human exclusivity clearly has a biological foundation, but there would be no need to include within the biological foundation itself any feature which could be historically derivable from it. The biological innovation either would be identified with the ‘basic capacity’ itself, or, according to other less extreme alternatives, with what we have called ‘the first big extension of the simulatory centre’. (In addition, of course, as I said in 8.10, other biological changes, which probably arose in earlier species than Homo sapiens sapiens, were also required, such as the development of the vocal apparatus and its neural control – see Ploog [2002] and Jarvis [2006] –, in short, “a patchwork of different precursor systems that were present in ancestral species” – Gervain & Mehler [2010], or also Yip [2006]). Given this biological foundation, we can now turn from biology to the question of whether and how syntax could emerge through a historical process that did not require further biological evolution.

However, this raises a question. What do I need to say as regards the possibility that syntax might be an innate mechanism? I do not believe we have the resources today to reject syntactic innatism*. However, neither do the resources exist to reject its opposite.2 The only thing we can do in a case such as this is what is called ‘wager and see’ (see Chapter 6, note 6, p. 96). What is incumbent on us is to draw inferences, derivations and conclusions from the content of our wagers.

Let us place the topic against a slightly more historical background. Classic antigenerativist critiques – let us call them first-generation critiques – located their efforts in the wrong place: Plenty of arguments against, but no alternative explanations. Generativism remained intact, absolutely untainted, faced with such commotion. Nowadays, the situation has improved and new critiques at least contain promises of alternatives. However, I sense that much still needs to be done before models of cerebral processing that can extract grammar with all its details from the input can be produced, and which – this is the question – can do so without assuming innate syntactic ability of any kind. This cerebral processing would have to be much more sophisticated than all those that have been modelled up to now. In short, if I believe in this processing, in this alternative explanation, it is only because of my wager on it, not because I believe it is available. We shall suggest here – as I announced earlier – an interpersonal and historical genesis for compositionality. This suggestion might strengthen our confidence and calm our fear of losing the wager. However, what this suggestion in no way achieves – nor, I would say, do the other suggestions available today – is to offer the longed-for alternative explanation. To be patient and take a long-term view, therefore, is our lot.

2. Will the next step be to debate who has the ‘burden of proof’? I see no usefulness in this. Certainly, it is not just convenient, but necessary, for scientific research to be interpersonal, social, and almost judicially controversial in nature (cf. the ‘scientific controversy’ as defined by Dascal [1995]). The classic saying ‘From debate comes light’ has more and more been revealed to be right. In reality, individual scientific research itself must, by demanding evidence and by raising self-objections, copy the processes of interpersonal debate. (Note that this would be an example of Vygotsky’s sequence of development – the sequence from interpersonal to intrapersonal processes. In my opinion, it would be a better example than some of the dubious processes Vygotsky mentions in this regard. I am thinking in particular of how he explains – see above, 4.1 – the origin of the pointing gesture.) All this, I repeat, is entirely true. However, there comes a moment when to keep focusing on the relationship with the opposite theoretical side, and not seek progress directly – progress which is the common goal, after all – becomes an absurd and useless attitude. I shall dispense, therefore, with any discussion about the ‘burden of proof’.

10.2.2 Ontogenesis and historical origin

A second question is that of origins. It has always been clear that a child merely learns language, whereas at the historical origin language had to be created. Attempting to relate these origins to each other is, therefore, an extremely dangerous task. Of course, there is nothing to prevent us taking inspiration from children in order to shape a hypothesis about the historical origins of language. However, the risks are enormous since there is no form of validation involved. What can we take from children? What can we not take? There are no criteria available in advance to guide us. It may help us somewhat to look at children’s failures, i.e., their progressively decreasing inabilities, rather than at what they have attained or acquired. We cannot say that children’s failures have been taken or absorbed from their social groups, at the very least. But we do find plenty of failures and mistakes in linguistic skills in children. Which of them might be of interest to us? A sensible option would seem to be those mistakes relating to aspects common to all languages and which also appear universally in all children. However, such an option – this much is clear – falls very short of pointing us toward the specific facts we might extrapolate from the child to the murky historic origins. So, then, what do we do? Once again, we are left with what we have called placing a wager. Since we do not have a clear set of data from which to work, let us do things inversely and take those observations that are relevant, according to our hypothesis.

Thirdly, one may ask if the mechanisms of grammaticization and coinage we see at work in historical linguistics would suffice for language origins. I am inclined to say no. My pessimism has to do, obviously, with my assumption (at the heart of my hypothesis, as I have said) that language compositionality differs dramatically from richly detailed prelinguistic perception. In other words: in my view the genesis of compositionality involves an extremely difficult and demanding journey. For the same reason I am equally pessimistic about the possibility that the research on the Nicaraguan Sign Language3 or the classic studies by Bickerton on pidgin languages (see Bickerton, 2008, for his most recent survey of the issue) may clarify that journey (although they certainly shed some light on the mechanisms of grammaticization we see at work in historical linguistics). I think that the pidgin speakers (and even those Nicaraguan deaf people4) would have already perceived the core of linguistic compositionality. But let us not despair.

10.3 Communicative functions and the need for syntax: Presyntactic conative messages?

Let us move on to other more specific issues. Why have I said that syntax would only be indispensable for predicative communication? Where do I find authority to separate conative communication and syntax? Is it so clear that orders or petitions can be alien to syntax?

10.3.1 A first (invalid) response

If we switch from a discussion of historical origins to child development, we observe that during their holophrastic stage (a holophrase is a single-word message) children mainly voice messages which have a conative function, that is, messages whose immediate objective is to attempt to adapt the world to the sender’s interests. This convergence between conative function and the absence of syntax is, of course, suggestive. Nevertheless, by no means can we take it as our base. We must take note of the fact that conatives are not the only messages children voice during this stage. The rest of their holophrases have a function known as protodeclarative. The child points at an object with its finger, alternates its gaze between the object and the addressee and speaks the object’s name. As many authors have said, in this there would be an attempt at ‘meaning negotiation’. The child would look for confirmation for this correspondence between signifier and signified. In addition, and more importantly, the child would be looking to provoke linguistic comments from the adult about the object in question (cf. Southgate et al. [2007]). It is evident that the protodeclarative creates the ideal context for learning new words. Although the words that appear in the adult’s comments may be unknown to the child, it will rely on the trick of knowing which object such comments refer to. We can conclude that protodeclarative holophrases form part of a ‘toolbox’ for linguistic learning.

However, this does not answer the question of interest to us about protodeclarative holophrases. Do they constitute a piece of communication with a predicative function? Or, to connect more directly with our topic, can we say that predicative communication would have a pre-syntactic root? If it did, it would then be incorrect to suggest that predication fundamentally needs syntax. Looking for a common origin for predication and syntax would thus be a strategy entirely lacking in basis. In short, the question of whether protodeclarative holophrases are related to predications is absolutely crucial to our purposes.

I have only one fact to hold on to. From a strictly communicative point of view, protodeclarative holophrases are completely redundant and useless. By using the finger and the gaze to point, we communicate to the recipient the same information as with the linguistic protodeclarative (at least within a situation and within customs established between producer and recipient – we clarify in honour of the ‘Gavagai’*). As a result, then, it is not plausible that protodeclarative holophrases should have a direct connection to the advantages provided by language. The connection would be only the indirect connection to which we have already alluded. For language to carry out any function, linguistic learning has to occur. And the only advantage that protodeclarative holophrases provide is to ease that learning.

This conclusion to which we are inclined takes on greater strength if we take the following into account. When children reach the stage where they are able to combine words, they will produce some messages which, although superficially resembling protodeclarative holophrases, are really very different from everything we find in the holophrastic period. In the new kind, children (and also adults, of course) do not say the word that designates the object, but a word that signals a quality or fact about it (cf. Goldin-Meadow, 1999). Here, then, unlike what occurred in the protodeclarative holophrase, the verbal message carries information not contained in the pointing gesture.

But let us return to our question. We were asking if protodeclarative holophrases are an obstacle to our idea that syntax originated historically with predicative communication. We hold that predications provide information to the hearer, and that protodeclarative holophrases, by contrast, do not do this at all. Thus, the pre-syntactic nature of the latter does not force us to reject our idea. The predicative function would not have its initial root outside of syntax.

3. A sign language that has evolved through successive “cohorts” thanks to the communication between the deaf people themselves: Senghas & Coppola (2001). This study is extremely interesting.

4. In these cases, the idea of compositionality would have already been transmitted to the individual. Cf. Flaherty & Goldin-Meadow (2010, p. 404): “Hearing parents not committed to training their deaf children orally are more open to producing gestures in absence of speech. And gestures produced without speech (versus gestures produced along with speech) have been found to display linguistic properties of segmentation and hierarchical combination. We observed that the majority of the utterances that Nicaraguan hearing parents produce contain gesture without speech.”

10.3.2 Toward a second response With this, the way toward the conclusion we wanted seems clear. There are conatives in the child’s holophrastic phase. There are no predications in this phase. Therefore, it

Chapter 10.â•‡ From the general exposition to the crucial requisite 

is plausible to conclude that syntax came into existence for the communicative function of predication. But all this argumentation has, in reality, feet of clay. This response – a first response, we called it at the time – is not enough. The difficulty lies in our having all too easily accepted a clear opposition between holophrase and syntax. Holophrases contain, it is true, a single word. However, is this fact enough to rule out the involvement of syntax? Children’s holophrases consist either of names (‘Mummy!’) or adverbs (‘More!’, ‘There!’), etc.; in short, they consist of a meaning which is a ‘part of the sentence’. These meanings, as I have already said at the end of the previous chapter, inevitably involve syntax. The semantics of our present language necessarily comes after syntax and depends on it. Certainly, we might think that holophrastic children, precisely because they are holophrastic, that is, because they lack syntax, would not adequately understand the meanings they are using. Something of this is indeed present, in my view. Undeniably, we cannot project our mature and full meanings onto the child’s holophrase. However much we may hear one of our words from a child’s mouth, the child’s meaning is not the same as ours.5 All this is, I repeat, true.6 However, this lack of understanding of meaning cannot be total in the production of the holophrase. At the end of the day, the child has learned from its social group the meanings it uses in its holophrases. Admitting a certain degree of syntactic involvement in the meaning produced by the holophrastic child would seem inevitable. But then the first response we gave will fail to that same degree. As a result, contrary to what might have seemed the case at first, the point that syntax would only appear in children when they access predicative communication has not been established. What do we do at this point? 
5. One of the authors who has most insistently highlighted the difference between the meaning of the same word in children and adults was Luria (1979). Clearly, he was not referring to the radical difference between the preword constituted by the holophrastic meaning and the true linguistic word. What he points to is that the meaning that a child may have for ‘coal’, for example (this example of Luria’s no longer works for our children today), is very far from the meaning that adults with their knowledge of economics or industry will see in the same word. Nevertheless, in my view, both differences derive from the same fact. The acquisition of a word is all the more complete the more past textual links are held in the meaning of that word. Later, we shall examine this point more carefully.

6. Ninio (1993, p. 317): “Even if adults use single-word utterances as elliptical expressions, referring back to a fuller sentence where they did combine with other elements, from the point of view of the young child who is unable to process complex expressions and may well ignore them, single-word utterances have all the characteristics of independent signs that function on their own without possessing relational meaning.”

We have to begin by allowing, I repeat, that the holophrastic child’s conative messages are contaminated by syntax. In my opinion, syntax, since it came into existence for predications, has been unavoidably involved in every message that uses words. All this must be admitted. However, this involvement perhaps does not derive from an
intrinsic need, but would be only a ‘retroactive’ (so to speak) effect of the syntax (I am thinking of a type-relative, not time-relative, retroactivity). Think of how the human shout of pain, which really needs no resources from the linguistic code, ends up, nevertheless, adapting to the producer’s code – at least, to the phonemic code. Something similar would occur in the relationship between conative messages and syntax. Therefore, if we assume there is a holophrastic phase in the historic level, we might perhaps postulate a conative message that is absolutely disconnected from syntax. No one has observed that Holophrastic Era, it is clear. Thus, we are no longer following the risky strategy we were speaking about before. It is no longer a question of choosing some aspect of the child’s language acquisition for it to inspire us about historic origins. The strategy we are now suggesting is an even more risky one. The premise we are taking is not a fact observed in children’s learning, but a mere piece of speculation about misty historic origins. What can I argue in my defence? I would say that it is not only that the ‘wager and see’ method is an available route in questions such as this. It is a question, rather, of any method being available to use, provided we know what we are doing, and are aware of the limitations of the method that we are using. We move on, therefore, to consider the assumption we have mentioned. In a historic phase prior to full language (in other words, in the Holophrastic Era), conative messages would lie absolutely and completely outside syntax. The linguistic meanings that would be used at that time would be completely different to our words. We have already seen something of this in Chapter 9, when we contrasted linguistic symbols and genuine linguistic meaning. At that point, the linguistic symbol was presented as an ingredient of our present language. 
However, we are now facing the possibility that at some point in history the pre-syntactic type of meaning constituted language in its entirety. We could never faithfully translate these pre-words. We could only translate them into the words of our own language, that is, into words that would be either a noun or verb or any other part of the sentence. But those meanings were neither one nor the other. It should be noted that the pre-syntactic condition I am talking about has nothing to do with the double meaning as noun and as verb of the signifier ‘love’. There the word really used is always either one or the other. Its presumed lack of definition resides only in the metalinguistic level that is able to focus only on the signifier – on the signifier ‘love’ that is common to both meanings. By contrast, the lack of definition in primitive signs would have been a radical one. Clearly, for us it is almost impossible to conceive such meanings.7 Nevertheless, if we accept them, that is, if we enter into the assumption of the Holophrastic Era, we can say what they were not, we can understand which characteristics they lacked. Those meanings would have to lack any characteristic that is a consequence of syntax.

7. Perhaps when we play with prelinguistic children, there would be something a bit similar. After the child’s and the parent’s experiences of play, Spanish ‘arre’ – muleteers’ order: gee up! – becomes both the order to the horse and the horse running, and is also the horse being able to ride: Personal observation. See also McCune (2008): ‘Rock-ie-rock’. In my opinion, these prewords would be farther from syntax than pidgin languages or Nicaraguan sign systems are.

10.4 Learning by imitation and the protodeclarative: Toward genuine linguistic meaning

10.4.1 The importance of the imitative learning of signs

But, if we accept this, then there is a question we will have to resolve. These conative-function articulatory-phonetic messages would be little different from the conative communication made by the higher mammals. How is it, then, that their future will be so different from that of animal cries?8 Let us explore this question. Think of the cries that we might designate as ‘calling’, which chimpanzees issue, e.g. the young calling their mother, or the cry issued by a young individual wanting to play and who calls a playmate, or the cry from the one being attacked to call for help from a dominant individual.9 These cries would bring about almost identical results to the radically pre-syntactic meanings we have been assuming for the Holophrastic Era. The communicative results that can be achieved by a scream lacking both articulation and imitative learning must not be underestimated. The other members of the clan, when they hear one of these cries, understand who has issued it, the emotional state it reflects, and who is its addressee. Certainly, these calls, the call to a playmate, for example, that we usually translate as a vocative (‘Friend!’) can in fact be translated as an order (‘Come and play’, for example). A vocative is just as suitable (and, therefore, just as unsuitable, clearly) as an order at reflecting what these cries mean. However, this impossibility of translation is the same impossibility we assumed for the initial symbol of the Holophrastic Era. The only characteristics of this linguistic meaning that separate it from the chimpanzee cries are, as we have already said, phonetic articulation and learning by imitation. These differences seem at first glance too slight to support the very different futures that await each type of message. However, is this really so?

8. This astonishment and this question involve a good deal of choice, of wagering. I readily admit it. Instead of invoking the smaller size of the chimpanzee’s brain or some explanation of the sort, I have opted to keep asking, hoping to find a more enlightening answer.

9. Chimpanzees’ shouts are fixed and they involve no improvisation whatsoever: see a review of this issue in Tomasello (2008). However, this is not our current issue.

It is my view that those characteristics of linguistic meaning are very far from lacking in importance. At this point, we must focus on the following. Through them we can explain the origin of a meaning whose referential connection is in the end less
ambiguous than that of the cry that fluctuates between a call to a playmate and an imperative asking him to play.10 The key would lie in the protodeclarative (Bejarano, 2008). This forms – we shall propose – the bridge that connects the characteristics of the original linguistic symbol with that disambiguation. Let us look first at the link between learned patterns and the protodeclarative. Since linguistic symbols are learned by imitation, their use as protodeclaratives became necessary during the learning stage.11 We can say that it became necessary, during learning, for looks alternating between the object and the addressee, and normally also a pointing gesture, to accompany the name of an object when it was spoken. Of course, these protodeclarative uses would imply the third mode of processing the eyes of others, or, in other words, they would only be possible as a result of the basic human ability. In previous chapters, we insisted repeatedly on this point. The question of how the protodeclarative uses are possible is not what I wish to highlight at this point, however. What interests me here is that the protodeclarative became necessary. Linguistic symbols are learned by imitation. Therefore, the ‘internal loop’ communicative use, that is, those messages that only communicate in order to learn (or teach) how to communicate, had to appear.12 In summary, the sign learned by imitation gives rise to the protodeclarative use. Allow me to stay with the teacher’s protodeclarative messages for a moment. Normally, the term protodeclarative is reserved as a label for a certain kind of children’s messages. However, it is my understanding that the highly specialised communicative function constituted by protodeclaratives occurs both in the learner and in the teacher. The protodeclarative maintains its truly characteristic features even when, in adults, it incorporates syntax. Firstly, the adult protodeclarative communicates absolutely no useful information about things. 
Secondly, and in keeping with what has been said above, its usefulness has to do with the language-learning process. Thus, the protodeclarative function would occur not only in the holophrastic child but also in adults. In this respect, we must not overlook the difference between deictic and non-deictic signs. That a speaker performs a pointing gesture while using a linguistic sign such as ‘That’ or ‘There’ cannot be considered protodeclarative.13 For the same reason, no perceptual abstraction can be implemented out of successive occurrences of a deictic sign. By contrast, in protodeclaratives, the lesson being taught is that the linguistic sign corresponds to the object independent of its particular location on each occasion, and, after a longer or shorter learning time, also independent of the details exhibited by each instance of that kind of object.14 This is called the Genericity Assumption. Csibra (2010) extends it beyond language: “8-month-old infants learn about people’s personal preferences by observing their object-directed emotional expressions whether or not these displays are preceded by ostensive signals. However, they seem to learn different things in the two situations. When they observe the emotional expressions in a non-communicative context, they do not think that the actor’s personal preference extends to other people. In contrast, when the emotional expressions are presented for them ostensively, they apparently think that the same preference applies to others as well”. The distinction between these two different kinds of learning – with pedagogy (and, consequently, normativity), or without pedagogy, so to speak – is extremely interesting. In short, we can say that in protodeclaratives and also in pedagogic emotional expressions, the lesson being taught is general. Returning to the point we were exploring, we hold that it was from the sign learned by imitation that the protodeclarative would have been reached. How does this help us in relation to our purposes? We already know the other bank our bridge needs to reach. What interests us about the protodeclarative is that, with it, meanings would be disambiguated. Instead of a meaning which, prior to differentiation, was, for example, a mixture of ‘Friend!’ and ‘Play!’, or of ‘Chief!’ and ‘Defend me!’, or of ‘Mummy!’ and ‘Look after me!’, we would now have a meaning that is ‘friend’, or ‘chief’, or ‘mummy’.

10. Or, rather, “less vague”. “An ambiguous term can be assigned various contradictory structures; a vague compound has no such structures to fall back on” (Progovac [2009]).

11. The negotiation of meaning must take place not only in language learning, but also in the creation of communicative signs (Galantucci [2005]) and in the founding of a new referential use (Williams [2008]).

12. At this point, it is appropriate to remember a classic work. Bruner’s (1983, e.g.) riders and additions to the Language Acquisition Device, the comments with which he sought at one and the same time to connect with Chomsky and also to critique him, would be perfectly applicable to protodeclarative messages.

13. It is possible that the pointing gesture which accompanies the protodeclarative has motor qualities which differ from that which accompanies a demonstrative. See supra, Chapter 8, note 7, p. 124 about ‘motionese’.

14. Perhaps Tallis (2010) wants to focus on this when he says that “pointing permits the discovery of a ‘deperspectivalised’ objective world”. But I am not sure (see also Moore [2010]).

The new meaning that arises as a result of protodeclaratives could now be used as a call without any ambiguity. Its interpretation as a command or petition would have ceased to be available. For other examples that I consider more suitable candidates for primaeval linguistic signs, disambiguation would more likely occur between the ordering of an action to be performed (an interpretation that would have ceased to be available) and the asking for an object (an interpretation which will at last no longer be ambiguous). Clearly, we must not see a noun in there yet. Syntactic categories only take shape in the contrast with each other. A noun is, first and foremost, a sign that requires syntagmatic union with a verb (or with an adjective, etc.). In other words, as long as there were only nouns there still would be no nouns. However, although it is not assimilable, I stress, to our nouns, the meaning that arises after the protodeclarative function became linked to an object and, with it, was freed from ambiguity. This is a consequence that will affect later conative uses, and from which we shall derive other subsequent consequences. However, before this, let us analyse what is
novel in protodeclarative use itself. The meaning of the protodeclarative message would have been unhooked from the conative function, and, at the same time and for the same reason, it would also have been unhooked from conative intonation.

10.4.2 Articulatory-phonetic patterns, intonation, and (inserting a digression in the argument) manual versus vocal origin

The question of the change of intonation is interesting. Let us view it in a wider context. Intonation, with its continuum graduatum (that is, a continuum where all variations are relevant) and its spontaneity at the level both of reception and production, refers us back to the animal cry. In our chapter on Saussurean parity (specifically, in 6.1.2), we presented the contrast between the intonation component, which is natural and immediate, and the articulatory-phonetic component, which is learned by imitation, and consequently, through an act of kinaesthetic interpretation, and is always received in production-format. The intonation component, with its link with the original interactions, is what transmits the communicative force. (Certainly, in our language, intonation can be syntactic as well as emotional. When language becomes syntactic, intonation certainly assumes a function related to syntax. However, at first, when there was no syntax, intonation was only naturally expressive.) Therefore, language always, but even more so at its origins, needs this component. In its pre-syntactic origins, it would be, I insist, entirely indispensable. The original articulatory-phonetic patterns would have needed to insert themselves into the very heart of intonation, and this, in my opinion, was a determining factor in the selection of the auditory channel, or, to be more precise, in the selection of this channel over the equally ‘self-perceptible’ manual movements. Why did the selection have to be restricted to self-perceptible movements? Exact, faithful motor imitation is made easy (or, more accurately, becomes originally possible) when the result of the motor patterns made by oneself can be perceived via the same sensory channel through which the model has been grasped. In addition, the evocation of objects different from the model would be eased.
We have spoken about this already in 9.3. Thus, let us ask once more about the selection of the vocal-auditory channel. Many authors have given various highly plausible reasons – free hands, transmission in darkness – to justify the victory of the auditory sign. Nevertheless, in my opinion, a reason more important than these would lie in the voice allowing the learned articulatory-phonetic meaning and the naturally expressive intonation to be produced simultaneously and within the same stimulus.15 This differentiation and, at the same time, union of the learned component and the natural component seems more difficult to achieve in the manual and visible gestures channel.16 Certainly, I admit that the idea that language evolved from manual gestures would entail a smaller break-up compared to how the different types of movements were organised in nonhuman primates. In chimpanzees, it seems that vocalizations cannot be elicited from the motor cortex (Ploog, 2002).17 By contrast, “an important subset of chimpanzees’ manual gestures are individually learned and flexibly used” (Tomasello, 2008, p. 20). In addition, “in non-human primates, vocal (versus gestual) signals are related to a whole group rather than to a specific recipient”: Meguerditchian & Vauclair (2010, p. 453).18 However, in spite of all of the above, I remain convinced of the originary primacy of the vocal modality. In my view, the key for that primacy would stem from the synthesis between the two elements that give rise to the human voice: on the one side, the communicative force of intonation (inherited from animals and whose reception lies completely outside the production-format) and, on the other side, the articulatory-phonetic imitation. In the same manner, I could add that the often stressed flexibility of chimpanzees’ manual gestures never actually reaches the imitation of new complex motor patterns (or, put otherwise, true motor learning). Therefore, the change that this type of imitation entailed would also be radical in the gestural modality (not only in the vocal modality). We can even say that in the vocal modality that change would have had evolutionary precedents, although they have to be looked for outside primates. Certainly, “chimpanzees use their right hand (and left hemisphere) to communicate with each other.” (Meguerditchian, Vauclair & Hopkins, 2010, p. 40).

15. Certainly it is also known that some facial gestures correlate with intonatory features (Foxton et al. [2010]). However, manual and facial gestures require a divided visual attention in the receiver. By contrast, articulatory-phonetic patterns and intonation converge in a much narrower way.

16. We have already spoken in 9.2.2 about the advantages of articulatory-phonetic symbols over any type of pantomimes, or analogue symbols. Faced with setting in place a stable social model, articulatory-phonetic movements would be in a favourable position, since they are intrinsically arbitrary. However, this alone is not enough to explain their advantage over any other possible type of arbitrary movements. The reason we have just given here explains the advantage of the vocal-auditory channel itself.

17. This statement may deserve closer examination. It has been described that chimpanzees voluntarily produce two novel atypical sounds exclusively in the presence of both out-of-reach food and a human experimenter in order to request food (Hopkins et al. [2007]). These sounds, when produced simultaneously with food-beg gestures, induced a more pronounced right-hand preference than when the gestures were produced alone. Are these sounds also consistent with Ploog’s statement?

18. Perhaps primate screams, unlike bird singing (see supra, 6.2, about birdsinging, its kinaesthetic interpretation and its lack of addressee), can sometimes be related to a specific recipient. Nevertheless, these possible exceptions would not undermine the general rule in the slightest.

(“By contrast, the hand preferences for a non-communicative self-directed action did not correlate with the hand preferences of any category of communicative gestures and did
not reveal group-level manual asymmetries” [ibidem, p. 46]). However, in my view, this datum does not have to support the manual origin of language. Manual communicative gestures of chimpanzees are more “flexibly used” than non-communicative self-directed touching actions. That is why it is logical that (given the asymmetry present in these animals) a particular hand and hemisphere are in charge of such gestures. If in another subsequent species, motor flexibility appears more in vocal modality than in manual modality, then in that species vocal movements will be controlled by the left hemisphere. It is not necessary to assume that the initial communication based on the extreme type of motor flexibility (that is, on the imitation of new complex motor patterns) were manual. Let us summarize. Certainly, I believe that the pointing gesture already involves the basic human ability. However, language, in my opinion, would have only started when the emotional expressivity of old vocalizations was joined by the imitation of new complex motor patterns. But let us finish the digression and let us move on finally to the point that matters to us at this stage. How does intonation operate in our language and how did it operate before the protodeclarative? The simultaneity clearly has not been lost in our current language at all. Clearly, in our linguistic production, the neutral meaning supplied by the articulatory ingredient is added to the communicative force supplied by intonation. The resulting final set may even appear identical at times to the one at the origins: this is what would happen in some conative messages. In the latter, that is, in the continuum graduatum of mandates, petitions, and pleas (or of more or less urgent and imperative calls), we find the same intonation that probably occurred in the initial prelinguistic messages. However, the processes would be different. When only conative intonation existed, the type of intonation did not need to be chosen. 
This choice, I stress, would have arisen as a result of the protodeclarative and of the use of learned articulatory-phonetic patterns. Since then, the two elements of the voice would be recruited independently (almost certainly from different cerebral hemispheres) and only later brought together to be produced jointly. We might add a question here about details. Must we talk about a protodeclarative intonation? Or would it be better to speak of the absence of conative intonation? I would say this depends on what we are focusing on. Is our focus that there would be no naturally expressive modulation for the protodeclarative, or, to put it more exactly, no modulation which phylogenesis would have made sure to connect with a certain emotional state? In that case, we might speak merely of the absence of conative intonation. Or, do we prefer to focus on the two different options (conative or, on the other hand, non-conative intonation modulation) that have been available since the protodeclarative appeared? In that case, it will be useful to talk about protodeclarative intonation. However, it makes no odds what we call it. What interests us, I repeat, is to highlight the novel process set in motion by the protodeclarative. Two elements (the articulatory-phonetic and the intonational) have to be united in language subsequent to
the protodeclarative holophrase, because these two elements have disassociated themselves from one another since the emergence of the protodeclarative use.

10.4.3 Bleaching and the protodeclarative

However, if the meaning can now receive one function as soon as another, one intonation as soon as another, this means there is a neutral meaning. The appearance of this availability and neutrality would mean an absolutely drastic bleaching, a disappearance of the communicative force earlier present in the sign itself.19 This emptying of communicative force (which, it goes without saying, is parallel to the aforementioned referential disambiguation) would have been a crucial event. Sapir (1956 [orig. 1933]) was able to see this idea when he stated that language is language not because of its admirable expressive capability, but in spite of it. Consider that in our language, meaning, far from merely existing to request or call, exists to be denied, for example, or to be asked about something, to be predicated about something or even to be defined metalinguistically. In short, the meaning of full language is an element without its own force, and as a result of this, it is available for any use. In contrast, animal cries are always loaded with their corresponding communicative force, and can never be unhooked from it. Therefore, the emergence of the protodeclarative use, by getting rid of the previous impossibility of choice, was an enormous step forward. Clearly, we cannot say that proper linguistic meaning would have been attained with the protodeclarative. Nevertheless, an absolutely novel type of communicative signal did begin to appear there. In addition, the novelty of protodeclarative intonation would have another beneficial effect. We should begin by realising how difficult it must have been at the beginning, and must also be in children, to reject the immediacy of animal understanding. During those difficult beginnings, intonation, because of its immediacy of reception so typical of animal communication, could be dangerously tempting for recipients as a means of short-circuiting the complex process of reception.
The specific content of the petition could be merely conjectured, and the contribution of the sign learned might consequently be concealed. This was, I repeat, a serious danger, but one which would only arise for intonations that referred directly to those of the animal cries. Therefore, the use of an intonation that, like the protodeclarative, is not found in animal communication would have been crucial (See Singh [2008], cited above, in 9.2.1). The peculiarity of such intonation had not only the immediate usefulness of averting the danger in the message in question, but also the more general usefulness of exercising the complex reception processing.

19. Although it may be taking it beyond its normal limits, I think the concept of bleaching, so beloved of Langacker, might be used here.

10.5 Toward our next question

However, after placing such stress on the peculiarity and novelty of protodeclarative messages, and above all, on their ability to disambiguate between petition and call, or between the ordering of an action to be performed and the asking for an object, we should look at the other side of the coin. The only occasion in which the new meanings would lose their conative force and intonation is in protodeclarative messages which, as we have been saying, are a communicative function subordinate to learning. There would be, thus, no immediately useful communication where the meaning would be separated from its conative force. In addition, and with this we reach the point that interested us, there is still nothing even remotely like a predication. We had left it that the privileged destiny that awaited the meanings of the Holophrastic Era and which differentiated them from animal calls would have to be explained. However, we have still not achieved this at all. We have indicated, indeed, some features that would be characteristic of the linguistic symbol, and especially, the protodeclarative holophrase, contrasted to conative communication in animals. However, we have still not established how those features might be responsible for the appearance of syntax. As I suggested earlier, the decisive element for the genesis of predication would be the perception of the false beliefs of others. We will, therefore, now examine this last issue. Our purpose will be, it goes without saying, to see if any type of perception of false beliefs of others might have to do with those characteristics of the linguistic symbol. However, for the moment we will have, I repeat, to tackle the whole question of false beliefs such as it appears in ‘theory of mind’ studies.

Chapter 11

Toward the original perception of false beliefs of others: The importance of the learned sign

11.1 The perception of false beliefs of others: The debates in the bibliography

If it is possible to talk about theory of mind research having a flagship, the obvious candidate is the perception of false beliefs. Experiments on the ‘perception of false beliefs of others’, or on the ‘memory of one’s own false belief’ were present in most of the work on ‘theory of mind’ in the 80s and 90s. However, I shall not deal with own past belief at this stage, as it will be examined in a chapter somewhat later. What we need to do at this point is analyse the possible access points to the perception of beliefs of others.

11.1.1 The classic experiments and the first attempts at lowering the age of perception

The classic test, or ‘the displaced toy’, was performed as follows. A little play is performed in front of the children. Act One: Maxi hides a toy, such as a marble, inside a vase. Then he leaves the room. Immediately afterwards – this is Act Two –, the mother character comes in and, seeing the marble in the vase, takes it out and puts it in a drawer; having done so, she leaves the room. Later, Maxi returns – this would be a third act which only ever begins. At this point, the researcher asks the child undergoing the test: Where will Maxi look for his marble? While we adults understand that Maxi will necessarily have a false belief about the location of his toy, children, by contrast, until they are about four years of age, reply that Maxi will look for it in the drawer. These children evidently do not successfully perceive Maxi’s false belief. (We can also invoke the Sally-Anne version, instead of that of Maxi.) New studies and experiments soon appeared which tried to lower the age at which children come to possess the abilities related to false beliefs.1

1. One of the earliest studies in that sense is Siegal & Beattie (1991), who suggest that children do not follow an experimenter’s “implicatures” in conversation.

The experiment by Mitchell & Lacohee (1991) was presented in support of innatist theses. (Innatism here refers specifically to the ‘concept of belief’, but even so this short article still makes explicit mention of Fodor’s mentalese.) According to these authors, the competence to perceive
the concept of ‘false belief’ would not actually be absent in children under four years of age; instead, several factors (such as the ‘seduction of reality’) would be masking it. As proof of their thesis, they presented the results of a new design of the experiment. It was known previously that until about the age of four or four and a half, children were unable to remember their initial answer (e.g. their answer to the question about what there would be in a box of sweets with which they were familiar) when this answer had been shown to be false (e.g. when the child had seen that there were erasers, and not sweets, in the box). With this datum available, Mitchell & Lacohee added a new step to this ‘deceptive box’ experiment. Immediately after the initial verbal response (‘Sweets’) the researchers asked the child to put a picture card corresponding to what the child had replied into a mailbox, that is, a picture card with a sweet painted on it. This was followed by the box being opened and the discovery of the unexpected erasers. Then, in order to examine whether the children remembered their past false belief, they were asked what they had said was in the box. With this new design of the experiment, children remembered their own past false belief a year earlier than in the classic test. By addressing Mitchell & Lacohee’s study, we have moved away from the kind of false belief that interested us in this chapter. Their experiment has to do with the memory of own past false belief, not with the perception of false beliefs of others. Nevertheless, if the critique of their hypothesis which we shall put forward here were valid, then it would also be valid, at least to the same degree, within our approach, as a critique of the innatism of the general concept of false belief. We shall continue, therefore, with our analysis of this experiment. Critical voices were soon raised against Mitchell & Lacohee’s interpretation.
Three-year-old children who correctly answered were not remembering any false belief, but were reporting a very real and true fact, namely, the picture card in the mailbox was the one with the sweet on it. What really poses difficulty for the child is not remembering previous facts, but remembering a belief that clashes with reality (and which, therefore, would be useless and counterproductive as an appropriate guide for conduct). Thus, this first attempt to show an earlier competence regarding false beliefs soon fell into disrepute.

More subtle and intriguing was an observation by Clements and Perner (1994). These authors were putting children of three years of age through the false belief test of others. The children’s verbal response to the question ‘Where will Maxi look for the marble?’ was, as was expected, incorrect. The children replied that Maxi would look for it in the drawer, and showed that they were not able to perceive the false belief in which Maxi was immersed. However, alongside this, Clements & Perner discovered something more original. Although they were giving the incorrect verbal response (‘in the drawer’, to continue with the initial example), these three-year olds repeatedly made quick glances at the vase. It appears – or so Clements & Perner interpreted their discovery – as though those children could solve the task at an implicit level, even though they might fail at the level of verbal response.

Nowadays, Perner rejects his former interpretation. A lively debate has today opened up about precisely those points. However, we will not be able to describe that debate if we do not first take a look at other even more recent attempts to lower the age of perception of false beliefs of others.

11.1.2 The most recent attempts to lower the age of perception of false beliefs of others

Onishi and Baillargeon (2005) claim to have shown this perception in children of 15 months. As we would expect, these authors have had to profoundly modify the classic test since children of 15 months do not have the command of language that tests such as the displaced toy test require. What Onishi & Baillargeon have done is as follows. First, they familiarised the babies with an actor who repeatedly placed an object in one of two boxes that are in front of the baby. The actor would then either leave (‘Condition of false belief’), or keep looking (‘Condition of true belief’), and in both cases the object moved by itself toward the other box. The children’s non-verbal ‘response’ consisted of the period of time they continued watching while the actor made his way either to the box where at that time the object really was or the box where he had put it. Experts who research with babies have known for many years that the ‘violation of a prediction’ occurs along with greater looking time. Onishi & Baillargeon recorded, in the condition of false belief, a greater looking time when the actor made his way to the object’s real location than when he made his way to where he had reason to believe falsely that the object was still located; they also recorded that, in the condition of true belief, the children looked longer when the actor made his way to the box where the object was not. These results have been adopted enthusiastically by Leslie, an author of the innatist camp.

Perner & Ruffman (2005) immediately attacked this pro-innatist interpretation. The explanation they put forward is rather different. They suggest that babies of 15 months are indeed surprised when they see the actor in the condition of false belief make his way to the object’s real location, but this surprise does not imply that babies attribute any false belief.
They are simply being guided, in each case, by the last occasion they had seen the actor, object and box together. The most recent associated triplet was the one that guided the child’s predictions in each case, and consequently the violations relating to this triplet were those which brought about the greater looking time. In short, the key to these results for Perner and Ruffman is that in the condition of true belief, but not in the condition of false belief, the child’s last association of the actor with the object and box was with the box that does in fact contain the object. We should also mention another aspect of this same article by Perner & Ruffman. It is the point where, searching for consistency with their critique of Onishi & Baillargeon, they suggest taking the earlier observation published in Clements & Perner (1994) and reinterpreting it in a deflationary way. At that time, the children’s looking had been seen as demonstrating ‘their success in the implicit task of perception
of beliefs of others’. However, according to Perner & Ruffman (2005), such perception would not be present at all. The three-year olds who look as though they were attributing false belief to Maxi would in reality only be responding to the association formed by the most recent triplet. This change of view by Perner is, to my understanding, a hardly surprising result of this most recent phase of the debate.

A response was made to Perner & Ruffman by the pro-innatist Leslie. Leslie (2005) asks why, when the three-year olds were asked, they would not also come to rest on the so-called triplet; he states that the difference is brought about by the verbal presentation of the task. This statement is, in my view, correct, but, contrary to what Leslie claims, it would not necessarily have to be grist to the innatist mill. Let us open this question out a little. According to Leslie, there would be true perception of beliefs of others in children of 15 months. If such perception is not displayed in three-year olds, this would be caused by the verbal format in which the task is presented to these children. Alternatively, we may believe that it is only in the verbal presentation that we actually have a perception of belief task. The babies’ pseudo-success as well as the, also pseudo-successful, looking of the three-year olds do not at all derive from carrying out such a task. This second alternative is strengthened by the specific and detailed explanation that Perner & Ruffman offer of the pseudo-successes. It is my understanding that the explanatory resource of the triplet might tip the scales in their favour.

From what has been said up to now, we may perhaps conclude that there has not really been any reduction in the age of success in the perception of false beliefs of others. If this occurs at too early an age, it turns out not to be genuine perception of false beliefs. If it is genuine perception, then it appears only toward the age of four.
This is, I repeat, what has been found from an analysis of the bibliography. However, even though this might be my conclusion regarding the debate summarised above, I, nevertheless, believe there is a means of accessing false beliefs of others that is much easier than what is considered in the classic experiments. What I shall hypothesise – the point that really interests me – is that the reception of a linguistic message may lead very easily and at a very early age to perceiving the speaker’s false belief. This new means of access is what we need, both for our hypothesis about the origins of predication and also to link the questions of false beliefs and predication with the basic human ability. However, before moving on to look closely at the connection between false beliefs of others and the linguistic messages received, it will be useful to consider another question. I refer to the possible similarities between symbolic play and false belief. This was the question posed by Leslie (1988).
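
The two readings of the looking-time results summarised in this subsection can be made concrete with a minimal sketch. It assumes a deliberately simplified encoding of the Onishi & Baillargeon design: each scene records only which box holds the object and whether the actor is watching. The box labels, the `Scene` type and both function names are hypothetical and introduced purely for illustration. The point of the sketch is that the belief reading (favoured by Leslie) and the triplet reading (Perner & Ruffman) may yield the same predicted destination in both conditions, which is why looking time alone would not decide between them.

```python
from typing import List, NamedTuple, Optional


class Scene(NamedTuple):
    object_box: str      # box holding the object in this scene (hypothetical labels)
    actor_present: bool  # whether the actor is watching in this scene


def expected_box_by_belief(scenes: List[Scene]) -> Optional[str]:
    """Belief reading: the infant tracks where the actor believes the object is."""
    belief = None
    for scene in scenes:
        if scene.actor_present:          # the actor's belief updates only when he sees
            belief = scene.object_box
    return belief


def expected_box_by_triplet(scenes: List[Scene]) -> Optional[str]:
    """Triplet reading: the infant relies on the last actor-object-box association."""
    triplets = [("actor", "object", s.object_box) for s in scenes if s.actor_present]
    return triplets[-1][2] if triplets else None


def looks_longer(scenes: List[Scene], approached_box: str, account) -> bool:
    """A violated prediction is assumed to go along with greater looking time."""
    return account(scenes) != approached_box


# Assumed encoding of the two conditions: the object starts in "green",
# then moves by itself to "yellow", with the actor absent or watching.
FALSE_BELIEF = [Scene("green", True), Scene("yellow", False)]
TRUE_BELIEF = [Scene("green", True), Scene("yellow", True)]
```

Under these assumptions both functions return the same box for each condition, so a call such as `looks_longer(FALSE_BELIEF, "yellow", expected_box_by_triplet)` gives the same verdict as the belief-based version. The sketch says nothing about which reading is correct; it only illustrates why the debate cannot be settled by this pattern of results.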

11.2 False belief and symbolic play. A question to be slotted in

At this stage it would be pointless to insist on the importance of Leslie’s work. Nevertheless, although it may take us down to the level of irrelevant anecdote, I will say that his
work had a great influence on me. At that time I was occupied with the issue of predication and with what at that time I used to call ‘meaning according to the hearer’ (and which, later, after connecting with the Theory of Mind, I moved on to call ‘false or insufficient belief of somebody else’). On this I had written my doctoral thesis (Bejarano [1985]), which I had just presented, and was still focussing on it. Reading Leslie (1988) forced me to broaden my horizons. I understood that mental states different to one’s own current reality go beyond the contents that are attributed to another person. This is what excited me in that article. The least of it was that two authors such as Frege and Piaget whom I liked so much came together there. But enough of my autobiography.

That work by Leslie highlighted that false beliefs, of others as well as own past beliefs, shared a key characteristic with the contents of symbolic play.2 The disconnection from reality had always been connected to false beliefs. But this disconnection, Leslie warns, is no less noticeable in symbolic play. The ultimate conclusion he was pursuing was the same as in what he wrote in 2005. The concept of false belief would form part of the mind’s innate baggage. This is why the disconnection from reality is observed at a much earlier age than that at which children begin to show success in the classic experiments on perception of false belief. According to Leslie, symbolic play would be an early manifestation of the same ability that will later allow success in the Maxi test.

Innatisms apart, the point that the disconnection from reality occurs in symbolic play is indisputable. The contents evoked are different to the reality the child is observing. The horse or the imaginary food are not in the real environment, and children know this very well. They never really eat the handful of earth. And if its mother starts looking for the brush, the child realises that it will soon lose its horse.
What relationship is there between this and what occurs in the perception of false belief? Everyone agrees that the two disconnections from reality are similar, on the one hand, and different, on the other. What exactly is the difference, however? (Here I shall take into account only false beliefs of others. In a later chapter, we shall widen this question to own past belief). We can formulate the question as follows: the contents of an evocation do not appear false to us, yet they are disconnected from reality. How is this? What is the bonus that has to be added to the disconnection if we want to obtain the perception that a content is false? Evoked contents are not stated. In contrast, the contents of the false belief that has been perceived in another person, even though they are not stated by the subject who has perceived the false belief, have been stated by the other person. Contents that are disconnected from reality turn out to be false only when it is realised that someone believes them to be true.

2. But this idea long predates Leslie. See, for example, Ryle (1949, p. 245) on pretence and oratio obliqua.

As we can see, this fits very well with the origin that we began to attribute to false beliefs of others at the end of the previous subsection. The cause of the primary perception of these false beliefs would be, we said, the reception of a linguistic message. The declaration of the content of the belief would be expressed, more or less explicitly, in this linguistic message. Of course, false beliefs of others will be perceived later without linguistic mediation. This is what occurs in Maxi’s test, as well as in all the highly complex inferences that we adults make at every step in our social lives. However, the place where the concept of falseness originates would be, according to my proposal, the reception of a linguistic message. It is here, facing this message, that it would be easiest to notice the clash between belief and reality (or ‘reality just as the subject sees it’, as we would have to say if we move from the subject’s to the theorist’s point of view). It is, in short, after the reception of a false linguistic message that the ‘double mental line’ would become a double archive on the world. From then on, there may be two different and contradictory contents about any area of reality – on the one hand, reality itself, and on the other, the belief that other people have about it. Nothing in this contradiction, in contrast, would apply to the fictitious contents of evocations. The content evoked by symbolic play is not asserted about anything, it is not located in any spatial-temporal point of reality.3 Why do contents of the false belief remain perfectly located (i.e. spatially and temporally fixed) in memory? Because in the centre of the falseness there is a point of truth which the subject is interested in keeping. This point is the fact that the contents are believed by a specific person. 
If it were not for this truth floating on false contents, there would be no usefulness in the costly duplication of files in the ‘marble’ and the ‘marble according to Maxi’ duality.

3. As you will see, this comparison (between evocation and false belief) takes an opposite direction to the one set out in a previous chapter (8.9.2) between evocation and expectation. While the perception of false beliefs would be a superior and later ability than evocation, expectation, on the other hand, would be possessed by animals and would precede evocation.

11.3 The origins of the concept of ‘truth’: Inserting an even more peripheral point than the previous one

Once the perception of false beliefs of others had been attained, arguments of others and the dialogues which followed could gradually give rise to another achievement. The subject might see her current own belief eroded, and thus become discontent with the beliefs of others as well as her own beliefs about the matter. That would be the scenario where the concept of ‘truth’ would begin to appear. The concept of ‘truth’ could only have originated in order to create a contrast with uncertain contents. Can there be uncertain perceptions? Although this point really belongs in subsequent chapters, I will jump ahead slightly and put forward my opinion
that an animal’s current perception cannot ever be uncertain. Animals accommodate hesitations or doubts only in regards to their immediate future perception. They may accommodate ambivalent expectations about what they will find in a specific hole, for example, but they are very clear about what they are seeing at that very moment. My point is that it is only when we add the reciprocity of a dialogue to the concept of false belief that we would have reached the concept of ‘truth’. This concept is necessarily separate from prelinguistic perception. Many authors have maintained the opposite, namely, that the dangers for truth would have begun with the introduction of language, with the possibility of pointing to contents having absolutely no perceptive basis. For a recent version of this old argument, see Premack & Premack (2003, p. 133). However, I suggest that it is only as a result of the fact that linguistic communication offers various contradictory contents about a single topic that the best human cognitive characteristics could come into being. In this way, the exclusively human desire for truth would have appeared, and also the idea of an impartial and higher vision able to judge the contents of different minds. But it is now time to return to our main thread.

11.4 The most easily-perceived false beliefs of others: ‘Second-person’ and received through language

If we are looking for a perception of false beliefs of others that may have brought about the origin of syntax, that perception would have to have been of the simple type suggested above. It is unlikely that access of the type found in the classic tests could be operating at the origins. Remember that care was taken in the design of the experiment to provide the greatest opportunity for the perception of beliefs. The three acts of the little play thus follow each other very quickly, in order to prevent what occurred in the previous acts being forgotten. Similarly, when, at the end of the first act, the protagonist leaves the stage, a door closes strikingly, or, if not, the researcher mentions the fact: ‘Maxi is going, he has gone out to the street’. It is implausible that all these circumstances that ease processing would have been present at the historical origins.

It is true, indeed, that much of the difficulty in Maxi’s test has to do with the fact that the scenes in the test are a piece of theatre, which is outside the life of the child. This is all very true, I repeat. Nevertheless, the non-verbal perception of beliefs of others, as integrated into real life as it may be, would probably be an extremely demanding task. We must remember that the four-year olds who fail these tests have had training in full human language. Without this training and without those circumstances that make the experiment easier, could humans from the Holophrastic Era have accessed this type of perception of beliefs of others? If there were no other option, we would, obviously, have to accept that it was so. But there may be an alternative possibility, which we shall see below.

What now becomes more pressing is to look at the features that make the Maxi test so difficult. In this test, the beliefs that I have to perceive belong to another person who
(point one) is not interacting with me and who (point two) is not even situated in my own sphere of conduct, but in the world of theatrical performance. The first of those points, that is, the greater difficulty in perceiving ‘third-person’ beliefs of others compared to perceiving ‘second-person’ beliefs of others, is addressed by Frith & de Vignemont (2005) or Reddy (2008). See also, more importantly, Southgate, Chevallier and Csibra (2010). More specifically, let us analyse what it is that the Maxi test (or, in other words, the ‘third-person’ inferential route toward false beliefs of others) demands even in the simplest of situations. Once visual perceptions have and have not (in Act One and in Act Two, respectively) been attributed to Maxi, this information will have to be stored and then brought out to intervene in ‘Act Three’. Then, in this third act, an interpretation of the visual field will have to be attributed to him. This interpretation – this is the key point – will have to be based on what was previously attributed to him. In other words, in this third act, the simulation of the mental states of Maxi has to be sequential. It is not enough to imagine his visual field oneself. His mental state will have to be derived from his past. As you can see, there is a very close parallel between this mode of perception of false beliefs of others and the latent imitation of new motor sequences (see above, in 8.5 and 8.7). Just as imitating new complex motor patterns is much more difficult than imitating a simple movement, so too attributing a sequence of mental states is probably more difficult than detecting a specific, non-sequential state. See also 4.9.2, where we said that only an interiority different from one’s own, that is, only a second mental centre in one’s own mind, has his own accumulative sequential line, a line where the current moment is the heir of the previous moment. 
Chimpanzees would be able to ascribe visual perception, or lack of visual perception of a fact to a conspecific. However, they are almost certainly incapable of a sequential and accumulative process of ascribing (namely, ascribing knowledge of a fact to the conspecific, ascribing unawareness of another fact, and drawing out the consequences of these two ascriptions). In short, chimpanzees’ ability to ascribe to conspecifics would be analogous to their ability to imitate a simple movement. In both of these fields, chimpanzees are absolutely incapable of sustaining the respective sequential process. They have no latent imitation of complex motor patterns, nor therefore do they have motor learning. Analogously, their ability to attribute visual perception or lack of visual perception does not enable them to infer false beliefs of others. Sequential interpretation fails chimpanzees in this area too. Humans, in contrast, are clearly capable of sequential simulation. It is, however, also the case in humans that the nonsequential (second-person) ascription of a false belief is easier and earlier than the sequential ascription we have detected in the Maxi test. However, the difficulty of perception as in the Maxi test is neither the only nor the most important reason to seek an alternative. I would stress that what we need for our hypothesis about the origins of syntax, much more than ease of perception, is a perception of beliefs of others that can link with and lead to predicative communication. We must therefore seek beliefs of others that come in a linguistic format. This belief is
what would provide the best platform for the appearance of a predicative message that sought to complete, correct and update that belief.

11.5 What types of linguistic messages are able to reveal their producer’s false beliefs to hearers?

11.5.1 Weighing up the candidates

Receiving beliefs of others in linguistic format is certainly a very simple task. You tell me something that I believe to be false, and I immediately perceive a false belief of yours. This occurs every other minute. Nevertheless, such a simple solution is ruled out for us at the origins of language. Remember that we are looking for the origin of the communicative function of predication. We cannot therefore situate a predication prior to the perception of false beliefs of others that would presumably have given rise to the first predication.

Another common way of receiving beliefs of others linguistically, in this case not a false belief but an incomplete one, and one that is therefore in need of completion, is through questions. I receive a question from someone and, immediately understanding my interlocutor’s insufficient level of knowledge, I produce a predication of response, also immediately. This occurs, I repeat, every other minute. However, once more, this cannot be extrapolated to the origins of language. Questions, as will be qualified and seen in more detail in a later chapter, need syntax, or, at least, some meaning shaped by syntax, and are derived from predication. For this reason, we are forced to rule out questions also.

11.5.2 On historical origins and added constrictions: A methodological reflection

It is worth pausing a moment at this point to weigh up what we are doing. The impossibility that we have pointed out (namely, the impossibility that the reception of a predication could be the origin of the linguistic reception of false beliefs of others) only occurs at the level of the historical origin. This vicious circle disappears in ontogeny. Nothing prevents the child receiving adult predications from the very outset. Can we say then that the search for the historical origin would be causing a wrong approach to the problem? Are we perhaps letting ourselves get lost because of some foolish fondness for the Palaeolithic darkness? I do not think this is the case. Note that the historical origin, far from advising an excessively simple solution, does exactly the opposite; that is, it adds new difficulties, without failing to respect any of the previous constrictions. Consequently, the historical approach could only ever lead us to the impossibility of finding the solution. Therefore, if, in spite of everything,
we do manage to find a solution that also works for historical origins, this additional scope should give us no reason to mistrust it. We move on.

11.5.3 Conative messages and what their producer believes

We have ruled out predications and questions. What now, then? At the origins we do indeed have conative messages. But this would not appear to solve our problem. As they say in logic, orders or calls lack truth value. How then could they transmit any belief? I think the famous distinction between what is said and what is implied would be useful here. When I call someone, for example, when I say ‘John!’, I am not stating any specific belief I hold about John. However, I would be implying my belief that John is near, within the sound of my voice, let us say. In addition, when I make the request ‘Water!’, I am implying my belief that there is some amount of water available (Alston [1964]). That implication would reach a well-informed recipient, and would be conceived by her as a false belief of the speaker. The thoughts implied by the message (John conceived of as being nearby, or a specific material conceived of as still not completely consumed) would, I stress, be understood by the recipient of the linguistic messages, but not as her own thoughts.

This, perhaps, would be the simplest way of perceiving false beliefs of others. It is not only that these beliefs fulfil the characteristics mentioned previously – namely, being ‘from a second-person’, and being understood when a linguistic message is received. In addition, the linguistic message in question would also succeed in being integrated into a particularly primitive type of linguistic communication. Neither predication nor syntax would be involved.
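
The route from a conative message to a perceived false belief described in this subsection can be illustrated with a minimal sketch. It rests on strong simplifying assumptions: each type of conative message is taken to imply exactly one belief about its referent (a call implies that the referent is nearby, a request implies that the referent is available), and the recipient’s knowledge is a plain lookup table. The names `IMPLIED_PROPERTY`, `implied_belief` and `perceived_false_belief` are hypothetical and serve only this illustration.

```python
from typing import Dict, Optional, Tuple

Proposition = Tuple[str, str]  # (referent, property), e.g. ("John", "nearby")

# Assumption: one implied belief per type of conative message.
IMPLIED_PROPERTY: Dict[str, str] = {
    "call": "nearby",        # 'John!' implies that John is within earshot
    "request": "available",  # 'Water!' implies that some water is left
}


def implied_belief(kind: str, referent: str) -> Proposition:
    """Return what the speaker implies, without stating it, by the message."""
    return (referent, IMPLIED_PROPERTY[kind])


def perceived_false_belief(
    kind: str, referent: str, recipient_knowledge: Dict[Proposition, bool]
) -> Optional[Proposition]:
    """Return the speaker's implied belief if the recipient knows it to be false.

    A recipient who lacks the relevant information, or who shares the
    implied belief, perceives no false belief and gets None.
    """
    proposition = implied_belief(kind, referent)
    if recipient_knowledge.get(proposition) is False:
        return proposition
    return None


# A well-informed recipient: John has gone far away, water is still available.
knowledge = {("John", "nearby"): False, ("water", "available"): True}
```

With this table, `perceived_false_belief("call", "John", knowledge)` returns `("John", "nearby")`, while the request for water returns `None`. Note that the check needs a meaning that fits tightly to one referent; no predication or syntax appears anywhere in the message itself.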

11.6 Developing the importance of protodeclaratives: The bridge between imitation of complex new motor patterns and predicative communication

Clearly, the level of complexity required by that linguistic message would be minimal. There is one requirement that would be essential for it, however. The meaning used in this conative message would already have to be enjoying the disambiguation that would have come in with protodeclaratives. Only if the meaning fits tightly to an object, I stress, will conative messages be able to transmit their producer’s false belief.

Let us compare these conative messages that are able to reveal the speaker’s false belief to the hearer with the animal conative messages to which we referred in the previous chapter. Think of the cry of a young ape or monkey calling its mother. It is known that the members of the clan correctly understand the cry of their young: they look toward
only the mother in question, not toward other females.4 However, as adequate as it may be, this understanding does not differentiate between call and order or request. Given that this cry, since it has not been learned by imitation, has not needed protodeclarative uses, it would contain no possible difference between ‘call to mother’ and ‘request for maternal care’. Therefore, if it so happened that some individual in this clan of chimpanzees had seen the mother of the young chimpanzee fall, for example, in a ravine some distance away, that individual would not for this reason have to perceive the young chimpanzee’s false belief. The alternative ‘request for maternal care’ is available and does not require the perception of false beliefs of others at all. In contrast with this, those conative messages that, although still pre-syntactic, have been learned with the aid of protodeclaratives, would act as a means of accessing the perception of false beliefs of others. This access satisfies the requirements we were seeking. In addition to entailing much less difficulty than the classic experiments, it presents beliefs of others in linguistic format. If, as I put forward earlier, we are proposing in this book that the genesis of syntax would depend on the perception of beliefs of others, then we are ascribing immense importance to the protodeclarative. It would therefore be useful to address the protodeclarative at this point. Although we have already discussed this question, this is a good moment for it to be redeveloped.

4. This observation (others looking at the mother of a crying infant) was made with captive vervet monkeys (Cheney & Seyfarth [1980]).

chapter 12

Between motor learning and the perception of beliefs of others: The crucial role of the protodeclarative

In the previous chapter, we reached the ‘protodeclarative – disambiguation – perception of beliefs of others’ triad. Later, we will attempt to show that the (false or incomplete) belief of the hearer coincides with the theme of predication. However, before adding this fourth member to our causal sequence, we shall rethink its first term. What exactly does the protodeclarative consist of?

As we saw above, in 10.4, the protodeclarative has no other function than to aid linguistic learning. Put another way, the protodeclarative would be inconceivable for communication signals not produced through social learning. Thus, all the effects we have already attributed to the protodeclarative and which we intend to continue attributing to it, the whole extremely important package of derivations, would have been unleashed by the learned nature of the signs.

But that approach is too schematic. What are we referring to when we talk about the learning favoured by the protodeclarative? Two aspects are clearly included in this: first, the motor imitation of articulatory-phonetic patterns, and, second, the learning of the connection between that articulatory-phonetic pattern and its stipulated real-world counterpart. With the protodeclarative, the learner exercises one aspect as much as the other.

12.1 From learned signs to the protodeclarative

The basic capacity, that is, the second or simulatory mental centre, manages, through what we called its big extension, to sustain the ‘motor adaptation to a model’, or imitative learning of motor sequences. The possibly most accomplished type of this adaptation would occur in the imitative learning of articulatory-phonetic sequences that are new to the imitator. Having been thus learned, the articulatory-phonetic patterns are what would connect with the protodeclarative.

But we should give voice here to an objection. Why such insistence on motor learning in relation to the protodeclarative? Would it not be possible to think that the learning achieved by the protodeclarative is principally about the connection between the arbitrary signifier and its signified? In other words, the possible objection is the
following: However much, once it had taken root, the protodeclarative use may collaborate in motor learning, motor learning alone would never have given rise to protodeclarative use. How shall I respond to this? I suspect, on the one hand, that learning about an arbitrary connection is not at all a difficult task. Think of animal conditioning. As will be clear from previous chapters, I sharply differentiate not just words in full language, but also symbols of any type, from conditioned association. However, I believe we need to heed the lesson it has to teach us: we must not consider arbitrariness to be the key to the great demands of the symbol. See Houston-Price et al. (2005, p. 175): “Infants learn new word-referent associations with ease, and conditions that allow such learning are less restricted that was previously believed”. On the other hand, in contrast, I have already made repeated mention of my high evaluation of the ‘motor mimic’. But enough of suspicions; let us investigate if the difference between my opinion and the objector’s really would create disparity in our final conclusions. I think that it would not. If motor vocal patterns are learned, then they will not be innate.1 From this it follows that their communicative impact will not have been designed by evolution. Consequently, their meaning will have to be learned, and there will be a use, therefore, for the protodeclarative. Motor learning will, therefore, necessarily be found in the causal basis of the latter, without it mattering if the cause is more or less immediate. In summary, if we accept vocal learning, linguistic protodeclarative would be the necessary derivation.

1. However, we must admit that the relationship between learning and innateness is not so simple. Articulatory control in humans is based on a finite (innate) set of speech sounds that can be produced by a human vocal tract. All human languages are based on the same set of vocal units, although they only use a subset, which is determined by learning.

12.2 Vocal Saussurean parity and motor imitation of complex and new patterns

But why would we have to accept that in the historical beginnings of our language or, more precisely, in these beginnings where the vocal protodeclarative appeared, there was true motor learning? Let’s recall that, according to my hypothesis, the pointing gesture is a type of communication which, despite deriving from the second mental centre, does not imply imitation of complex motor patterns. Could something of the sort have happened in the first linguistic signs? We are going to try to answer this question. A linguistic sign, we said, implies Saussurean parity. Both recipient and producer perfectly differentiate the inverse acts of requesting and receiving the request, and nevertheless, at the same time, its meanings are also identical for both interlocutors. The same thing occurs with calling and receiving the call, or, analogously, with signalling an object and receiving the sign about this object. Of course, this parity occurs in the gestural element of the conative or the protodeclarative – that is, in the looking or pointing gesture that always accompanies the protodeclarative linguistic message, and normally also the conative. But it also has to occur in the linguistic element. Saussurean parity was addressed in Chapter 6 and again in 9.1.3. We saw that it involves something much more complex than mere neutral identity. The interlocutor’s meaning has to be seen as both identical and opposite to one’s own. I put forward this complexity in relation to ‘deictics which cannot be repeated as an echo’. However, from the outset we insisted that this case was only the tip of the iceberg. In any vocative, the speaker calls, and the addressee receives the call. This sameness and difference are perceived not only by the theorist but also by interlocutors themselves. In addition, Saussurean parity would be prior to language. As we have already stated on many occasions, the pointing gesture is understood by the recipient in its motor (i.e. productive) format without this being an obstacle for the recipient to perceive it as being addressed to him.

We should remember also the mechanism, according to our hypothesis, underlying Saussurean parity. Essential to it is the second, or simulatory, mental centre. Only when the signal received is placed in this second centre can the signal be processed in production format and still be understood as a received signal. The second mental centre would have appeared primarily with and for pointing gestures: The gesture received meets the two conditions required – the condition of being kinaesthetically interpreted, that is, in production format, and the condition of being addressed to the interpreter. But let us move on to vocal communicative signs.
Even once the second mental centre has been established, neither the intervention of the second mental centre nor Saussurean parity will automatically ensue for these signs. A sign from the vocal-auditory channel might follow the phylogenetically ancient mechanism, and dispense with ‘motor reception’ (or ‘reception in production-format’). In human beings, intonation would operate in precisely this way (we have seen this already in 6.1.2). Kinaesthetic interpretation would be absent for intonational modulations and, therefore, the involvement of the second centre and Saussurean parity would also be absent. However, beyond mentioning the example of intonation, we must see why this type of direct emotional impact could remain even after the second mental centre is set up. In other words, we have to ask ourselves: when might the reception by a human being of a vocal communication not involve the second mental centre? I would reply that it would be when kinaesthetic interpretation (or, more generally, reception in production-format) is absent. This absence would involve a relatively accessible possibility, because kinaesthetic interpretation (this is where I wanted to get to) is much simpler in the bodily-visual channel than in the vocal-auditory channel. Let us remember what was hypothesised in previous chapters. Kinaesthetic interpretation of movements of others, although certainly only at the level of a posteriori expectations, already appears in non-human primates. But we should remember that, in this group, this ability does not operate on all movements of others. Chimpanzees successfully match their view of the body of others to the internal sensation in their own bodies. In contrast, as we said already in Chapter 6, it is probable that vocal movements would find no kinaesthetic reception at all in non-human primates. In the case of cries, with their evolutionarily ancient communicative use, the step from direct emotional impact to kinaesthetic interpretation would have been much more difficult. Indeed, that step required a neurological change (which would be parallel to the neurological change in vocal production – Ploog [2002]). Besides, according to my general hypothesis, it would have required the second mental centre. Now, my additional hypothesis is that such kinaesthetic interpretation would not have been required in auditory-vocal types of communication until imitatively-learned articulatory-phonetic patterns were reached. As a result, whether or not there is Saussurean parity for the auditory-vocal sign would thus depend on what the motor patterns involved were like, and would no longer depend on whether the recipient is in fact the addressee, or, contrastingly, simply an accidental recipient. Thus, the motor sequences learnt by imitation would be, not only sufficient, but also necessary for the Saussurean auditory-vocal parity. (As you can see, this is a stronger proposal than what was suggested in 9.1.3, where we did not go any further than proposing that complex imitation is enough. Now, we are focusing mostly on the increase of difficulty that going from the visual-corporal channel to the auditory-vocal channel implies for the kinaesthetic interpretation.) In short, if Saussurean parity is present there, it is because the articulatory-phonetic patterns of language involve the imitation of complex motor patterns that were at some point new to the subject.
Are there any arguments to support the connection that we have just suggested between articulatory-phonetic imitation and vocal Saussurean parity? I think we can offer something useful here, as small and indirect a pointer as it may be. Our suggestion said that it is only those vocal communicative signs that depend on motor learning, only those privileged vocal signs, which reach Saussurean parity. Now, without moving away from articulatory-phonetic signs, we shall show a perfect negative of those privileged signs. If there really is a relationship of cause and effect between the imitation of complex articulatory-phonetic patterns and vocal Saussurean parity, then the absence of one of those features in a type of signs must be accompanied by the absence of the other. This joint absence of both features is what we will find in a very specific type of articulatory-phonetic signs.

12.3 From ‘choral holophrase’ to the Saussurean vocal sign

As a first step, let us point out that children produce their first linguistic signs in chorus with adults. The adult has pushed them with phrases such as ‘Look, there’s nanny. Let’s call nanny. Nanny, nanny!’ or ‘Look, dad is about to leave. Bye-bye, bye-bye!’. And the child joins in these social rituals.2 We might call them ‘choral holophrases’. (I have just invented this word. But the relevant distinction, namely, the one that occurs between actions where an individual addresses or does something to another individual, and actions where two individuals each do the same action but without relating one to the other, is a distinction much used by researchers of children’s understanding. See, to cite only recent works, Premack & Premack (2003), or Gleitman (1990). The second type of action, which is sometimes called ‘parallel play’, would perfectly include ‘choral holophrase’). These specific holophrases are undoubtedly a very important element of linguistic learning. It cannot be chance that children get so much enjoyment from these situations. Why has evolution designed the conduct the child is practising here to be so enjoyable? Undoubtedly, because practising that conduct is useful for the child’s learning. Furthermore, the child’s evident pleasure ensures the frequency of this type of invitation (or teaching conduct) on the part of the adult.

At this point we should remember that the child at this age is just coming out of the babbling stage. Their articulatory-phonetic motor patterns are just then beginning their journey of adaptation to the model. As a result, our hypothesis might predict that around this age the reception of adult model words would not yet imply a full and complete kinaesthetic interpretation. And, again according to our hypothesis, that as yet non-motor (i.e., as yet non-Libermanian) reception would prevent true Saussurean parity. These predictions from our hypothesis fit very well with choral holophrases. Saussurean parity is still not required. The latter relies, we said, on the recipient recognising the request received as being identical to her own production of that request, and, nevertheless, continuing to understand that she is now the one receiving it, in spite of this.
As we can see, this complex reception is not at all necessary for choral holophrase. In choral holophrase, both parties, the adult and the child, are doing just the same thing: both are calling nanny. Let us explore the point that the choral holophrase sign would not have been shaped by motor imitation. In this sign, the model’s only influence has been to select the most similar pattern from the child’s previous motor patterns. In 6.3.2, we saw that Marler’s description of how sparrows learn dialects would also cover how the child acts in the babbling stage. (Babbling in speech and ‘subsong’ in birds appear to be necessary for adequate vocal learning: Catchpole and Slater [2008]). At that age, the child still cannot manage to accommodate its vocal patterns to the model. Its ‘mama’, ‘dada’, ‘baba’, ‘tata’ are spontaneous and not learned patterns. The only thing that the child succeeds in doing at this time is to select from its own spontaneous patterns the one it prefers at each step, i.e., the one it sees as the most similar to the model. In that child there still would not be, I repeat, true motor adaptation to the model. (This may be similar to what de Boer & Zuidema [2010] call superficial combinatorial structure: “We define superficial combinatorial structure – versus productive combinatorial system – as combinatorial structure that can be observed by an outside observer in a system of signals, but that is not actively used by the agents using the signals”). By the same token, the acquisition by the child of the adult sign is still not a sequential latent imitation. If we apply our hypotheses to this, it seems to us – as we already said – that the reception of vocal signs during that stage will not flow through the second mental centre, and it also appears, therefore, that at that age those signs still cannot be used Saussureanly. In other words, we would have understood why, as far as vocal communication is concerned, the first thing the child produces are choral holophrases.3 Any other use, from the protodeclarative and conative of the holophrastic period to subsequent uses linked to syntax, in general all of these, would involve Saussurean parity.

But we have to recognise that this proves nothing. The same would occur with the argument that the kinaesthetic interpretation of vocal signs lacks roots in nonhuman primates. All this is persuasive only within the general framework, a general framework that (and this is the bad news) has not been found but constructed. There are reasons to be very humble. And by humility, I do not mean, of course, the humility of the legendary falsificationism, but something much more serious. But we have already discussed such reasons above on several occasions. Here, now, in this chapter, we are clear, firstly, that each specific hypothesis must be viewed within the general network, and, secondly, that what we are doing in relation to the general network is precisely to wager on its success. After all these healthy warnings, we shall pick up our thread again.

2. Greenfield & Smith (1976) called ‘bye-bye’ and ‘ta’ “pure performatives”. But from the point of view that we have now taken, there would be a difference between those two signs. ‘Bye-bye’ can be used as a choral holophrase. On the other hand, in ‘ta’ (that is said to mark the transfer of objects from hand to hand, involving different objects, persons and settings), there is an intrinsic reciprocity. ‘Ta’ belongs to “the class of signs that express fundamental ‘object relations’ of incorporation into, and ejection from, the personal” (Ninio [1999]).

3. Was there also a period of choral holophrases on the evolutionary or historical planes? It could be thought that, at least in the tasks that we have called four-handed, some shouts to beat the rhythm would have been useful.

12.4 Why did vocal signs based on complex motor imitation come into being? A question which can no longer be put off

When dealing with children’s choral holophrases, we have seen the difference between these vocal patterns, which both in their reception and production would be gestaltic, unanalyzed units, and the articulatory-phonetic patterns of language. This leads us to the question of why sequentially analysable articulatory-phonetic patterns began to be used in primaeval times. The question has long been answered. As Bickerton (2009, p. 230) says: “The more words you have, the harder it becomes to distinguish one from another”. There are certainly indivisible, gestaltic, patterns of sound easily distinguishable one from another. However, the number of those indivisible, gestaltic, vocal patterns cannot exceed a limit without risk of confusion.4 For that reason, when the number of required vocal signs exceeded that limit, the signs had inevitably to change their configuration and began to be acquired by sequential, latent imitation.5 The increasing number of required symbols would have caused an inflection point.

Why did that number increase? To answer this question, let us remember my proposal at the beginning of Chapter 10. The first communicative messages would have been used for conative purposes. I also suggested then that the conative function can often be complied with without using symbols. If we accept these suggestions, what other possibilities are there? I think that, once both the human basic ability (mainly four-hand actions) and what I have called its big extension were achieved – only after this achievement, I stress –, the technical resources would have also increased. In short, there would be an ever increasing number of tools and varied techniques. Consequently, the requests during the implementation of tasks would have become more and more differentiated. Hammer/hammering, rope/tying, torch/setting fire, recipient for liquids/bringing water, branch/propping up... Once the gestaltic vocal patterns were not enough, vocal utterances had to incorporate the – at that moment only incipient – power to imitate sequential vocal patterns. To avoid confusion, that imitation had to be exact and faithful. Since it was important that children should be able to recognize and produce those vocal patterns at an early stage, the use of protodeclaratives would appear almost immediately. This really is a ‘just-so story’. Is it falsifiable? Perhaps the genome of Neanderthals might throw light on the following question: What emerged first, the ability for precise, sequential, articulatory-phonetic imitation or the ability for technical, hybrid imitation? But maybe this is only a remote possibility.
However optimistic we may be, in no way can it be said with any certainty that the hypothesis is a falsifiable one. Consequently, the only merit it can really claim is to fit coherently with the rest of the hypotheses I have presented here.

4. These gestaltic units would be followed by articulatory-phonetic sequential imitation. But there would be a third stage, where each word becomes an abstract unit. According to my hypothesis (see supra, 9.5), this occurs in the productive format of adult reception and also in inner speech.

5. Certainly a different and more difficult question can be focussed on: how the transition from holistic to combinatorial repertoires of speech sounds could have taken place. “This is an evolutionary deadlock since fitness in language evolution is typically frequency-dependent; that is, since the first agent that has a mutation does not benefit from this in a population of agents” (de Boer & Zuidema [2010]). These authors present a solution: “When a repertoire of holistic signals is optimized for distinctiveness in a population of agents, it converges to a situation in which the signals can be analyzed as combinatorial, even though the agents are not aware of this structure” (see above 12.3). It is in this situation that the ability to use a productive combinatorial system would be advantageous.

12.5 From motor imitation to the consequences of the protodeclarative: What questions arise?

In Chapter 10, it was suggested that the original disambiguation, or, we might say, the original strictly referential connection of a sign, would have come from the hand of the protodeclarative. In Chapter 11, we also suggested that after this disambiguation it became easy to access the perception of false beliefs of others, and thus, as we shall defend further on, the origins of syntax. In short, we thus came to the triad formed by ‘protodeclarative, disambiguation, perception of beliefs of others’. What more can we add now? We can now give a more adequate description of the first term of the triad, or trigger term. The linguistic protodeclarative, it had always been clear, is absurd and inconceivable outside the process of learning and teaching signs. The clarification we are now suggesting is that full ‘motor adaptation to the model’ would be a crucial element of this learning. The auditory-vocal Saussurean parity, which makes non-choral uses possible, would not have arisen, if there were no learning of complex new articulatory-phonetic patterns. We would be left, thus, with the following derivation from the basic human capacity. From the latter, i.e. from the ‘third mode of processing eyes of others’, we come (via, as we already know, the ‘big extension’) to the imitative learning of complex motor patterns. This learning, I insist, would lead, in turn, to Saussurean vocal parity, which is present in any non-choral use, including the protodeclarative. What we now have, therefore, is the ‘complex motor imitation, verbal protodeclarative, disambiguation, perception of beliefs of others’ derivational sequence. Two comments are necessary in relation to this. On the one hand, we must ask if we might not be placing excessive emphasis on the motor component.
On the other hand, we should see how this affects the so-called ‘theory of mind’ or ‘understanding of (one’s own and of an other’s) mind’. Clearly, these two questions must be examined separately. Nevertheless, the close relationship between them is immediately obvious. The greater the distance we put between motor elements and mental elements, the less unified the field of ‘theory of mind’ appears, and, inversely, the more central we consider the motor level to be, the better we will be able to describe as a unitary development the ability which has been called ‘theory of mind’. Which direction will my preferences take me? The reader, right from the Introduction, knows where I am going. I have already stated my core wager explicitly on several occasions. The human abilities normally labelled as being more mental could derive from what we call the ‘basic human ability’. It is now a question of specifying this idea faced with an important milestone in the sequence of derivations.

12.6 Excessive emphasis on the motor component? Making explicit the anthropological approach underlying the hypothesis

Let us assume that our idea was true, and that the motor patterns shaped by imitation are indeed at the base of the derivational sequence running from vocal Saussurean parity and the linguistic protodeclarative to the crucial perception of false beliefs of others. In that case, it would have to be concluded that these patterns and this motor learning are supporting the language edifice. At first glance, this emphasis on motor learning may be shocking. Would this not be a return to pre-cognitivist behaviourism? Would we not just be forgetting the mind again? I believe such accusations would be totally unfair. Let us begin by pointing out that imitative motor learning, as soon as it is considered in its true breadth, ends up embracing all of culture (and the ‘cultural ratchet effect’6). On the one hand, as we have already suggested, this learning supports language, and therefore, all the acquisitions that depend on language. On the other hand, human techniques of any kind always involve ‘motor adaptation to a model’. This is the case whether we are talking about dancing, building tools or sewing. In this case, motor patterns are not only under the control of the environment but also and primarily under the control of the model. However, pointing this out would be to highlight the attention of our hypothesis to culture, and not to the mind. Nevertheless, the content of those possible accusations was the concealment of the mind. As a result, we have to focus on showing that it is the human mind itself that is being addressed in our suggestions.

I shall put forward the two convictions that lie behind my hypothesis. Firstly, it is as a result of the movement observed in his body that the interiority of a person can be perceived in the first place (see above, 4.9.2). It is to precisely this perception through movement that the abilities of primates (macaques’ mirror neurons as well as chimpanzees’ visual attributions) would be acting as a prologue. Secondly, there is only a properly human mind when the mind of others is perceived. Having mental states is typical of animals too. The human mind would only appear when a mental state is considered to be precisely a mental state. Clearly, this step may occur for one’s own as well as for states of others. Nevertheless, it is much easier, I insist, when it occurs when facing mental states of others. The ability to perceive a radically different interiority would come, we have said this before, to be the origin of those distinguishing human characteristics to which the ancients alluded in terms of soul or spirit.

6. This term has been used by Tomasello in order to describe “exclusively human processes”. However, a directional cultural ratchet effect has been shown in songbirds: See Fehér et al. (2009), who introduced birds raised in isolation into social groups, where young males were exposed only to isolates’ poor-quality song. Within three to four generations, these isolate lines produced something approaching normal song. Certainly this is an extremely interesting datum. However, I would emphasize that the result was something approaching normal song, i.e., something where novelties had been erased.

In these two convictions is the key to the hypothesis. The interpretation of movements of others as belonging to a radically different self (in other words, their kinaesthetic interpretation performed as a true simulation and not mere a posteriori expectation) would be at the base of the properly human mind. That simulation would arise at the beginning for pointing gestures, that is, for some movements that are being addressed to oneself. This is how it would be, I repeat, at the beginning. Later, the simulation could be extended to any new complex motor pattern. With this ‘big extension’ (internal to our species or not?: remember 8.10), two victories would be won. First, there will be simulation of vocal movements, which in primates were completely outside of the a posteriori kinaesthetic expectation. Secondly, there will be simulation of movements of others not addressed to oneself, i.e. of movements for which at the beginning the old resource of expectation and not the simulatory centre would intervene. If it were not for this ‘big extension’, there could not have been ‘truly imitative motor learning’ in the field of non-communicative techniques. Nor would the articulatory-phonetic learning involved in language be possible. This learning is exactly what, in this chapter, we have placed at the base of vocal Saussurean parity; as a result, the connection between the ‘big extension’ and linguistic protodeclaratives has now been achieved.

But what we are looking at here is primarily the level of signifiers. Is this not a peripheral and secondary level in language? So said the objection to which we gave voice earlier (in 12.1). We addressed this objection above with a merely defensive strategy. The final conclusion would be the same in any case, even if the articulatory-phonetic level were really only a peripheral and secondary level. The meanings of innate communicative signs do not need to be learned.
It is then only for learned articulatory-phonetic patterns that the level of meanings must be the object of learning. That strategy served us very well at that point. But perhaps we should not merely remain on the defensive. Why should an emphasis on the symbolic and evocative capacity lead us to scorn the motor element? Certainly, we do not even know if evocation is a general ability in animals, or if, on the other hand, it is characteristically human. We acknowledged this in Chapter 7. However, in case it was indeed a characteristically human capacity, I have tried above (in 8.8, 8.9, and 9.3) to derive it from motor adaptation to the model. If this turned out to be right, then, through the mediation of the ‘motor adaptation to the model’, not only vocal Saussurean parity and the linguistic protodeclarative but also the symbolic aspect itself would sink its roots into the basic human capacity.

12.7 Language, the pointing gesture and symbolic play, three modes of involvement of the simulatory centre

At this point, we ought to compare language with the pointing gesture, on the one hand, and with symbolic play, on the other. Let us look first at the pointing gesture.

Why in the pointing gesture do observed movements of others pass through the simulatory centre? We believe that kinaesthetic interpretation is practically assured for this type of movement. As a result, the involvement of the simulatory centre in observing these movements will depend exclusively on whether these are addressed to the observer or not. This is, at least, what would happen at the beginning. Let us move on to symbolic play. Why is the simulatory centre involved here? Here, since personal interaction is lost, everything rests on motor adaptation to the model. Now let us apply the same question to language. Why is the simulatory centre involved here? Right now, language appears clearly as a mixture of what was seen in the pointing gesture and what was seen in symbolic play. Let us specify the situation of each of these characteristics in language. Personal interaction? Yes. In this area, language almost always coincides with the pointing gesture and opposes symbolic play. Motor adaptation to the model? Yes. In this area, it coincides with symbolic play and is opposed to the pointing gesture. Evolutionary roots that back up kinaesthetic interpretation? No. In this area, language, with its vocal-auditory channel, is opposed to both the pointing gesture and symbolic play. Shall we comment on the implications that this picture has within our hypothesis? In the vocal-auditory channel, in order to ensure the involvement of the recipient’s simulatory centre, personal interaction is not enough and motor patterns previously learned by the producer and recipient are required. Then, as a result, it turns out that the involvement of the simulatory centre in this channel not only manages to ensure Saussurean parity just as it had originally been achieved in the pointing gesture, but would also achieve other results. For example, widening Saussurean parity to a message that is not being addressed to me, and perhaps also, causing absent objects to be evoked. 
Another way of commenting on this picture within the hypothesis would be the following. It is understood that there could have been no kinaesthetic interpretation (interpretation, as we know, through a posteriori expectations) for vocal movements in non-human primates. Cries had always had a communicative function. Therefore, if we accept that the simulatory centre is exclusive to human beings, the impossibility of chimpanzees having kinaesthetic interpretations for cries of others is clear. As a result, in order to break with the inertia of this evolutionary past, it was necessary for vocal communication to be supported by motor imitation if the simulatory centre was to intervene in this communication channel. But enough of these comments. It is time to stop this running round in circles.7

7. Instead of continuing to go round in a circle, we need to move forward. Clearly, as we have acknowledged many times (e.g., in Chapter 6, note 6, p. 96 or in 12.3), the suggestions in our network or synthesis support each other reciprocally. Nevertheless, this is not the same as tediously going round in a circle. For reciprocal support to not be rejectable, we have to demand that any suggestions which support each other reciprocally be able to generate together another different suggestion. Therefore, of course, in the end it is only the long-term and historical continuity that can decide. But in short-term step-by-step progress the ability to generate a new suggestion is, I repeat, what saves the previous reciprocity from rejection.

Of course, the language features that stand out in that picture are only those which the comparison made here has allowed to be highlighted. The result would be different if language were aligned with other terms of comparison instead of with symbolic play and the pointing gesture. In previous chapters, we have insinuated a comparison between language and intonation (kinaesthetic interpretation, yes; kinaesthetic interpretation, no), or also between language and technical learning (evocation, yes; evocation, no). But let us return to the point that interests us here. The interpretation of movements of others – and more specifically, such interpretation by the simulatory centre – constitutes the key aspect of the three human characteristics which are the pointing gesture, symbolic play and language. Now I will try to describe better what above I called my emphasis on the motor component. What interests me are the movements of others, not the subject’s movements. What I am focussing on is in no way a question of scientific methodology. It is not at all my point that for the theorist an individual’s internal states would only be observable to the extent that they have been reflected in movements. My interest centres on the question of how each individual perceives another individual’s interiority. This is where the primacy of the motor level is indisputable. When we are dealing with an individual’s internal states, we could claim the first-person point of view, and in this way the exclusivity of the motor level could thus be denied. However, things are very different for my question about how an individual accesses another individual’s interiority. The movements of others are crucial for accessing the interiority of others. It does not matter if they are linguistic or non-linguistic, communicative or non-communicative. The necessary requirement lies in them being movements that have been opened out, and are, therefore, observable from outside. 
But, is this concept of interiority of others not too wide a concept? Kinaesthesia, visual perceptions, beliefs and more: will they all be the same thing? Would we be incorrectly including very different things? With this, we come to the subject we announced as the second commentary and which it was compulsory to do when faced with the ‘motor imitation – protodeclarative – disambiguation – perception of beliefs of others’ derivational sequence.

12.8 The unity of the ‘theory of mind’

In the general hypothesis, such as it was put forward in the Introduction, the appearance of the second centre of the mind is the decisive milestone. Clearly, the activity of the second centre is performed on levels that are very different one from the other. For the ‘third mode of processing the eyes of others’, or also to perceive a kinaesthetic interiority which is addressed to me, the second centre would be simulating the sensory-motor level of others. In the perception of beliefs of others, on the other hand, the simulatory centre would operate on a completely different level, which is mental in the most typical sense of ‘mental’. Between these two extremes, we find the symbol, in


which the second centre, through an imitated motor pattern, evokes absent objects, but without managing to declare them precisely. Clearly, this range of operations is a wide one. Nevertheless, in the general hypothesis, these differences are all on the same side of the truly important frontier. What we have attempted to show in the present chapter is the fluid continuity of a specific stretch of the second mental centre’s activity. From the ‘motor adaptation to the model’, I have come as far as the protodeclarative along several routes which, in my hypothesis, would all be obligatory and not optional. The first would be through the ‘motor reception’ (or, more precisely, acquisition and reception in production-format) of speech, and the consequent vocal Saussurean parity. The second, through the process of learning. We ought perhaps also to repeat that the double intonation option (conative and non-conative) which appears with the protodeclarative could not occur if there is no common nucleus independent from intonational modulation i.e. if there were not some articulatory-phonetic ingredient, and it is difficult to imagine this unhooked from motor imitation. This is how we have opened out the protodeclarative backward connection. Once this is done, the suggested route to the perception of beliefs of others involves a rather long journey: basic human capacity, motor adaptation to a model, protodeclarative, disambiguation, perception of beliefs of others. However, one single ability – this is the point we were making – crosses the different levels without any abrupt breaks. The duality of centres in the mind would always be the key. An enormous step forward is made between the understanding of gestures of pointing and the perception of beliefs of others. That is unquestionable. Nevertheless, this step forward would not be incompatible with the deep unity of these abilities. 
The perception of the self of others (i.e., the true simulation) would also be involved in human understanding of pointing gestures. The ground covered by the ‘theory of mind’ will have, therefore, to be extended beyond the boundaries that contained it at first. Nowadays, more and more researchers are thinking exactly this. While the old approach focussed on the frontier at four years old, now it is the ‘11-months revolution’ – let us use Tomasello’s term – which seems more and more important.

12.9 Derivations from the perception of beliefs of others: Introducing the following section

Up to now, however, the suggested derivational course has not yet reached true human language. In full language, even semantics itself is shaped by syntax and is indissoluble from it. Therefore, as long as they do not reach syntax, our derivations will not have solved the problem. This chapter has connected the previous derivations with a specific anthropological approach of my choice (supra, 4.9). Nevertheless, it has not advanced the task at all.


We cannot delay any further, therefore. We must push on to the question of how syntax must have originated. Many pages back I announced that predication and with it, compositionality, might have been created from false beliefs of others. Now we have come from the sign established by the protodeclarative to a particularly simple means of perceiving false beliefs of others. It is, therefore, the moment to attempt what we announced, and move on to the chapters on predication.

section five

Pregrammatical, theme-rheme syntax: Revisiting Frege and Vygotsky

chapter 13

From beliefs of others to communicative predication

13.1 From beliefs of others to predication: A relationship that can be interpreted in two very different ways

We have reached a perception of beliefs of others and a motivation, therefore, for the predicative communicative function. A human being in the Holophrastic Era has perceived an erroneous belief from the producer of a message (perhaps a message requesting an unavailable material, or calling someone absent). And thus, for the first time we have a human being motivated to produce a predication. She wishes to correct, or update, that erroneous belief.1 Certainly humans do not just communicate because they perceive errors in others; certainly there are other circumstances or contexts where people have a much stronger urge to produce communication signals. All this, however, does not constitute a serious objection to what I have just said. We must bear in mind that I am focussing only on the origin of the communicative function of predication – that, in other words, I am setting aside the conative function. But let us concentrate on what the objection mentioned above would say about predications. Admittedly, predication very frequently fulfils other functions in present-day language. On some occasions, it carries out Malinowski-Jakobson’s phatic function; on others, it shows a feature (a cultural level, e.g.) which the subject is proud of. But those functions cannot

1. How do hearers receive linguistic information? Does language present anything new in this regard compared to animal communication? Jerison and Knight both stress this newness, but each of them in opposing directions. For Knight (2003, for example), who has been highly influenced by the question of Zahavi’s ‘cost of signals’, the novel element would be the trust of recipients. For Jerison (1988), in contrast, language, unlike perception, requires careful reception in which information contents would remain as if in quarantine. I align myself, it goes without saying, with Jerison. The reception (in production-format) of speech, the second mental centre, and now, this interpersonal (or dialogic) origin of syntax: all this requires me to take this stance. Certainly, language serves as a means of learning from other people. However, as Koenig & Harris (2005) have shown, young children are selective in whom they trust for information. And what of the fact that the honesty of animal signals has to be ratified by the biological cost? That fact in no way forces me to accept an attitude of mistrust in animals.


be the primaeval ones, since they can be fulfilled with simpler resources than predication and syntax.2 According to my hypothesis, the primaeval function of all predicative communication would be to transform beliefs of others. We must now analyse how this predication would arise. Thus, the question that will interest us is how exactly the origin of the predicative communicative function would relate to beliefs of others. But first, it is appropriate to remember that many authors have suggested that there would be a contradiction between Gricean pragmatics and the results of the ‘theory of mind’ experiments. The predicative communications, which, according to that Pragmatics, entail a belief being attributed to the hearer, appear in children (this is the alleged scandalous piece of data) much earlier than when they succeed in the tests on perception of beliefs of others. The author where I first read this was Risjord (1996) (see also Breheny [2006]). We do not need to deal with that alleged contradiction here. We have already seen that access to false beliefs of others can be made via much simpler routes than those of the classic test. But, accepting the link between perception of beliefs of others and original predication, we shall address two possibilities. According to the first of these, predicative composition, although motivated and selected by the very belief that has just been received, may merely reflect speaker’s knowledge. The speaker knows all the details that she will bring together in her predication – by definition, this must be the case. Therefore, the predicative composition would only, according to this first possibility, reflect and translate the structure of this knowledge (or, from the point of view of the speaker, the structure of reality itself) into natural language. In this first possibility, the perception of the belief of the hearer is clearly determining. 
Without this perception there would, of course, be no predication; the need for predicative communication would not even have been felt. Nevertheless, the influence of this perception of belief does not go very far in this first possibility, as it would only explain the origin of the predicative function, and not the origin of the predicative form. The composition would come out of the speaker’s head already assembled, or more exactly, from her head as it was before it began to perceive the belief of the hearer. Within the composition itself, there would be no role for the belief of the hearer. The second possibility, which is the one I defend, grants, in contrast, the belief of others (or more concretely, the belief of the hearer) a much stronger role, a determining

2. Certainly, Dunbar (1998) has proposed a function which we could perhaps liken to the phatic function and which, in his view, would have been crucial in the evolutionary emergence of language. (The great apes maintain social cohesion through almost constant grooming activities. Once the group size expanded beyond a certain number, however, it became impossible for each member to maintain constant physical contact with every other member of the group. Thus, language and gossip were developed as a substitute for physical intimacy.) But the substitution of grooming could have been achieved by much less sophisticated means than gossip and syntactic language. One of those economical means would be a liking for rhythm, for example: see Dunbar himself and also many other authors from Rousseau until today.


role in the origin of compositionality. Why, in this possibility, would the compositional structure originally appear? Because in the mouth of the speaker the first part of the predication carries a cognitive content that is different to her own, and for this reason alone it is necessary to add the second part of the predication. First and second part: to what am I referring exactly? To subject and predicate? To what has come to be understood as the articulation of theme and rheme? Theme and rheme will be clarified further in a subsequent chapter, Chapter 16, but permit me, for the moment, to continue with my account of this possibility. When a designation (either a designation of an object, or a proper noun) stands for the speaker’s cognitive content, that designation includes within itself all the features that the speaker knows about that reality, both permanent and situational, both the qualities and actions in force in it. Therefore, without the mediation of a cognitive content different from one’s own, there would be no need for any additions. The addition or second part modifies this different content in the appropriate direction. This content may be, in the speaker’s opinion, either insufficient or erroneous. When this cognitive content is judged insufficient by the speaker, affirmative predications arise. In the latter, the predicate adds the feature that was wrongly excluded from the cognitive content of the first part. On other occasions, the content not belonging to the speaker, that is, the content involved in the first part of the predication, is perceived to include more features than it should. In such cases, the predicate will be responsible for removing the unneeded, erroneous feature. These would be, clearly, negative predications. 
But in both cases, what occurs is that the first part is transformed by the predicate so that it converges with reality (or, moving away from the speaker’s point of view and adopting the theorist’s, so that it converges with the speaker’s own content). Thus, in this option, as we said above, the influence that the perception of the belief of the hearer comes to have on the origin of the predicative function is very strong. It is not a case of the beliefs of my hearer making me select the most pragmatically appropriate predication. It is, in fact, much more than this. Why, in predication, is one of the characteristics of the reality in question mentioned explicitly as a predicate? Because the first part of the predication is not being understood by the speaker in the way, in the speaker’s own view, it should be understood. The key requirement is, therefore, that two different cognitive contents be held about a reality – on the one hand, the speaker’s knowledge, and, on the other, knowledge that appears insufficient or incorrect to the speaker. It is only when this is carried out that it becomes necessary to reformulate one’s own knowledge in terms of the insufficient knowledge, or, in other words, to correct or complete this insufficient knowledge until it is made to converge with one’s own knowledge. This duality of cognitive contents or files regarding the same reality would, it is clear, require the original compositional structure to depend on this feature (the duality of mental centres) which, as we have been hypothesising throughout this work, the basic human characteristic consists of. Clearly, the course running between this duality of contents and the initial basic capacity is very long. In previous chapters we have attempted to describe it in its


various steps. Nevertheless, the duality of mental centres would remain as its permanent nucleus. But let us return to the two possibilities. It is, I repeat, the latter, which we have just outlined, which we are interested in. We will have, therefore, to unwrap and develop this outline throughout this present Section. Where do we begin?

13.2 Predications of response or reply and quoting

To support this hypothesis, we can begin by addressing the class of predications that are responses to the interlocutor’s previous message. In this type of predication, the nucleus of the previous message is operating as an important element. On some occasions, it will be repeated verbally and will constitute the first part of the predication.3 On other occasions, there will be no such tiresome repetition, but not for that reason does the nucleus of the previous message cease playing an indispensable role. What is the nature of the role played by the key term of the interlocutor’s previous message in predications of response? For the moment, it is very clear that in predications of response this term operates with the same meaning and the same referential-contextual landing it had in the previous message. No one disputes this copy on the semantic and pragmatic levels. What I wish to show now is that the copy would also include the cognitive level. The speaker would be loading the term with the degree of knowledge that he attributes to the interlocutor. As was hinted in the previous subsection, I shall propose that, for the speaker, the first part of predications composed of theme and rheme would always be metarepresentational in nature, that is, loaded with a different mental state (or, more specifically, a different belief state) to the speaker’s own. This, to my knowledge, has never been hypothesised for predications of theme and rheme in general (among other things, because there is no agreement about how to define theme). In contrast, for predications of response we could take the view that it has indeed been hypothesised, at least implicitly, by several authors. The key to this is the fact that, when tiresome repetition is chosen in the response, this tiresome repetition is within the category of quotations, and these have been related to the capacity for metarepresentation.
Fitch (2004) is particularly interesting in this regard. He tries to comprehensively understand “the features of human language which chimpanzees lack”. But let us focus on what Fitch states with regard to quoting. “Vocal imitative skills provide a ‘scaffolding’ for theory of mind abilities, via the mechanism of quoting”. He applies this

3. This is maximally frequent when the interlocutor’s previous message is about a referent which is distant in time or space and which has not been the object of the previous conversation. In 9.4.2, we said that the unfolded repetition in those cases would help the recipient in the task of evocation. For children, this unfolded repetition would be an almost essential requirement. For adults, on the other hand, it is an optional but probably useful tool.


statement, of course, in the obvious field of sentences of the type “John says (believes) that the conference has been postponed”, but not only there. He also points out: “The ability to retrieve the utterance (silently or out loud) along with its context, long after the communicative act is over, perhaps in concert with new information, provides a very important advantage in trying to make sense of others’ minds”. This last form of applying his statement interests us most at this point. Of course, as an author aligned with Chomsky, he does not at all suggest that this is the route along which we would originally have reached compositionality. Nevertheless, Fitch is generalising beyond the ‘reported speech’ the idea that reproducing imitatively another’s message eases perceiving the mind of the original speaker. That is what I have invoked by analysing the predications of response. The first part, or theme, of the predications of response, stands for the addressee’s incomplete or false belief (that is, of the previous speaker). We should also point out a criticism that Fitch received (in the same virtual colloquium where his article appeared). Origgi (2004) points out that in quotations there is something more than merely linguistic repetition: “Is it quoting a specific linguistic mechanism? Doesn’t it require already a ‘metarepresentational’ ability to represent in our own mind a representation?”. What Origgi was seeking to conclude was, clearly, that the direction Fitch had chosen needed to be inverted. It is not that quoting supports the ‘perception of the mind of others’, but, on the contrary, it is the ability for ‘the perception of the mind of others’ that supports the quoting. This question of the two directions is an interesting one. I believe we need to recognise two levels on which the mind of others is perceived. 
On the one hand, the repetition, aloud or silent, of a received message, and even the mere motor reception (or, more exactly, reception in production-format) of speech, involves considering that message as a message produced by another individual. In this way, even though we did not move off the sensory-motor level at all, there would nevertheless be a true perception of the interiority of others. On the other hand, the repetition (silent or aloud, immediate or very delayed) of some specific messages may lead to the perception of the beliefs of others. These two levels of perception of the mind of others are different, and therefore one cannot tackle the question of Fitch and Origgi’s directions without identifying which level is being addressed at a given moment. The repetition (some type of repetition) of the message of another individual makes it possible to perceive the belief of that individual. Fitch’s direction is appropriate here. However, if we think about a perception of the interiority of others which considers states that are different to belief, and specifically, kinaesthetic-motor states which are radically of others (that is, which are inaccessible for mere a posteriori expectations), then the repetition of the heard message would already be involving that level of the ‘theory of mind’. In that case, the direction would be Origgi’s. Returning now to our main thread, the analysis we have hypothesised of the predications of response has provided the model we shall extend to the rest of the theme-rheme predications. The quotation that appears in the predications of response is only


a particularly eye-catching example of a much more general situation. In a moment, I shall extend the hypothesis in this direction.

13.3 The embedding of the interlocutor’s message inside the child’s message: The interpersonal origin of recursivity

Certainly I shall extend the hypothesis to the rest of the theme-rheme predications. First, however, I wish to point out that deciding to begin by exploring predications that respond to linguistic productions of the hearer is by no means an unwarranted tactic aimed at easing the task of exposition. Far from being an unwarranted decision, it is inspired by the facts. When the child comes through the holophrastic stage and reaches predicative communication, it does so to reply to a linguistic intervention from the adult. “Conversation facilitates the development of multiword speech. In the transitional period, the child may be cognitively ready to do this only with the help of a conversational partner”: Veneziano (1999) (or, more generally, Vygotsky [1934]). Let us apply my hypothesis (what we might call ‘repetition at the cognitive level’) to a typical example of the child’s earliest predications. The adult and the child are playing with some wood blocks, building a tower. At a given moment, the adult asks the child for more blocks: (Spanish) “Dame más” (“Give me some more”). The child, who has the completely empty box of blocks in his hands, replies (Spanish) “Más, no”4 (‘más’ = ‘more’; ‘no’ = ‘no’). In Spanish, an adult would normally have replied, “No hay” (“There is none”) or “No hay más” (“There is no more”), although, if he wished to emphatically stress his surprise, he could have chosen to say “¿Más?: No hay” (“More? There isn’t any.”). Though it may interrupt the thread of the argument, I must mention an additional benefit, or secondary reading, that we may extract from this example. Here, the adult’s previous message is a request to the child.
Thus, the reception of a conative message may cause the sprouting of the first perception of false beliefs of others in the child too.5 It is clear that the suggestion we made above (in 11.5.3) is not an ad hoc

4. There was a short pause between the words. But, in my view, this ‘Más, no’ would not be a ‘successive single-word utterance’ (Veneziano [1999] and Herr-Israel & McCune [in press] study this type of utterance).

5. In a previous chapter, we said that, among conatives, request as well as call messages are able to reveal the producer’s false belief to the recipient, and for this reason, are also able to give rise to a predicative communication of response. Now, having seen an initial example of this process, we note that the previous adult message is one of petition, not call. Is this chance? Or, on the other hand, is the triggering of the process easier after a petition than after a call? My personal observations seem to indicate that it really is easier with a petition than with a call. I have found some responses to a call that reveals its producer’s false belief (e.g. to the vocative “Manolo!” said when Manolo had already left), and all of these come some months later than the first predicative response. But my observations are too scant, of course.


artefact imposed by the historical approach. But we shall continue with our analysis of the child’s predication. The child’s ‘more’, it is clear, repeats the adult’s ‘more’ on the articulatory-phonetic, semantic and pragmatic-contextual levels. But what I shall hypothesise is not this, but also that it would repeat it on the cognitive level. The adult’s ‘more’ implied the false belief that there are still blocks remaining in the box. This false belief would also be involved in the child’s ‘more’. It is precisely for this reason that the child has felt the need to add the ‘no’ which is the core of its message. In this simple ‘Más, no’ reply, the recursivity that is typical of syntax would have been introduced, however germinally. The mother’s message has become a part of the child’s message. Thus we would now have a much more primitive recursivity even than in Everett’s (2005) Amazonian language, but one which would fulfil the definition of recursivity. As has been said so many times, embedding an entity within an entity of the same type, that is, self-embedding or recursivity, is a phenomenon that has not been observed in animal communication systems. Therefore, it is logical that we should be fascinated by the question about its origins. According to my hypothesis, recursivity (originally, the embedding of one message inside another) would have arisen interpersonally after the false belief of the interlocutor had been grasped. Fitch (2004) says that, although recursive syntax has not yet been observed in animal communication systems, he thinks it is possible in principle that it may be found in some of them, although naturally without any corresponding semantics. The opening toward the importance of vocal imitation (the opening for which I praise him enthusiastically) leads Fitch to an idea with which I do not agree. “A mockingbird could ‘quote’ the phrase-structured utterance of another bird within its own phrases, or a humpback whale the song of another male”. 
In my view, this would not be authentic recursivity, however often copies of this type might be found in whales or birds, since for the repeater that piece of song would not have the status of an independent message. The simulatory centre, or second mental centre, is the only thing that can preserve that status in the repeater. Only if, during the repetition, the repeaters continue considering this piece as having been produced by the other individual, only then will they be able to continue considering it as a message.6 Without this requisite,

Before finishing this note, I must add something more. It was a student, Juan Pavón, who pointed out to me that “the adult’s request to the child would have to be easier than the alternative which would consist of a call to an absent addressee, because no third person is involved in the request”. Only after hearing this perceptive comment, that ‘prediction’, did I start to look in my data, and to obtain thus initial proof of it.

6. On this point, I wish to recall Benveniste’s article on the language of bees (Benveniste [1952]). The key element in this article is astonishingly current. Let us highlight the extent to which it is astonishing. Since the year it was written, floods of articles on the comparison between human language and animal communication have appeared. In addition, the topic it addresses – the language of bees – is too primitive for us to expect a useful comparison with human


the repetition of the piece of the song would not imply true recursivity. The piece is repeated, indeed, but not as the message that it originally had been.
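The contrast drawn in this subsection, between a message embedded as a message and a piece that is merely copied, can be given a toy illustration. The following Python sketch is only an assumption-laden model and not part of the hypothesis itself: the names `Message`, `reply_embedding`, `reply_flat_copy` and `depth` are hypothetical, and the ‘producer’ field is a crude stand-in for what the simulatory centre is supposed to preserve.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """A complete message; `quoted` holds another whole Message, if any."""
    producer: str
    content: str
    quoted: Optional["Message"] = None


def reply_embedding(original: Message, producer: str, correction: str) -> Message:
    """Reply that keeps the interlocutor's message intact, with its producer."""
    return Message(producer=producer, content=correction, quoted=original)


def reply_flat_copy(original: Message, producer: str, correction: str) -> Message:
    """Reply that only copies the sounds; the source message as such is lost."""
    return Message(producer=producer, content=f"{original.content}, {correction}")


def depth(message: Message) -> int:
    """Count message layers recursively: a message inside a message is 2."""
    return 1 if message.quoted is None else 1 + depth(message.quoted)


# Hypothetical rendering of the 'Dame más' / 'Más, no' exchange.
adult = Message(producer="adult", content="más")
child = reply_embedding(adult, producer="child", correction="no")
mimic = reply_flat_copy(adult, producer="child", correction="no")

print(depth(child), child.quoted.producer)  # 2 adult
print(depth(mimic), mimic.quoted)           # 1 None
```

In this sketch both replies sound the same, but only the first keeps the adult's message as a unit attributed to the adult, which is the sense in which a mere copy of a piece of song would not count as self-embedding.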

13.4 Syntactic recursivity and human exclusivity: My disagreement with Hauser, Chomsky & Fitch

Once we have come in our derivation to syntactic recursivity, it becomes necessary for us to address the ‘language faculty understood in its narrow sense’ coined by Hauser, Chomsky & Fitch (2002). Syntactic recursivity would be, according to these authors, the only exclusively human characteristic of language. Clearly, they insinuate that some animals present this characteristic in some domains. Nevertheless, they add that only human beings would have moved it into the communicative domain. The recursivity or embedding of one unit within another of the same type would be – this is their thesis – the nucleus that makes language exclusively human. I am not at all against syntactic recursivity being an exclusive capacity of human beings. But I place stress on other features that are also exclusively human. These features would be the foundation from which we are proposing syntactic recursivity be derived. Even at the risk of falling into tedious repetition, I shall tease out the contrast between my vision and that of Hauser, Chomsky & Fitch’s article. Using a look or the finger to point constitutes an exclusively human capacity that is not considered at all in this article. In what I am suggesting, the simulatory centre required by this basic human ability would make possible an exclusively human type of imitation ability, i.e. the imitation of new complex motor patterns. Hauser et al., in contrast, place no emphasis on this motor imitation. Certainly, they highlight the poor imitative capacities of non-human primates. Nevertheless, they do not point out the decisive frontier, that is, they do not address the step from the useless imitation of simple movements to the learning of new motor patterns. Let us avoid the protodeclarative and everything else. Instead of gathering more details, let us go to the crucial point.
These authors do not address at all the possibility of an interpersonal and historic origin for syntax. For them, syntax would be innate, and would have appeared in

language. At other levels – cooperation and displacement – there is certainly a strong similarity between these languages: see Bickerton (2009). However, if we focus on the underpinning processes involved, the difference between the two languages can hardly be exaggerated. Insects and their robotic intelligence, Brooks’ ‘cockroach’ and the rest: it all seems to take away any hope that the comparison could contribute anything. And, nevertheless, Benveniste tells us that “the main difference lies in a bee being able neither to repeat another bee’s message nor reply to it”. That short line, in my view, hits the very centre of the bull’s-eye. We should note in particular how he places the two negations in conjunction and in parallel –‘neither repeat nor reply’.


biological evolution.7 In this scenario, syntax should have appeared with astonishing evolutionary speed. So, some generativist authors have thrown themselves into looking for solutions for this problem (Uriagereka & Piattelli-Palmarini [2004]). In contrast, this would absolutely not be a problem for my set of suggestions even in the extreme case of Neanderthals still lacking the pointing gesture using a look or finger. In my view, a biological innovation still outside language would suffice in order to reach language (an innovation which either would be identified with, in that extreme case, the ‘basic capacity’ itself, or, according to other less extreme alternatives, with what we have called ‘the first big extension of the simulatory centre’). That innovation would have made possible, and even probable, a succession of historic steps that would end up creating language. Here, I say once more, we have begun by addressing those features that are exclusively human but not strictly linguistic. In this way my set of suggestions is now managing to derive an initial recursivity from the basic human capacity. ‘The interlocutor’s message (which has no reason to be syntactic), plus this message’s correction’: this sum of elements would create a new message, now syntactic, although still pregrammatical. In this way, a message (a whole message, complete and closed) is embedded within another message. More concretely, thanks to the second mental centre, the interlocutor’s message would retain its status of a unitary and independent message despite being subsumed under my reply. Much later, recursivity would have become grammaticalised, and would have started to be able to operate on an intrapersonal level. But it is not yet the moment for us to deal with this. Nowadays Bickerton (2009, p.
6) having accepted Chomsky´s Minimalist Program, concludes: “If the brain obeys Merge, it does not insert anything within anything, but merely merges ever-larger segments of lexical material with one another until a complete sentence is achieved.” So, for Bickerton, the traditional recursion would be an obsolet concept. “It is, of course, posible to adopt a looser definition of recursion – any process that uses the output of one stage as the input to the next. However, recursion defined in these terms could apply to almost any process – a bird building its nest, for example.” (see also Rizzi [2009]). This is certainly a very different concept of recursion. However, it is the traditional recursion that I am focusing on. In my view, this recursion in the sense that I am giving it could cover not only the pregrammatical or grammatical syntax, but also and with the same validity, the very different process involved in numeracy. I would say that exclusively human numeracy comes into being when a set 7. It would have appeared, they say, invoking Gould’s spandrel, in the same way as the spherical triangles between the arches under a dome; that is, indirectly and without any functionality of their own. However, I insist, this is not what I am interested in discussing. Adaptive incremental evolution (Pinker & Bloom [1990]) or mere spandrel, that is not my point at all. I reject the idea that syntax appeared in biological evolution. It would have appeared, in its original version, that is, as compositionality of theme and rheme, interpersonally at some point in the history of our species.

 Becoming Human

is reformulated as ‘the previous set + 1’. Until this reformulation is achieved there will only be subitisation. Thus, in exclusively human numeracy it is necessary to retain a certain amount of independence for the previous set. A whole set, complete and closed, is embedded within another set. (Later, in Chapter 18 we will get back to this point.) Let us come back to the main proposal of this chapter. Why could syntactic recursion be simple and pregrammatical in primaeval times? Because it involves the embedding of a message as such within another message, and because it is, in sum, an interpersonal and second-person recursion. I want to point out that Koschmann (in press), in order to find evidence of recursivity in simple (“grammatically un-embedded”) structures of naturally produced discourse, provides several examples that serve as demonstrations of how speakers and listeners construct larger interactional structures using turns at talk as building blocks. “Consider this example from Sacks (1992): ‘A: Did you put the garbage out?/B: Did I put out the garbage?/A: Yes./B: Yes.’ In this example, the second question and its response are embedded within the first Q–R pair.” Here, Koschmann focuses on an echo-question, i.e., on a question that quotes the previous message of the interlocutor. Therefore, this example might suggest the connection that I am proposing in this chapter, the connection between quotation and recursivity. (In Chapter 20, I will study echo-questions.)
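For readers who find a formal picture helpful, the two embeddings discussed in this section can be sketched in a few lines of Python. This is only an illustrative toy, not a formalism taken from the authors cited: the names `Message`, `reply`, `successor` and `size` are hypothetical, and representing a set as a nested tuple is a simplifying assumption. What the sketch tries to show is that in both cases the earlier unit is kept whole inside the new one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """A whole message, complete and closed (hypothetical type for illustration)."""
    content: str
    # The interlocutor's message being replied to, retained as a unit.
    embedded: Optional["Message"] = None

    def depth(self) -> int:
        """How many complete messages are nested inside this one."""
        return 0 if self.embedded is None else 1 + self.embedded.depth()


def reply(to: Message, correction: str) -> Message:
    """The interlocutor's message plus its correction gives a new message."""
    return Message(content=correction, embedded=to)


def successor(previous: tuple) -> tuple:
    """'The previous set + 1': the previous set survives intact inside the new one."""
    return (previous,)


def size(collection: tuple) -> int:
    """Count by unwrapping one embedding at a time."""
    n = 0
    while collection:
        collection = collection[0]
        n += 1
    return n
```

With these assumed definitions, `reply(Message("Did you put the garbage out?"), "Did I put out the garbage?")` yields a message whose `embedded` field is still the complete first question, and `size(successor(successor(())))` counts two embeddings.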

13.5 The predicate ‘no’ which occurs in the child’s first predications

We have already said earlier that the cognitive content of the interlocutor may either include some feature that in the judgement of the speaker does not belong to it, or on the other hand, lack some other which really does belong to it. In the first case, the predication that undertakes the task of transformation will be a negative predication. In the second case, it will be affirmative. But, at this point, we must analyse the child’s ‘no’, the predicate ‘no’ which very often occurs in the child’s first predications. (See again 10.3 and 10.4.2: what lessons can and cannot be transferred from child development to hypotheses on language origins?) Here, we would not have proper negative predication. No feature is suppressed here but, instead, a wider rejection occurs, an ‘amendment to the whole’, if you prefer. At this age, children cannot carry out the subtle job of choosing the pertinent transformations. They simply reject the message of the interlocutor. This is the first point we wanted to make about this example; but there is a second, which, in my view, is even more interesting. In the ‘Más, no’, the message of the interlocutor is rejected both as a speech act and as a wrong belief. Certainly, false belief (the false belief that the child attributes to the adult) is involved in the mother’s ‘Más’ as the child has perceived it. Nevertheless, here, in the child’s reply the rejected false belief is still confused with, and is still the same thing as, the immediately perceivable speech act. I believe that the fact that this occurs in the
earliest predication is absolutely not chance. In Chapter 11, we hypothesised that the false beliefs of others can be more easily accessed when one receives a speech act. We are now highlighting a new aspect of this mediation. At the beginnings of language in children, the rejection of the false belief originates as a rejection of the speech act of the interlocutor. The ‘no’ in our example rejects the mother’s misplaced request and at the same time and jointly with it the belief that there are blocks remaining. Paths rather like this one, paths that start out from the perceptible level of speech acts and reach more typically mental levels, have been defended in very different fields. For example, Sweetser (1991) defended something similar in the field of modal verb meanings. From ‘No-one can say that a father is younger than his son’ we would come to ‘It is not possible for a father to be younger than his son’. In this way, the attitude toward a speech act would have ended up giving rise to the attitude toward a belief, that is, giving rise to ‘We cannot believe such a thing; such a thing is impossible’. Although there is, of course, some similarity, we should clarify the considerable differences between our hypothesis and Sweetser’s, whether or not the latter is true. While in the course suggested by Sweetser attention is given to the explicit stating of logical impossibility, we, on the other hand, are detecting a much earlier course, the child’s movement from the reception of a linguistic message to the rejection of the belief of the interlocutor. Soon, only some months later, the child will be able to dispense with the mediation afforded by the speech act of the interlocutor. This is true. But at the beginning of its learning, this mediation, I insist, would be fundamental. It is therefore interesting to compare this with another example. We see exactly the same words, but in a very different context. 
The adult has been insisting that the child finishes its vegetable purée. ‘Have some more; here, a little more’. The child has no desire to continue eating and replies (Spanish) ‘Más, no’ (‘More, no’). This example is from an age some months earlier than the blocks. However, regrettably, we have to treat this data about the age difference with caution, because the two examples do not come from the same child. Moreover, the two children’s calendars of linguistic development differ considerably one from the other. There is no false belief of the mother involved in the ‘Más, no’ of the vegetable purée. Simply, the adult’s order or invitation clashes with the child’s desire, and it rejects the order. The child’s message really equates to a request or order, however much it may be a counter-request. The adult’s insistence annoys the child; the child wishes to suppress such requests, and uses verbal resources (the verbal resources it has to hand) to achieve its desire. In this case, therefore, we have the key element of conative function. Let us clarify the difference between this ‘Más, no’ and the ‘Más, no’ in the example of the blocks. Certainly, in both cases, the adult’s speech act is rejected. The reason, however, is different in each case. In the case of the vegetable purée, the adult request is rejected because it clashes with the child’s desires. In the case of the blocks, in contrast, it is rejected because the request was wrong. It is a small, but enormously important difference.

In the second case we have, I insist, a revolutionary new development. At the very core of the rejection of the speech act of the interlocutor, the child would have come also to the rejection of the false belief. However much in its origins predication is still close to the other type of rejection of the speech act of the interlocutor, we should not be deceived by this. The ‘no’ of the blocks is on a different side of the frontier. With it has arrived the communicative function of predication.

13.6 Initial predications and the sophisticated task of opaque contexts: A comparison that leads us to stress the origin of forked archives

At this point, it is convenient to address an interesting question regarding the development of the ‘theory of mind’ in children, which was raised by Reboul (in press). This author is surprised by the finding that the perception of false beliefs of others precedes the perception of the insufficient knowledge of others (a perception that is necessary to resolve the task of opaque contexts). The criteria of closeness to reality would provide, she tells us, the following ladder: true beliefs of others, incomplete beliefs of others, and, lastly, false beliefs of others. Why does the child’s order of acquisition in this area not adhere to this order? The opaque contexts task is described classically in philosophy of language as the task of understanding that someone who thinks Aristotle is a disciple of Plato may, however, not believe that Alexander the Great’s private tutor is a disciple of Plato. These classical examples, obviously, are replaced by other more appropriate ones for the experiments with children. It is observed (see also, for example, Apperly & Robinson [2003]) that children do not succeed at this test until they are 6 or 7 years of age. “In contrast, the perception of false beliefs of others occurs two years earlier” (this is the datum which Reboul finds surprising). How do we explain this? This precedence that Reboul mentions is not only real, but, in my view, is even stronger than she depicts it. If we leave the classic ‘false belief’ tests behind, that is, tests such as the Maxi test, and accept the interlocutory mode of perceiving false beliefs of others, then we would have to estimate the precedence at considerably more than two years. However, in my opinion, the order that children follow in their conquests, far from being strange, is what would be expected, is just what makes sense.
I would point out that false belief, by clashing head-on with reality (or by clashing, if we wish to say it from the theorist’s perspective, with what the subject believes to be reality), is more persuasive in requiring the archive to be split in two. The false belief of others would therefore be perceived before the subtle differences involved in the task of opaque contexts. We have already seen that the first predicate or rheme that children produce is ‘No’. There would be two reasons for this. Firstly, the ‘amendment to the whole’ is the easiest method of correction. The second reason, which is the one that interests us here, has to do with the easiest route for a statement of another individual to finally succeed in being considered a mental state and no longer the expression of reality itself. This conversion into mere thought or mental state is, we hypothesise, much easier if there has been a strong clash, an absolute incompatibility, between this statement and reality (or, more exactly, the reality for the subject).

13.7 The non-redundant predicate: Our next stage of exposition

Up to now we have spoken only about the child’s initial predications, or, in other words, about predications of response. However, whatever the original importance of responses may be, it is clear that there are predications that are not responses. Therefore, the task of extending our hypothesis to predication in general now awaits us. Having come this far, we can formulate our hypothesis in other words. If, as we are suggesting, the first part of predication has as content a belief which is different to one’s own (i.e. different to the belief of the speaker), then it turns out that the predicate, or second part of predication, would have no reason to belong (or, rather, would not belong at all) to the cognitive content of the first part. Therefore, a predication, however true it may be, would not at all have to be redundant. A predication can be at one and the same time true and not redundant. This formulation sends us back to a problem which was already detected by Frege. In the following chapter we shall therefore analyse the problems that Frege attempted to solve with his concept of Sinn.

chapter 14

Revisiting Frege

How can a predication be at one and the same time true and not redundant?

14.1 Why did Frege coin the term Sinn?

In 1892, Frege realised that judgements with equality entailed a problem. ‘The morning star is the evening star’. Since this judgement is true, we can replace the term ‘the morning star’ with ‘the evening star’. But what this leads to, namely, ‘The evening star is the evening star’, is a completely different judgement from the one we started from. While this was an interesting and informative judgement, one that had involved a discovery, the judgement after the substitution is superfluous, redundant and entirely lacking in interest. How can we explain the difference between these two judgements? In other words, are the terms ‘the morning star’ and ‘the evening star’ equal or not? Both terms have to designate the same object in the heavens. If this were not the case, the judgement of identity would not be true. The two terms, therefore, have to have the same reference, Frege says. However, alongside this, there has to be some difference between them. If not, the difference between the judgement that had been the product of a discovery and the superfluous judgement could not be explained. In his youth, Frege had sought the solution in the difference between the ‘morning star’ sign and the ‘evening star’ sign. But in 1892 he rejected this attempt at solving the problem and noted the reasons that moved him to change. A difference between signs is a question that only concerns the specific language to which those signs belong; it is, in short, a merely lexicographical question. As long as one considers only that difference (the difference between signifiers and not between signifieds, we might say), justice will not be done to the interest or newness of a judgement that was the result of an astronomical discovery, or, more generally, of a discovery on the level of the things themselves. As a result, we can make use neither of different references nor of the difference between signs in seeking a solution.
Those routes are closed, but the problem persists. Yet the difference between the two judgements has to be explained. What do we do, then? Frege coined a new concept, Sinn or sense. ‘The morning star’ and ‘the evening star’ would be two different ‘senses’ of the same reference. For this reason, a judgement can be at one and the same time true and not superfluous. The single reference allows it to be true. The two different senses avoid superfluity or redundancy. This is Frege’s conclusion. According to him, the problem would thus be solved. But does what he tells us really solve it?

14.2 Is the problem really solved simply by coining the Sinn?

Frege presents the Sinn to us as the element that provides the solution. What else does he say about the Sinn? In the normal situation, a sense (Sinn) corresponds to a sign, and a reference to the sense. The sense does not coincide with the reference, which, Frege says, is impossible to know completely. Nor does the sense coincide with the sign, since the difference of signs, he says, may not involve any difference of knowledge. The sense of an expression is the “mode of presentation” of the item referred to. The judgement of identity that succeeds in avoiding superfluity or redundancy is, he says, the judgement of identity that identifies two different contents. This is the solution that Frege offers to the problem. But does this explain why ‘the morning star’ and ‘the evening star’ are two different contents? That is my question. The difference between ‘the morning star’ and ‘the evening star’ is obvious at the level of signs. But is it just the same on the level of the contents of knowledge? The speaker of this judgement is, by definition, someone who knows that the morning star is the evening star. The cognitive content that the speaker has about this reference will include, therefore, both characteristics. If the speaker, when she says ‘The morning star’, really understood the cognitive content that she has about the object in question, then the character of being the evening star would be included in this ‘morning star’. We should remember that in 1892 Frege rejects the attempt he made in his youth: The difference between signs, he says, does not concern the knowledge of things. Therefore, I continue: if sense is the knowledge of things, why will a person’s knowledge about the morning star be different to that same person’s knowledge about the evening star, when that person knows that both are the same star? My point will already be clear.
The problem of true and simultaneously non-redundant judgements will not be resolved through the concept of Sinn, unless we add a new clarification. In ‘The morning star is the evening star’, the cognitive content of the term ‘the morning star’ cannot be the cognitive content held by the speaker about the reference, but the cognitive content that a person unaware of the identity between the two stars will have. In principle, Frege addresses only judgements of identity*. However, in my opinion, we come up against the problem in other types of predication also. Consider ‘The Sun is now hidden by a cloud’. My point is again the same. Among the properties, circumstances and details that the speaker of this judgement knows about the Sun when she makes this statement must necessarily be the characteristic that it is now hidden by a cloud. Then if the term ‘Sun’ were to be understood as bearing the cognitive content that the speaker has about the Sun, then the predicate would be a redundant and superfluous element. It does not matter that this new predication is not one of identity.
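The argument just made can be restated as a small sketch in Python. It rests on a strong simplifying assumption that is mine and not Frege’s: a cognitive content is treated as nothing more than a set of known characteristics, and the names `Content`, `is_redundant` and `predicate_onto` are hypothetical labels introduced only for this illustration. Under that assumption, a true predicate is redundant exactly when the content carried by the subject term already includes it.

```python
from typing import FrozenSet

# Assumption for illustration: a cognitive content is a set of characteristics.
Content = FrozenSet[str]


def is_redundant(subject_content: Content, predicate: str) -> bool:
    """A true predicate adds nothing if the subject term's content already holds it."""
    return predicate in subject_content


def predicate_onto(subject_content: Content, predicate: str) -> Content:
    """Predication as an update of the content carried by the subject term."""
    return subject_content | {predicate}


# The speaker knows both characteristics; the addressee knows only one.
speaker: Content = frozenset({"is the morning star", "is the evening star"})
addressee: Content = frozenset({"is the morning star"})
```

If the subject term carried the speaker’s own content, `is_redundant(speaker, "is the evening star")` would hold and the judgement would be superfluous. If it carries the addressee’s content, the same predicate is informative, and `predicate_onto(addressee, "is the evening star")` brings that content up to the speaker’s. The same pattern applies to ‘The Sun is now hidden by a cloud’.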

(In reality, the expansion beyond judgements of identity can be glimpsed in Frege himself. In the first note of ‘On Sense and Reference’, we can see how he presents the problem with judgements that are absolutely not of identity, such as ‘Aristotle was born in Stagira’.1) What is most important here? Frege had never considered logical judgements to be communicative acts. He never differentiated between speaker and hearer. Nevertheless, the problem he detected goes to the very communicative nucleus of predications. A person who is unaware of the identity between the stars, the kind of person we need if Frege’s problem is to be properly solved, is precisely the addressee for whom the judgement that the stars are identical is appropriate. But, with this, the pragmatic adaptation of a predicative communication is not an addition alien to predicative syntax itself. It is not only a question of our interlocutor’s mental state (i.e. her lack of awareness about the stars) leading us to send her the appropriate predication about the stars. Even more important is that predication can only be originally conceived as a modification, or an updating, of a cognitive content which is different to one’s own about the star or any other object in question. It is only because one’s own cognitive content has to be expressed taking a different cognitive content as a starting point, for this reason alone, I repeat, that predication would have become necessary and would first have appeared. The cognitive content a person has about an object includes the characteristics of that object that the person knows. This inclusion is clearly demonstrated in the use of a vocative. When an individual is called using a vocative, many characteristics of the individual being called are necessarily included in the vocative used. 
The modulation with which the vocative is pronounced depends on the social condition of the individual called, his expression, or also the physical distance that individual is from the speaker at that moment, etc. If the cognitive content a person has about an object thus includes all kinds of characteristics, permanent as well as circumstantial, essential as well as incidental, it would not then be necessary for the person himself to add any predicate after designating the object. But in predication the speaker selects one of the characteristics he knows about the object, and adds this characteristic as a predicate. My hypothesis that the subject term used by the speaker would carry a cognitive content different to the speaker’s own explains that a characteristic will be added, and also, that this characteristic will be the only one selected among those known to the speaker. In addition to this, the hypothesis accounts also for the intriguing phenomenon of negative predication. Certainly, in true negative predication, the negation of the characteristic predicated is in keeping with the reality. Nevertheless, the presence of this characteristic in predication is not for this reason less intriguing. Where does this characteristic come from? By definition, this characteristic belongs neither to the thing nor to the speaker’s current cognitive content about the thing. Negative predicates cannot come from perceptive data. Even Gibson agrees with this. The peculiar nature of these predicates is highlighted clearly if we ask ourselves at what point a schematic drawing (Olson [1997]) or a map (Rowlands [2009]) ceases to be simply a schematic drawing and starts to become a written text. Imagine a line drawn, for example, above the drawing of a horse in order to show the absence of horses. The line is obviously not drawing any real element. ‘Negative facts’ were always an uncomfortable question for logical atomism, with good reason. But the problem that negative predicates entail may perhaps best be described by turning to Frege’s core question. If the negative judgement is true, and if (contrary to the suggestion set out here) we maintained that the negatively-predicated characteristic does not really belong to the content of the subject term, then its negation would be superfluous and redundant.2 Bergson already explicitly invoked the hearer’s beliefs or suspicions in order to explain the usefulness of negative predications. However, as we said in the previous chapter, there are two different positions regarding how beliefs different to one’s own relate to predication. In Bergson’s formula, the relationship considered seems to be the one we have called weak. From the other position, I prefer to hypothesise that, when a characteristic is denied, it is because that characteristic was incorrectly included in the cognitive content of the subject term of the predication. The subject term – the reader already knows the refrain – would be reflecting the addressee’s own degree of knowledge, or state of relative ignorance.

1. Clearly, if the scholastic appellative ‘the Stagirite’ were the predicate used in the judgement, then it would be without doubt a judgement of identity. Nevertheless, Frege’s text absolutely does not allow the predicate to be interpreted in this way – at least, this is what the native German speakers whom I have consulted tell me.
We said earlier that the problem detected by Frege leads us ultimately to a solution that would reach beyond the horizon in which he moved, and would go to the very heart of the communicative and interpersonal approach.3 I would add now that such a solution connects with other studies that, at the time Frege was writing, would have to wait almost a century to be born. False (or insufficient, or not up-to-date) beliefs of others, which are the flagship of ‘theory of mind’ research, would be just the clarification necessary for the concept of Sinn to be able really to play the role for which Frege coined it.

2. This last statement recalls some made by Searle [1958] (about sentences with the verb ‘exist’). But Searle makes use of these reflections to support the Fregean Sinn, or rather, his nuanced revision of it. What I am hypothesising is that the objective that these reflections can meet is not this, but to point out to us that the sense involved in the subject term of predication cannot in any way be the sense that the speaker has about the reality designated by that term.

3. Note that an accusation frequently made against the Sinn thus loses all foundation. The so-called New Theory of Reference believes the Sinn, since it is content belonging to an individual or, in other words, private content, cannot belong in language (see, for example, Wettstein [1986]). But the Sinn as it is being reformulated here is precisely the opposite of private, as it coincides with a content different to the speaker’s one. In a word, subjectivity (or mind) is absolutely not the same thing as privacy. (Was that accusation against the Fregean Sinn fair or unfair? I would say that it was a very unfair accusation. See Frege (1892) on the distinction between sense – Sinn – and representation – Vorstellung. See also his comment about the statements that occupy the role of the direct object of verbs such as ‘Joe Bloggs says that’ or ‘Joe Bloggs thinks that’: Frege says that these statements cease to have any reference and become merely Sinn or thought.)

14.3 Moving away from the route taken by Frege

14.3.1 Where then would the reference of the term be? The bleaching of the subject term and Carstairs-McCarthy’s question

In my view, since it is loaded with cognitive content other than the speaker’s, the subject term of the predication would no longer in itself have reference or connection to the world. The subject term, before it can successfully connect with the reality in question, would have to be reunited in the mind with the predicated term, and would have to be modified by this predicate. The reference that one attempts to communicate in the predication would only belong to the complete sentence. The subject term, on the other hand, would immediately designate only the mental state of the hearer – the addressee’s false or insufficient belief. I have just summarised the solution I hypothesise for Frege’s problem. But it should be noted that in these same lines we are also responding to Carstairs-McCarthy’s question (1999) about why all languages differentiate between complete sentences and noun phrases. Indeed, the question was foreseen by Jespersen (1924) (‘The blue dress is the oldest’ versus ‘The oldest dress is blue’).4 It is nevertheless true that Carstairs-McCarthy raised a crucial question, even though his answer – his solution – has been rejected almost unanimously (Uriagereka [2001], Hurford [2007b]5). What has happened to the normal watertight compartments? What has happened is what often occurs when an exciting problem is addressed. Frege is not the sole preserve of logicians and historians of philosophy, nor is Carstairs-McCarthy the sole property of linguists. But there is nothing to gain by insisting on this nowadays. Nowadays, in general, the opaque barriers between compartments have begun to disappear. This has not required any deliberate methodological purpose; the change has flowed merely from the questions that have begun to come into the open, from the questions on ‘theory of mind’ or about the origins of language.

4. This question – why are the union (“the blue dress”) and the predication (“the dress is blue”) different? – is a point that I focussed on in my thesis (1985).

5. Hurford (2007b) gives a different answer to this question. This answer is based on the model of predications hypothesised in Hurford (2003). (Bejarano [2004] is a detailed critique of Hurford [2003].)

But let us return to our argument. That emptying of the subject term is, I would suggest, the bleaching that would have to be identified with the origins of syntax. In 10.5.3, we hypothesised an even more original bleaching. There it was a question of unhooking from a single obligatory communicative force. After the emergence of the protodeclarative, conative-type force and intonation would have ceased to be the only possibility, and meaning would thus have come to be more neutral and empty. This step is the one that we saw would occur before the end of the Holophrastic Era. In the step we are focusing on here, the emptying and weakening to which the sign is subjected goes much further. When the term designating a reality comes to play the role of the subject of the predication, from that very moment it ceases to point to that reality. In its new role, the connection of this term with reality is no longer direct and immediate. It points initially only to a mental state which is recognised as a mental state by the speaker. The idea of bleaching or weakening continues to produce results. Langacker and Sapir put their finger on the crucial spot. To round off our visit to Frege, we should contrast these last implications of our hypothesis with him. In order to outline the comparison, we shall formulate the following questions. The reference of the term (or, more precisely, the reference of the subject of the predication): does it exist or not? The reference of the complete statement: must it lie in the truth value, or, on the other hand, could it lie in the thing designated? It is in the answers to these questions that I move away from Frege. As regards the term, Frege clearly invoked the cognitive level, the knowledge burden. Nevertheless, he does not take the step of ascribing only this level to the term and, therefore, of denying it reference.
In his article, his opening assumption had been that the designating term stands for the reference or thing designated. This does not change after he coins Sinn.

14.3.2 The reference of the complete statements: Stressing beyond Frege the difference between within and outside the speaker’s point of view

As regards what the reference of the complete statement would be, Frege’s answer is conditioned, principally, by the answer he has given about the term. He cannot even consider the option that the statement’s reference lies in the reality designated by the subject term. He cannot even consider this because he has already assigned that reality to the term. As a result, Frege has no other option than to give the truth value of complete statements as their reference. True statements, all of them equally, would have, according to Frege, the value T as reference; all false statements, the value F. However, in addition, there is another deeper motivation why precisely this should be Frege’s hypothesis about complete statements. When Frege talks about the reality or reference of a linguistic expression, he wishes to disregard the speaker’s point of view. This reference is then unknowable and could not be reduced to anything that is being thought by the speaker of the true (and sincere) statement. Such reference, in itself, could be described in millions of ways so that the speaker cannot even suspect that they are descriptions of the same thing contemplated in the statement. In all this lies precisely the principal clarification that Frege offers in ‘On Sense and Reference’ about the concept of reference. It is therefore understandable that he does not wish to identify this inhuman reference with the partial vision contributed by the statement, or, in other words, with a description for which there would be innumerable alternatives. However, faced with this position, we can highlight that the speaker of a true statement is pointing to the reality of which he is speaking.6 In this way, these two characteristics of reference would align for us (although not for Frege). On the one hand, we accept that reference is inexhaustible and inhuman. On the other, reference is what the speaker would point to with the statement. How can these be compatible with each other? They can be so because the second characteristic depends on the speaker’s point of view, and the first, contrastingly, has required us to step outside this point of view. The reference, or reality, that the statement aims to signify is the reality or reference only in the eyes of the speaker herself. If we step outside the speaker’s point of view, then the statement will immediately pale beside the reality itself, and appear as a mere mental bubble. However, for this to occur, it is not necessary to invoke the reality itself or the inhuman reference. It is enough to step outside the speaker’s point of view into the point of view of another individual. To the eyes of someone with greater knowledge than the speaker (more precisely, to the eyes of someone who believes the speaker to have false or insufficient beliefs), the complete statement would cease to have direct reference to or connection with the world.
This would be clarified in the subordinate clause of reported speech (whether this be direct or indirect reported speech). The statement ceases to have any reference and becomes merely Sinn or thought when it comes to occupy the role of the direct object of verbs such as ‘Joe Bloggs says that’ or ‘Joe Bloggs thinks that’. Frege saw this clearly. But there is another case where this conversion also occurs. This other case, not picked up by Frege, is when a hearer merely listens to the statement and is not convinced of its truth.7 In the mind of this hearer, the complete statement, despite not being wrapped in any subordinate syntactic structure, would nevertheless lack an immediate connection with the world. (In 16.4.2 and also in 21.3, I will focus on the difference between these two cases in a more detailed way.)

6. Even Frege himself (without leaving ‘On Sense and Reference’) says this when he wishes to disregard the reticence of sceptics or idealists. Clearly, he confines this clarification to the first part of his article, that is, the part he dedicates to the term. We, nevertheless, who deny the term’s reference, can move it out of there and take it to the ground of the complete statement.

7. Frege’s failure to consider this case is important, because his omission conditions how Frege addresses the question of the reference of the simple complete statement, or, more concretely, how he tries to argue – with a reductio ad absurdum – in favour of his incorrect answer: see Bejarano (2000).


Nevertheless, in spite of all this, at the moment of speaking the speaker is pointing with his statement to an area of reality. For him, the complete statement stands for the reality it is about. Clearly, this can only be supported from within the speaker’s point of view. However, if we wish to understand the process of syntactic production, it seems we have to adopt the point of view of the speaker. If we do, then we will be able to make the statement’s reference lie in the reference or area of reality being addressed by the speaker. For the speaker, the statement is that area of reality (although – this is our other clarification – that area is there reformulated as a transformation of the addressee’s cognitive content). We – I repeat – have no obstacle to accepting this conclusion about the reference of the statement. Unlike Frege, we have not awarded this reference to the term in any way. According to the hypothesis presented here, the subject term of the statement stands for a cognitive content different from the speaker’s content. The speaker is not loading the subject term with what for her is the reality designated by it, but with what she is conceiving as a mental state.

14.3.3 The predicate of judgement of identity: Another point where we move away from Frege

We have seen that Frege found his great question when he was addressing what he called judgements of identity. Clearly, in 14.2, we have already noted that this framework seemed too narrow for Frege himself. Nevertheless, this framework, inadequate in my opinion, leaves its mark on Frege’s hypothesis about the Sinn. This mark, this harmful influence, has to do with the predicate of assumed judgements of identity. We shall look at this slowly. I have suggested that the Sinn be identified with the insufficient or incorrect content that the speaker believes the hearer has about the topic in question. This suggestion entails the Sinn only being located in the subject term of the judgement. But Frege coined his notion of Sinn to apply it both to the subject term and to the predicated term of the judgements of identity. Shall I then reject the second of these applications of the Sinn? This is not exactly what I shall do. What I wish to reject is the concept of the judgement of identity. A predication is never symmetrical (cf., e.g., Strawson [1950]). What is predicated upon and what is predicated of it are never fully equal. The second is only a modifier. This asymmetry – this is my point against Frege – would occur also in ‘The morning star is the evening star’. The predicate here does not consist merely of the term ‘evening star’, but of the set ‘is the evening star’. It does not matter if we are dealing with a language that has a copulative verb or not. In any case, in any language, the ‘be the evening star’ characteristic would be the modifier that completes or updates the content with which the subject term, ‘the morning star’, is loaded in this sentence.


14.4 What exactly have we taken from Frege?

Frege defined Sinn as “mode of presentation of the reference”. As a result, he decided to address a cognitive level and not merely the level of signs. I have suggested that when we decide to address that level, we cannot then remain in that impersonal neutrality (“presentation”). The subject term of predication, that on which it predicates, would be a cognitive content which primarily would be the one the speaker is attributing to the hearer. It is precisely because of this, that is, precisely because this content is different to the content the speaker has at that point, that the speaker conceives of it as simply mental content. One’s current own beliefs are never mere mental contents for oneself, but are reality itself (at least this is what originally, and also usually, would occur). This chapter has attempted, within the framework of some theoretical concerns unconnected to Frege, to raise anew some problems this author had detected. Clearly, as these concerns were unconnected to Frege, the present pages should not be read as an interpretation of his text. Nevertheless, I wish to make clear that the questions we have addressed here really were posed by Frege. These questions even constitute, I would say, one of the most valuable parts of Frege’s legacy, at least if we accept, as I would not hesitate to do, that the value of a text does not lie only in the responses or solutions present in it. Discovering problems or questions is undoubtedly much more important and much more difficult than finding solutions.

14.5 Toward other authors and other aspects

As well as Frege, there are, of course, very many other authors who have discussed predication. It would certainly be appropriate for us to refer to them. (I think of van Eijck & de Vries [1995, p. 20], for example: “In the dynamic approach to information processing, the meaning of a sentence is equated with its information change potential, with the effect that it has on a given state of information”.) Nevertheless, this task is far beyond what I can do. There are, indeed, previous articles of mine addressing some hypotheses which I found particularly interesting. Here, since I cannot add anything new to them, I will simply make reference to them. Bejarano (2004) is thus a comment on and critique of Hurford (2003) (see also Bejarano [2010a], a review of Hurford [2007]). Additionally, in Bejarano (1999c) I studied the hypothesis about predication entailed in Lakoff’s (1994) ‘metaphor of event structure’. I shall, however, take up one point again from my 1999 article. It might be useful to open out the comparison that I was making there between predications and orders. That point, although it was included there, was not really a criticism against Lakoff, but was more an independent piece of evidence in favour of my hypothesis about predication. In addition, I wish to review a concept coined by Vygotsky that is still frequently mentioned nowadays. Similarly, our hypothesis on the origins of syntax forces us to pay attention to studies that have come to be known as conceptual semantics. But all this will be best addressed in a new chapter.

chapter 15

Communicative functions, Vygotskian ‘pure predicate’ and conceptual semantics

Various questions about predication

15.1 Comparing predications and orders

15.1.1 From Jakobson to Searle’s ‘direction of fit’

‘The window is open’, and ‘Open the window’. The difference, and also the similarity, between these two messages is clear. However, how we describe this contrast is a question that has been of interest to specialists for a very long time. Jakobson (1960) highlighted that, while the predominant factor in predicative communicative function was situation, in conative function it was, in contrast, the hearer. These attempted descriptions find support in intuition. However, they are also highly vulnerable to questioning. Most importantly, it is not at all clear why situation should not be equally important in the conative or in predication. In reality, one could hold a position opposite to Jakobson’s. Situation would have more importance for the conative than for the declarative. If there is no window nearby, the request ‘Open the window’ is absurd. In contrast, predication can refer not only to situations distant in time and space (‘The Pharaohs of the V dynasty ordered pyramids to be built’) but may at times also be metalinguistic or conceptual (‘Horses are mammals’). Searle (1979) made a decisive step forward in this task when he coined the concept of the ‘direction of fit’* between language and world. For the predicative speech act, that direction would be the opposite of the direction for the conative. In predication, the open window in the world would come first, and this would then be reflected in language. In orders, in contrast, language would be first, and only then, if the hearer was obedient, would come the impact in the world. Searle explained this contrast through the example of the supermarket. A man goes to the supermarket with a list his wife has written for him, and throws the corresponding articles into the trolley. But this is not the end of the tale.
Since this man is being spied upon, a spy follows him through the supermarket and writes down a list of everything being thrown into the trolley. Both lists are identical, but they have an inverse direction of fit. I would like to add the comment here that one can find this identity of form only in lists, and not in authentic communication. The written list exemplifies the first stage of writing, that is, a writing that still neither edits nor aspires to replace oral communication. In contrast, language properly speaking can never allow itself such ambiguity. It is clear that Searle has laid out the difference between predications and orders much better than Jakobson. But let us examine the vision of predication that is obtainable through the lens of the direction of fit. Predication would be a reflection of the world. Reality would thus exist first, and linguistic expression would be like its mirror. Unlike orders, which are characterised in their relationship with the world by taking the initiative, predications would come closer to passive fidelity. Clearly, there is a nucleus of truth in this type of vision. It is adequate especially in its original outlook, which was, as we know, to show the contrast between predications and orders. However, several points will still need to be explained if we adhere only to this characterisation of predications. Firstly, the matter of dishonest predications: in these, the world is not present first. But we do not even need to descend into lies, which can always, at the end of the day, be said to be aberrant or exceptional. Honest negative predications do not fit particularly well into the vision offered by Searle either. As was already said above, however much negative predication reflects reality, it is not clear where the characteristic to be negated comes from. And, more generally, all predications, including honest and positive predications, select only one of the characteristics that (for the speaker) are in the real-world reality in question. This selection remains unexplained if predication is only described through its direction of fit.

15.1.2 Predication as an order to be obeyed in the mind

It is clear there is also a difference in the role played by the hearer in predications and orders, in addition to the difference posed by the direction of fit. The hearer is extremely important for orders, as Jakobson said. But in predication the hearer is also decisive. In what way does the hearer participate in each type of communication? If we answer this question within the lines of the hypothesis presented here, it seems to us that predication would become an order or command that the speaker sends to the hearer.1 The difference with regard to orders or commands properly speaking relates to the area where the order has to be carried out. While the normal imperative asks the hearer to perform a task in the world, predication, by contrast, requires her to perform a task in her own mind. ‘Acting on something’ versus ‘contemplating something’ is a dichotomy that was highlighted a long time ago. Our hypothesis about the relationship between predications and orders would respect that ancient dichotomy, on the one hand, but would also blur it at the same time, on the other. There are differences, but also similarities, between the contemplation toward which predications drive and the action demanded by orders.

1. Cf. Janet (1936) (cited above, p. 94, Chapter 6, note 4): to order/to obey: to speak/to hear.


The hearer of the predication is asked to put the subject and predicate together (“collocate”2) in her mind, and thus to transform the former through the addition of the latter. In this way, the formal difference that separates predications and orders would signal the place where the instruction sent to the hearer is to be performed. The predicative form would indicate that the order must be performed by the addressee in his/her own mind; the conative form, in contrast, would indicate that the order must be fulfilled in external reality. As a result, the formal difference may be erased when there are sufficient other indications to signal the place. In this way, ‘Remove arrogance from the image you have created of John’ may be formulated with an imperative despite coinciding with the predication ‘John is not arrogant’. That imperative form may suffice because ‘the image you have created of John’ is already a clear indication that the task demanded from the hearer has to be carried out, not in reality itself but in the mind of the hearer. Observe, however, how the imperative form ‘Remove arrogance from John’ would be a bad choice to transmit the same message, since it would lead to the interpretation that something would have to be done to the real John. (Limiting ourselves to examples taken from Spanish, we might add something more.) Conversely, when a model is being described so that the hearer will follow it, an order may take the form of predication. Think of stage directions: these may just as easily take the form of an imperative –‘Que suenen gritos a lo lejos’ (English: Let shouts sound in the distance)– as the form of a predication –‘Suenan gritos a lo lejos’ (English: Shouts sound in the distance). Here, the author is giving orders to those who will perform his/her play in the future. Why, therefore, is the predication form accepted? Because this predication is describing the model which the theatre director is required to follow.
The stage direction may take the form of a description or predication because the imperative implicit in the model is sufficiently clear. There are other similar examples: in instructions for building models (The wing-tips are red/ Paint the wing-tips red), or also in the description of the Easter rituals in missals (The priest writes Alpha and Omega on the candle/Let the priest write Alpha and Omega on the candle). In short, the idea of predication as a command that must be carried out in the mind and not in the world turns out to fit some facts very well. We have seen that a hypothetical deduction we had derived from this idea is fulfilled in fact. Clearly, this proves nothing. Nevertheless, the hypothesis of this Section, that is, the hypothesis that the subject or theme of the statement would carry the false (or incomplete) belief attributed by the speaker to the addressee, has been strengthened somewhat.

2. In this respect, there would be some truth in the “event structure metaphor” –“colocation”– described by Lakoff (1994, p. 65).


15.2 Pre-linguistic semantics: What do we do with it in our hypothesis about predication?

15.2.1 Richness of details versus syntactic articulation

At this stage, we should delay no further in addressing perhaps the most challenging consequence of our hypothesis: in pre-linguistic content there would be no independent elements. The concepts of agent, action, quality or place, for example, which are usually placed on the level of conceptual or pre-linguistic semantics, would be a copy, and a non-valid expansion, of what occurs only in language. What is it I mean by this? I would prefer to begin by clarifying what I do not mean. In no way am I suggesting that pre-linguistic perception, including animal perception, lacks those characteristics or details. It is my view that perception itself, without any help from language, is a highly-tuned and sophisticated capacity. If it were not recognised in perception, for example, that the thing there is a predator, that it is very big, and that it is running and getting closer, what would be the point of such perception?3 But, it will be said, agent, and action, and quality, and place are present in a perception such as this. These features and details are certainly present in perception, but my point is that they are not there as separately addressable elements. When an animal wants to address a section of a specific perception, it will look for a new perception that will focus on that section. But each perception is a unit in itself. This would be the case both for animal perception and for any human perception on which no syntactic process is yet operating. Clearly, the perceiver has to have expectations as to what will be perceived to each side of this perception. If this were not the case, the animal would be unable to find its way.
Nevertheless, there is no reason whatsoever why these expectations should imply that the perception in question be considered by the percipient as a lower-order unit within another higher-order perceptual unit. Standing against the reality of ‘mental maps’ is in no way a position only I take. Clark & Chalmers (1998) or Haugeland (1998), for example, are among the authors who reject mental maps (“The world as an outside memory”: Thomas [1999, p. 219], or Myin & O’Regan [2009, p. 187], for example).4 The only thing I might add is that this rejection is in agreement with my emphasis on expectations and my suggestion that the evocation of anything not being perceived would be an exclusively human process (we might recall here all that has been said about the coherence of synthesis and the regrettably limited value of this coherence). But we should return to our point. Once the obstacle entailed by the supposed mental maps, with their higher-order perceptual unit, is weakened, we can reformulate the opinion we shall defend here. Within each perception there may be many details, but within that same perception none of them will be addressed independently. (Anderson & Oates [2003, p. 284] certainly restrict their focus to the attachment of qualities to objects, because this was the topic of the target article on which they comment. Certainly, they do not focus on the attachment of actions to agents. However, perhaps we can cite them: “We suggest that the formation of objective predicates, i.e., the attachment of qualities to objects, is inextricably bound up with the emergence of language itself.”)

3. Ninio (1993, p. 294): “Children are able to isolate the relevant, invariant, feature of a series of situations that constitutes an utterance’s condition of use, out of the totality of the event. Were events represented as unanalyzable wholes, such abstractions were impossible.” She pertinently refutes the idea that the perception of the holophrastic child would be insensitive to the different aspects and details of the scene perceived. However, in my view, this does not require a genuine analysis.

4. Barsalou (2009, p. 241) writes: “When people focus attention on a particular entity or event in perception, they continue to perceive the background situation. The situation does not disappear, leaving the focal entity or event in a perceptual vacuum”. In my view, he is right to reject the ‘perceptual vacuum’. However, in order to avoid that ‘vacuum’, expectations are enough, or, said otherwise, evocations are not necessary.

15.2.2 Vygotsky’s presumed ‘pure predicate’: Why are the presumed subject and presumed predicate interchangeable?

Suppose I am alone in the middle of the desert and that my bottle of water has fallen onto a rock and smashed. I might, even without there being a hearer, pronounce linguistic meanings in those circumstances. It is true that I could emit only a terrified ‘Agggh!’, but it is also perfectly possible for me to turn to a word for that merely expressive function. In that speech which is not communicative but radically for oneself, the content of the options ‘The bottle!’ and ‘It’s broken!’ would then be exactly the same. Would these exclamations be pre-syntactic? As we already saw in 10.3.2, to call them pre-syntactic would not be entirely precise. Since they are formed by words, syntax is already involved in them. Words that are either nouns or verbs, or adjectives, are, for this reason, parts of the sentence, or, expressed another way, elements shaped by and for syntax. Nevertheless, if we are addressing the processes of syntactic production, those exclamations can justly be said to be previous to any syntactic process. Nothing in this description would have to change if, instead of such words being pronounced audibly, they occur in an inner speech that was at least as radically egocentric as the audible exclamations (what I have called ‘speech for oneself’ Piaget and Vygotsky called ‘egocentric speech’). It is true that in highly emotionally-charged situations, or in subjects whose capacity for inhibition is weak, the exclamation would tend to be made out loud. Nevertheless, in other cases, it could be confined to inner speech. It should be noted that with this I am opposing Vygotsky’s suggestion about the pure predicate which, he says, would represent the inner speech. As is well known, Vygotsky was able to see that the small child’s ‘egocentric language’ (that is, ‘language for oneself’) is internalised and thus gives rise to inner speech.
But Vygotsky (orig. 1934) adds that this speech is predicative: in his view, inner speech is formed only of subjectless predicates, since, as it is speech for oneself, there would be no need to signal explicitly what the subject is.


Let us see how Vygotsky argues in favour of the latter and, in my view, incorrect statement. He assimilated this alleged ‘ellipsis of the subject term of the predications of inner speech’ to the ellipsis that occurs in communicative predications between people who are addressing the same thing, or, as he says, between interlocutors who are very similar to each other. In Vygotsky’s opinion, inner speech or ‘speech radically for oneself’ would only be the extreme point on a scale. If we climb the ladder starting from written communication, where the recipient may be separated from the writer even by centuries, we will first find oral communication between interlocutors who are distanced from each other by their presuppositions, biography, etc. Gradually, we will come to communication between interlocutors who are so close they understand one another with a minimum of means. And, at the extreme we would find, Vygotsky says, speech for oneself, where speaker and hearer coincide. I reject this fluid continuity between communication between similar interlocutors and speech for oneself. Clearly, when in predicative communication the speaker and the hearer are addressing the same scene and the same aspect of the scene, the subject of the predication may remain implicit. Nevertheless, the predicate continues to be communicatively necessary in these cases. If there is predicative communication, then the predicate will represent a characteristic that is somehow a piece of new information for the hearer. This is precisely what cannot in any way occur when speaker and hearer are one and the same. In short, the affinity between interlocutors eases communication, but, in contrast, the identity between speaker and hearer is incompatible with any purpose of predicative communicative function.5 What then about the presumed ‘pure predicate’ of inner speech? Is Vygotsky right to hypothesise this?
Or, on the other hand, would it have a merely expressive function, that is, an emotional discharge function? This latter is my preferred option. Where might we look for evidence to support it? One could, of course, accuse Vygotsky of being unfaithful to his ‘Fundamental Principle’, that is, to his idea that the higher processes originate in the interpersonal level, and only later become intrapersonalised. Vygotsky’s and, more explicitly, Luria’s (1979) hypothesis, that for any predication, without it mattering if it is communicative or not, the choice of predicate would take place in a language radically for oneself, does not fit very well with that Principle. Nevertheless, this would not be a valid argument in favour of our suggestion. The greater or lesser internal coherence of the doctrines of the Vygotskian School is unconnected to this. We have already seen the only evidence we can call upon against Vygotsky’s supposed pure predicates. The options ‘The bottle!’ and ‘It’s broken!’ were perfectly interchangeable in inner speech (or speech for oneself). The lack of difference between the two possibilities would be impossible, it should be noted, in any truly predicative communication. In predication produced to address the hearer’s information needs, either the speaker believes his hearer is not up-to-date with the subject of the action, or that he is not up-to-date with the action of the subject, or, lastly, that he knows nothing about what is happening. Three different types of communicative predication would thus be possible. But the choice between these types is not at all insignificant, but is entirely required and determined by the specific combination of knowledge and ignorance that the speaker attributes to the audience in each case. Clearly, if there were a hearer nearby, then these words, even when they had not been pronounced with a view to any addressee, could initiate real communication. If the hearer hears ‘The bottle!’ then he would ask ‘What is up with the bottle?’. If, on the other hand, what he heard was ‘It’s broken!’, then he would ask ‘What is broken?’ This is undoubtedly true. However, it is no great help to Vygotsky. Firstly, because the final speech episode ceases to be egocentric. And, secondly, because the predication that reaches the hearer in the end would have a structure inverse to the one proposed by Vygotsky. Let us examine how it would be inverse.

5. In this discussion with Vygotsky about inner speech, one very interesting point has remained untouched, namely, how many communicative functions it is possible to perform in that speech. In addition to the expressive function to which I have reduced Vygotsky’s supposed pure predicates, there would also be the conative function of exhorting oneself, in other words, of reactivating for ourselves an object which would have begun to lose the ability to attract us (Kühn & Brass [2010, p. 9]: “Our results suggest that the representation of non-actions contains a facilitation of the alternative action rather than a suppression of the action in question”. See 8.9.2 on the regulation of one’s own attention, and see also p. 75, Chapter 4, note 9), as well, perhaps, as true predications which, far from the ones Vygotsky describes, would really be creative problem-solving: predications which would re-describe the goal of the problem, or its object of interrogation. Here, we must think of ‘psychopragmatics’: Dascal (1983), which was an important landmark in my formation. These types of inner language, as well as their respective uses, are extremely interesting (maximally interesting, I repeat), but this is not the time to deal with them.
We must bear in mind that, in these contexts, Vygotsky is understanding subject and predicate as theme and rheme respectively. This becomes evident in the example he gives about a clock falling in the room next door. When the noise of the fall has been heard, ‘The clock!’, Vygotsky tells us, is “the predicate”. The terminology may appear to clash, but Vygotsky makes this point in laudable harmony with what the Prague linguists were doing at just that time. However, returning to what is important for us, what happens if we have to understand the predicate as rheme? In that case, the supposed inner speech predicate would end up playing the role of theme in the final predication. It will be helpful to address once more the fact that, in emotional discharges to oneself, the options are interchangeable. Why is this so? Under our hypothesis, it would have to be said that, in the level prior to predicative communication, not only is composition between subject and predicate, between actor and action, absent; rather, one cannot even conceive of the need to add the second element. My point is that there would be no intrinsically incomplete content before perceiving a belief different to one’s own. (Clearly, I am not saying that the speaker’s own cognitive limitations are absent from this. Indeed, the partiality and limitations entailed in any human perspective are very present. Nevertheless, as we have already repeated on several occasions, in 14.3.2 for example, a perspective is only visible to whoever looks at it from another perspective.) Thus, in my view, there is no true predicate in radically egocentric speech.

15.2.3 A pause and a reflection: The basic human capacity and predication

We can comment on this hypothesis about the origins of syntax within the lines of our general hypothesis. The key evolutionary milestone for the setting up of the human capacities is, we have been suggesting throughout this text, the appearance of a second, or simulatory, centre in the mind. When I conceive that the self of another individual is looking at me, I am necessarily situating myself in the periphery of that self, although, of course, I continue to have the other type of consciousness, the primary or animal consciousness. Attention to an interiority radically and intrinsically different from one’s own interiority would, therefore, be what makes the mind something characteristically human. Analogous to this would be what, according to what has been hypothesised in this chapter, underlies the origins of predication and also of properly human knowledge. Before moving on to open out this suggested analogy, I will comment on the final addition in which I invoke ‘properly human knowledge’. What would have given rise to the characteristically human kind of knowledge? We can, at the very least, give the answer which would follow from our hypothesis about the origins of syntax. According to this, perceptual content which has no exclusively human peculiarity may, nevertheless, become an exclusively human thought if it is formulated from content different to the speaker’s. No categorisation or special abstraction would be necessarily involved in such a change. The only essential requirement will be that one’s own perceptual content be conceived from the content of another individual, or, put another way, be reformulated as a completion or modification of that content. In this sense, therefore, the requirement for predication coincides with the requirement for properly human knowledge.
Let us now open out the analogy between what has been hypothesised in this chapter and the simulatory centre such as we defined it in Section Two. Clearly, the second, simulatory centre here is not a self that is looking at me, as it was for the ‘third mode of processing of eyes of others’. This is now a self that has a degree of knowledge different to mine. However, in both processes there is a common core. What is one’s own is perceived, or constructed, or reformulated as the periphery that the second centre reaches. The syntax (or more exactly, the original core of syntax, that is, predicative compositionality which is not yet grammatical) would thus originally be a result of characteristically human interpersonality. In addition to this, syntax, in my opinion, would also be the platform from which some aspects of human intelligence take off. (I will attempt to support this opinion – perhaps – if I succeed in finishing a new book. But for now, it is enough for us to focus on some intellectual abilities – logical reasoning, at the very least.) What follows if we examine these two suggestions about syntax

Chapter 15. Communicative functions, Vygotskian ‘pure predicate’ and conceptual semantics

together? We would have a human intelligence that was not the cause but the result of the capacity to address the self of others. This corollary of the general hypothesis would, in turn, be the subject of various comments. But we shall leave it here.

15.2.4 Studies on conceptual or pre-linguistic semantics: Their value and limits

We should now address another question: what then of studies on so-called conceptual semantics? I am thinking of the shrewd work carried out in this field of study by Wierzbicka (2002), for example, and her natural semantic metalanguage (NSM) framework. How ought we to judge these from the perspective of my hypothesis? Such semantics offers a level of abstraction constructed on the basis of data from different languages. As a result, those studies are (and will increasingly be) an essential foundation for automatic translation programmes and machine-to-person communication. In this sense, they clearly represent a discipline whose great future one does not have to be a fortune-teller to see. Nevertheless, I would not be in favour of these studies if they are being put forward as describing pre-linguistic perception. Animal inferences would not require true syntactic compositionality. I do not find the arguments offered by Fodor (1978) in favour of such pre-linguistic compositionality at all convincing, although I cannot prove my position either: we find ourselves in the typical case of one wager against another – innate mentalese with compositionality and syntax, against interpersonal origin of syntax. For the moment, we have to accept that nothing can be proven or disproven. But I can, at least, point out that there is no reason why the cognitive contents which are appropriate in relation, for example, to working in artificial intelligence, should coincide with those involved in biological perception. The strategy of shifting them from one level to another without any questioning not only lacks authority, but may also, and more seriously, move us further away from recognising the knowledge we do not have and from formulating questions. I would also like to underline how the division of opinion to which I refer cuts across the borders usually cited.
For example, Chomsky and Givón could hardly be more opposed to one another. However, both come down on the same side with regard to the origin of syntax that has been suggested here. For Chomsky, syntax is autonomous and innate, but this autonomy and innateness are clearly rejected by Givón’s functionalism. Nevertheless, in both positions, the very idea of compositionality is assumed to be present in the human mind from the beginning. (In Givón and Malle [2002] compositionality originates in perception: in their view, the different elements of the scene perceived are already the elements of compositionality, that is, the agent, action, and other elements of linguistic compositionality.) In other words, neither position contemplates an interpersonal and historical origin for the very possibility of syntactic composition. What we have hypothesised here on the origin of syntax is, thus, as removed from the one as it is from the other.

chapter 16

Connecting with the concepts of theme (or topic) and rheme (or comment)

In previous chapters, we concluded that the subject term of predication would originally have had to correspond to the cognitive load that the speaker attributes to the hearer about the item in question. Correspondingly, the predicate term would be the element needed to turn this incomplete or incorrect cognitive content into the appropriate reference (or, more exactly, into the speaker’s own cognitive content at that moment). However, is what we are talking about here really the subject term and predicate term?

16.1 The Prague school hypothesis

In Aristotelian logic, judgement is the union of subject and predicate. More recently, but highly analogously, it has become common to relabel the Sentence as ‘Nominal Phrase and Verbal Phrase’. But this type of hinge ceased to be the only option many years ago. The alternative theme and rheme (or known and new) hinge dates back at least to the Prague Circle of the Twenties. (In 15.2.2, we saw how Vygotsky, although without coining any new concept, also pointed to this structure.) From the cognitive-communicative perspective, it is clear that the information focus of a predication can be made up of a low-rank element in the hierarchy of syntax, such as a complement of time, manner, or place. After hearing, ‘What time does the train leave?’, the ticket officer in the railway station may respond either with the short reply ‘At 12’ or with the longer and more tiring sentence ‘The train leaves at 12 o’clock’. What should be highlighted here is that, even if the latter reply is the one chosen, this will not necessarily alter the fact that, from the cognitive-communicative point of view, the time complement is the only important element. There are, thus, two different syntactic structurings that can be applied correctly to the reply ‘The train leaves at 12 o’clock’. On the one hand, the predicate or Verbal Phrase would be ‘leaves at 12 o’clock’, and, correspondingly, the subject would be ‘the train’. On the other hand, the sentence could be cut differently to isolate ‘at 12 o’clock’ as the only important element, while everything else, including the verb, would function only as the non-essential and superfluous background. Certainly this is just an issue of the focus provided by the way a question is formulated – “What time does the train leave?” versus “Does the bus or train leave at 12 o’clock?”. Certainly, this allows


the answer to be a full sentence or just the requested “slot filler”. However, it is equally true that here there would be a hinge, a discourse-syntax, which is different from grammatical syntax. We have said that the two structurings are appropriate for the same sentence. However, we should qualify this. The second would be appropriate only for this communicative use and this intonational modulation of the sentence. The first, in contrast, would be appropriate for any use that could be made of it. Saussure’s division between langue and parole is thus insufficient. A free combination of words such as ‘The train leaves at 12 o’clock’ can in no way lie within langue. The so-called langue syntagms are prefabricated and culturally-learned pieces, like sayings or set phrases, and as a result, that sentence cannot be considered a Saussurean langue syntagm. But, on the other hand, utterances, real-life uses, as we have seen, clarify much more than do the conventionally understood syntactic combinations. In real-life use, one element of this combination will be highlighted on some occasion; other elements on other occasions. We might, therefore, say that sentences analysed syntactically would be located in an intermediate level of abstraction or generality. We have alluded to intonation as a factor reflecting and indicating the theme/rheme structure. This connection points to the universality of the theme/rheme combination. Indeed, since Hockett’s (1963) description of this combination as one of the linguistic universals, no discovery has come to oppose him. Moreover, since intonation is another absolute linguistic universal, those links with intonation are good news for those of us who wish to highlight the importance of this type of syntax.
In addition, if we assume that the formation of intonation blocks coincides with the planning of higher order units, we can conclude that discourse syntax would be a primary element in the microgenesis (or genesis of particular sentences), as well as in infantile ontogenesis as we have already seen.1 However, there is a further point to be made with regard to intonation. We should refrain from thinking that the theme and rheme structuring would only have a bearing on intonation but would be lacking grammatical consequences in the strict sense. Far from this, down through the years many linguistic phenomena whose explanation may lie in the concepts of theme and rheme have been identified. Panfilov (1972 [orig. 1963]) insisted specifically on the grammatical weight of the theme and rheme structuring. The study of thematic suffixes in a Paleo-Siberian language undoubtedly had an influence on his sensitivity to the properly grammatical nature of the Prague School’s so-called articulation. But it is not a question, I stress, only of the theme and rheme prefixes or suffixes that have been discovered in certain language families.

1. Is there any evidence that the formation of intonatory blocks is tantamount to planning higher order units? The transposition of sounds between adjacent words tends to occur within an intonatory block.


The examples I will analyse in the following section are taken from Spanish, although an explanation is given in English. Please bear with me. In Spanish, a sentence such as ‘Se ha caído una teja’ (in English, ‘A tile has fallen’; Lit: ‘Has fallen a tile’) is at odds with the presumably normal subject and verb order. This is because it is a rheme-only structure. ‘Una teja’, although the subject of the verb, is not theme. For this reason, it does not occupy the initial position. We have said that here ‘una teja’ does not play the theme function. It would be more accurate to say that ‘una teja’ could never play this role, because the indefinite article clearly signals the tile in question as being unknown to the hearer. I have chosen this example (a subject with an indefinite article) precisely for this reason. Even though the sentence appears here decontextualised and in writing, there is no doubt that ‘una teja’ cannot play the theme function. But – this is the addition that interests me here – we must not let the example betray us. We must take into account that the sentence ‘Se ha ido Manolo’ (in English, ‘Manolo has left’; Lit: ‘Has gone Manolo’) could be used perfectly well on appropriate occasions as a rheme-only structure. On these occasions, exactly as occurred with ‘una teja’, ‘Manolo’ would be the subject, but would not for this reason automatically be theme. Additionally, in ‘Esta mesa, ya la he limpiado’ (in English, ‘I have already cleaned the table’; Lit: ‘This table, already it I have cleaned’), we observe that the element ‘esta mesa’ has been dislocated to the left. As it is theme, ‘esta mesa’ must occupy the initial position. But it is necessary also to make clear that the meaning ‘table’ functions as the Object of ‘he limpiado’. To make this clear, this referent is re-designated but now in a way, with the anaphoric ‘la’, which underlines its role as the Object.
The two roles that the meaning ‘table’ plays in the sentence can be made very clear. Certainly, this difference in roles is very different to the difference in roles that the meaning ‘I’ plays in ‘Yo me peino’ (in English, ‘I am brushing my hair’; Lit: ‘I myself am brushing’). While ‘yo’ and ‘me’ have two different roles on the same syntactic level, this is not at all the case for ‘esta mesa’ and ‘la’. Nevertheless, it can also be said that ‘esta mesa’ and ‘la’ play two different roles. We need only consider the two levels of syntax – conventional and theme/rheme syntax. We might also cite cleft sentences here –‘The time the train leaves is 12 o’clock’, for example. These obey the purpose of having the two structures coincide. In one particular use, the sentence ‘The train leaves at 12 o’clock’ presents a theme/rheme hinge according to which the rheme is ‘at 12 o’clock’. This use is the only one connected to that particular cleft. Using Bühler’s (1990 [orig. 1934]) ‘synsemantic’ term, I would say that this use is ‘synsemanticised’ in the cleft. This term seems highly appropriate to me, since, in my opinion, this extremely specific type of sentence must have arisen after writing. Let us examine this. There are matters, such as laws or contracts, where interpretation can be open to no ambiguity, that is, where it is beneficial for meaning to obey ‘the truth, the whole truth, and nothing but the truth’. As a result, the theme/rheme organisation has to be


entirely clear. Think, for example, of the difference between structuring ‘I saw John’ as a rheme-only structure, or, on the other hand, leaving just ‘John’ as rheme. While in the first case an honest speaker (a witness in a murder trial, for example) may have seen hundreds of people that afternoon in the place in question, in the second case, in contrast, the speaker is saying that he only saw John. However, what happens when a specific syntactic combination makes its way into writing? In that case, both the intonation and the context disappear. Thus, it would cease to be clear in the writing which theme/rheme structuring the producer had used. It was to remedy this that, late on in human history, the cleft linguistic form would have arisen in some languages. The old (now, nominalised) theme, plus the copulative verb, plus the old rheme: this is, to use Bühler’s term, the ‘synsemanticisation’ of the theme and rheme structure. In short, and without pursuing this question any further, it is clear that theme and rheme is no less important a structure than conventional syntax. Only the traditional concentration of grammar on written examples would explain why the concepts of theme and rheme began to be studied only in the 20th century. But how should we define these very concepts?

16.2 Difficulties raised by the initial definition of ‘theme’: A hypothesis for a reformulation

The task of defining the concepts of theme and rheme at first seemed trivial. What was already given, or known, would be the theme; what was new, the rheme. These two elements or blocks would correspond, from the cognitive-communicative point of view, to the intonation blocks. All this, I repeat, seemed very easy at the outset. But it would soon become difficult to get the two criteria to overlap, that is, the intonational and the cognitive-communicative, which had practically been treated as one single criterion. We shall begin by analysing one of the most striking and frequent of these difficulties.

16.2.1 Presenting the hypothesis: A first example

The theme is the element that the speaker assumes the hearer to know. But notice the following conversational sequence. ‘Who has called?’, ‘Your brother called’. It is obvious that the speaker of the response (John) assumes his hearer (Mary) will know her own brother. Nevertheless, it would be very difficult to state that ‘your brother’ operates as theme or topic in this predication. Not only the intonation, but also the superfluous or non-essential character of the other elements, signal this ‘your brother’ as the rheme.


The following description has occasionally been made of the anomaly. ‘Your brother’ was classed as ‘known in long-term memory’, but at the same time ‘new to short-term memory’ (that is, in the attention prior to the message being received), and was classified, therefore, as ‘weak-sense rheme’. But this cannot be the solution. Let us assume that it was clear in the situation that the call could only be from the hearer’s family. Since the response would have to keep being analysed in the same way, we would find that ‘your brother’ would not be new for the hearer, not even in her shortest-term memory or attention. Immediately, the real explanation for the presumed anomaly will occur to all of us. ‘Your brother’ is clearly a real-world object known by the hearer. But the hearer does not know the fact that it is her brother who has called. In this sense, ‘your brother’ would be new information, exactly as corresponds to his intonation. This is obvious. But – and this, finally, is what we are interested in – the consequences of such an explanation have to be brought out. Specifically, the consequences regarding the knowledge the hearer has about the ‘has called’ have to be brought out. Clearly, the hearer knows about the call. However, by justifying the new information about the ‘your brother’ element, we have underlined that the hearer had scant knowledge about the ‘has called’. This scant knowledge, this mixture of knowledge and ignorance, is, of course, what is needed to ask questions. Plato (Meno’s paradox): “A man cannot search either for what he knows or for what he does not know. He cannot search for what he knows: since he knows it, there is no need to search. He cannot search for what he does not know, for he does not know what to look for.” All this is clear as regards the question. What I wish to suggest at this point is that the theme or topic of the predication also has to be defined along these lines. 
The scant knowledge would be not only on the lips of the person asking the question, it would also be on the lips of whoever converts this scant knowledge into the theme of the predication of response. And beyond predications of response or reply, it would also be in the theme of any predication where theme and rheme were present. Let us return to our example. In the first part of that predication, that is, in the ‘has called’, the speaker would be managing the cognitive load he attributes to the hearer and not the cognitive load he might give to the ‘has called’. If this speaker were speaking to addressees who know the identity of the caller, then this speaker, in order to refer to the phone call episode, would find the subject pronoun ‘he’ sufficient (e.g. ‘He has called at 12 o’clock’); ‘your brother’ would not be necessary. There would be no need to clarify further, either for the speaker himself or for these addressees. The greater or lesser need for clarification reflects the more or less scant cognitive content involved in the ‘has called’. In order to put forward our hypothesis we have chosen to analyse ‘Ha llamado, (pues) tu hermano’. This sentence, as we have said, presents one of the most eye-catching difficulties for the traditional definition of ‘theme’. But there are other difficulties that may also be resolved with our hypothesis about theme.


16.2.2 On the heels of another example. The different positions that can be taken up when faced with this type of difficulty

Let us consider the question ‘¿Y Manolo?’ (in English, ‘What about Manolo?’; Lit: ‘And Manolo?’) and the following possible replies. 1) ‘Ya se ha ido’ (in English, ‘He has left already’; Lit: ‘Already he has gone’); 2) ‘Manolo, (pues) ya se ha ido’ (in English, ‘Manolo has left already’; Lit: ‘Manolo, already he has gone’), and 3) ‘Ya se ha ido Manolo’ (in English, ‘Manolo has left already’; Lit: ‘Already has gone Manolo’).2 The point that interests me is the comparison of the last two replies. The only differences between them are the word order and intonation. In the third reply, the intonation would be classed as a stressed ‘rheme-only structure’. In other words, from the point of view of intonation, this reply is only rheme. Clearly, here the rheme is longer and more complex than in the brief ‘Ya se ha ido’. Nevertheless, thanks precisely to the stress, this complexity of elements is not reflected in any bipartition of intonation.3 In the second, on the other hand, the intonation is classic minor, where ‘Manolo’ appears as theme. This status of ‘Manolo’ as theme is exactly what we would expect. The elements copied from the interlocutor’s previous question present the most paradigmatic type of theme. However, we must then explain why, in the other reply, the same ‘Manolo’ element, in a practically identical reply to just the same question, has, nevertheless, no theme intonation. Why in the stressed ‘rheme-only structure’ is ‘Manolo’ operating as part of the rheme? This then is the question. One possibility would be to view intonation as a last-minute retouching that does not necessarily have to agree with the cognitive-communicative structuring.
In other words, under this possibility, the consistency of the two criteria, the intonational and the cognitive-communicative, would not have to be viewed as a totally reliable rule, even though it served at the beginning to raise the alert about the theme and rheme concepts. Intonation might at times have been only a declamatory whim. This position, as I say, is a possibility that, in principle, cannot be rejected. However, I shall not adopt it in any way. Intonation, rather than an unduly exaggerated question, appears to have been ignored, and undersold by grammar’s concentration on writing.4 There is evidence (in studies on speech errors) to indicate that the formation of intonation blocks is primary and fundamental to the process of linguistic production (see above, 16.1). Likewise, “artificial grammar learning work with adults indicates that participants readily segment statistically coherent words inside prosodic contours, but not spanning two contours.” (Gervain & Mehler [2010, p. 206]). In addition, and in particular, given that the cognitive-communicative criterion is not overly clear in the bibliography, the decision to spurn evidence on intonation does not appear particularly sensible.

2. In Spanish the order of words is more flexible than in English. Thus, discourse-syntax can be more easily pointed out with Spanish examples.

3. Casielles & Progovac (2010) focus on these complex rheme-only structures, “which in Spanish present the order Verb-subject”, and correctly observe that “these statements do not separate the entity from the event, but merely express a state of affairs, where a new situation is presented as a whole”. But I disagree with these authors when they propose that “the VS structures under consideration are evolutionarily more primary”. In my view, if syntactic duality (agent-verb) has turned into one intonational unit, this transformation has required an additional and effortful process. In addition, we know that these VS structures are normally produced without any previous message of the hearer (although it is not the case in the example), and also that, in Vygotsky’s (1934) words, dialogue (or, as I would prefer to say, dialogically shaped compositionality) precedes monologue (monological compositionality). Infra, in Chapter 18 we will study these structures involving verbs of (dis)appearance.

4. I have already been fervent in my praise of intonation in previous chapters. Intonation carries, I said, the primary communicative force. This role would occur both in holophrastic and in syntactic language, although it would be more essential in the former. It would be useful to add at this point that intonation plays another role, equally as important, which is exclusive to syntactic language. Intonation would become a bond of syntactic unity, both on the level of the complete sentence and, even more interestingly, on the level of the blocks that go to make up the sentence. When each intonation pattern is applied to the set of words in a block or sentence, it succeeds in converting the set in question into a perceptual unit (an auditory Gestalt), even though each of the words is an independently addressable unit. In this way, we with our cognitive processes are moved by intonation (Cf. Schögler & Trevarthen [2007, p. 286] on being moved by “an intersubjective timeframe or core sense of time-in-movement”.) This intonation is, therefore, a magnificent and accessible window into the mental processes that underlie syntactic production. None of these are new statements, but we do not perhaps draw out all the possible conclusions. Why should the intonatory bond be so important if the plurality of elements occurred already in pre-linguistic thought? (Or, more precisely, why, if such plurality existed previously, would the bonding function be so important as to impose itself on even the most strained objective of pretending to be beginning a longer block than what is intended? Not even the best actor in the world could do this. He would be incapable of adopting the appropriate intonation for the feigned block. It is easy to feign anything in language, except this.) The moral I would take from this, as you know, is that the intonatory bond is crucial for the plurality of meanings to become a composition, a high-order unit. And further than this, of course, that there is no syntactic compositionality or ‘argument structure’ outside language – language without further qualification, since I do not need to differentiate it from any supposed mentalese. I would like to add a vaguer and more adventurous insinuation about intonation. Since I read Calvin and his obsession with throwing (see, for example, Calvin & Bickerton [2000]), it has occurred to me that throwing would be analogous to intonation, and not as Calvin claims, to the other component of language. The length of the intonation curve has to be calibrated. In language, this length will depend on how many and which words have to be bonded by this curve. In pre-linguistic communication, it may have been an iconic reflection of something, perhaps the distance the addressee is from the speaker. But always, as we have seen, there is no other option once a specific intonation curve has begun. This type of movement is ballistic and cannot be retouched once it has set in motion. It is similar in this way to the ballistic movement of a missile. Can something useful be taken from this similarity that has come into our hands? The only thing that occurs to me is that the shoulder movements, which are related to throwing, are under the control of the ipsilateral hemisphere, and also that in ontogeny (see Rönnqvist [2003]) the right preference in arm and shoulder movements precedes hand preferences. Would this have been the original trigger for the present allocation of intonation to the right hemisphere? Later, and as a result of this, the whole group of hand and articulatory-phonetic movements, which are more nuanced compared to those of the arms and intonation, and which are, in addition, self-perceptible and, consequently, more easily imitable, would be under the control of the other hemisphere. Of course, we would have to continue asking why throwing was connected to the right shoulder. Following the superiority of the right hand? (Remember that it has been stated on occasion that the preferred manual grasping in non-human primates would be on the right.) The convenience of not involving the heart in the intensity of throwing? But we should come back from our excursion, from the rash adventure into which Calvin has pushed us! These obsessions are contagious, clearly.

Another possibility is the one that took shape in the so-called Second Prague School of Linguistics, which hypothesised that we would have to consider a graduated range, instead of only the two concepts of theme and rheme. Some themes would meet both criteria and would merit the label ‘strong-sense theme’; other themes would only meet one or other criterion, and would be ‘weak-sense themes’. The same would apply for rheme (we saw the latter in 16.2.1). With this crumbling of the initial concepts, many counter-examples can be successfully faced. This is true. But it is also true that, in the work of the Second Prague School, the attraction of the initial dichotomy has disappeared. The ‘thing spoken about/thing spoken’ hinge, which always, in any grammatical theory, had been intuited as the nucleus of predication, is ignored here. New terms only appear to respond to the need to patch up a building that is falling down. Now we have descriptions of various parts of the intonation curve, and also descriptions of referents at different ‘mental distances’ from the hearer. If one criterion is not connected to the other, that is, if the cognitive-communicative criterion does not correspond to a specific intonation curve, we have nothing more than descriptions of elements that were already obvious beforehand. It is doubtful that the labels hypothesised by the Second Prague School can be called conceptual coinings.
There is still one other position. It is possible, in fact, to attempt a reformulation of the theory. This latter is, of course, what I am following. The theme is not simply an element that the speaker judges the hearer to know. This is clearly an essential, but not sufficient, requirement for the words belonging to a predication to be the theme. These words must carry the (incomplete, incorrect or out of date) knowledge that, in the speaker’s judgement, the hearer has about the topic. Therefore, in those two similar responses that might be given to the very same question in the very same circumstances, the status of ‘Manolo’ could, even so, be different in each reply. After hearing the question ‘What about Manolo?’, two almost identical replies were possible. On the one hand, the bipartite intonation reply, ‘Manolo, (pues) ya se ha ido’ (in English, ‘Manolo has left already’; Lit: ‘Manolo, already has gone’). On the other hand, the stressed unipartite intonation reply ‘Ya se ha ido Manolo’ (in English, ‘Manolo has left already’; Lit: ‘Already has gone Manolo’). This was, we should remember, the example we were examining.


In the second, ‘Manolo’ cannot be theme, because it comes after the information ‘ya se ha ido’. As a result, this ‘Manolo’ is no longer copying the cognitive level of the hearer at the beginning. A ‘Manolo’ that already includes the feature of having left cannot operate as theme. In contrast, in the bipartite reply, ‘Manolo’ is the echo at all the levels (or, underlining the aspect that matters to us, the echo on the cognitive level) of the ‘Manolo’ which appeared in the hearer’s question. Here, the speaker takes up the platform offered by the hearer, and completes or updates it. Undoubtedly, it is for this reason that the bipartite reply sounds more polite than the other. While the unipartite stressed reply rejects the platform offered by the hearer and, thus, does not address the specific needs of the particular hearer, the bipartite reply, on the other hand, sounds to the addressee as though it were made just for him. These two characteristics with which I have defined the theme would also explain the behaviour of ‘there is’ and ‘exists’. We said – all authors have always said – that the theme has to be some object or aspect known by the hearer (in the view, clearly, of the speaker). This explains the pattern of uses that may or may not occur with ‘there is’. The verb ‘there is’ is a presentative verb, that is, a verb that presents something not yet known to the hearer. For this reason, ‘*There is Julius Caesar’ is not grammatical, as was already stressed by Frege. For the same reason, there can be no themes with this verb. The verb ‘exist’, in contrast, is used with themes. If these are to be followed by an ‘exists’ in the positive, they will have to be elements whose reality the speaker is willing to state. But they have also to be elements about whose reality the hearer needs confirmation.5 Here we see the second of the characteristics with which my hypothesis has characterised the theme. 
The theme corresponds not to the mental content of the speaker, but to that of the hearer.

5. Note that this is met even in an example as artificial as the Cartesian Cogito. Descartes’ ‘I’ had been placed in doubt by Descartes himself in his role as a theorist. See also p. 222, Chapter 14, note 2 above.

16.3 Dishonest predication: An interesting clarification

A clarification that must be added to my hypothesis about the thematic element is the one we shall obtain when we address dishonest predications (remember supra, 15.1.1). In this type of communication, it is still the case that the speaker wishes to change the hearer’s level of knowledge. However, in these cases, the speaker’s aim is not to make the knowledge of the hearer converge with his own, i.e. the speaker’s, but, on the contrary, to drive them apart. To this difference in communicative objective we must add another difference that concerns the very nature of the thematic element. The mental content of the hearer that one wishes to transform here is only reality itself from the dishonest speaker’s point of view. When the two beliefs, the speaker’s belief and the belief the speaker attributes to the addressee, coincide, the latter does not really constitute a mental content for the speaker. (It is only for someone outside these two interlocutors and who does not share the cognitive limitations of this common belief, only for this possible third person, that this common belief could really be perceived as a belief or mental content.)

Would it then be the case that my hypothesis about the thematic element is not fulfilled in dishonest predications? I believe it continues to be fulfilled. The content on which the transformation will be performed is not the speaker’s. The dishonest speaker will still maintain his prior level of knowledge after issuing his lie. As the object of transformation, the mental content of the hearer has, therefore, to be separated and isolated from reality (or, put another way, separated and isolated from one’s own mental content). It is true that in dishonest predications this separation (or duality) is not determined by any initial difference between the speaker’s and hearer’s beliefs. Nevertheless, when it comes to transformation, the mental archive of the hearer has to be isolated from one’s own by a barrier which is no less clear, but, in fact, more emphatic than the one operating in honest predicative communication. Undoubtedly, building this at once unmotivated and emphatic barrier is a relatively demanding cognitive task. But this is not something that will surprise us, quite the opposite, in fact. Producing dishonest predications, however much moral censure it may have received, has never been considered a bad cognitive symptom (see, e.g., La Frenière [1988]; Peskin [1992]; Núñez & Rivière [1994]).6

Let us summarise our definition of theme. This element must no longer be defined as an object that is known by the hearer, but as the level of the hearer’s own knowledge about that object.
As a result, in the theme of his predication, the speaker would be handling a belief he judges to be false, insufficient or not up-to-date (or – to cover the case of lies – simply a belief that he, the speaker, wishes to transform). The first part of his predication would be for the speaker a mental state that he is perceiving at that moment as exactly that mental state. The theme appears thus as an extremely complex meaning, which is no longer limited to pointing to its correspondent in the world but which involves a second-order mental state.

6. In fact, the better constructed and more convincing a lie is, the greater its cognitive demands will be. In this sense, presenting the dishonest information as casually delivered is an efficient refinement. Instead of telling the lie ‘It hurts’, it is better simply to complain; instead of stating the dishonest ‘There are several people in the house with me’, it is better to shout out a vocative, ‘Pepe!’, ‘Antonio!’ Or, to use the procedures of rhetoric, making one’s own point as an assertion is much less efficient than asking a ‘rhetorical question’ at the appropriate moment: faced with this question the audience will be driven to choose a response, even if subvocally, and the content of that response will thus be stored in each hearer as his own and not as received. The hearer storing the dishonest content as his own, be it as a response, be it as an inference, but always as one’s own: this is the key aspect of sophisticated lies. To evaluate all this sophistication, I will mention the example of a crude and inefficient lie told by a child. The child’s mother had categorically forbidden the child to have the cat in his bed, and the child, when his mother came into the bedroom, hurried to say, ‘The cat isn’t here’. The lie in this case was not only inefficient but counterproductive, since the mother was not thinking about the cat at all at that moment. (A similar commentary can be found in Horn [1989].)

16.4 Is my reformulation of the concept of ‘theme’ too complex and challenging?

16.4.1 Predication and metabelief: Is the second of this pair really the more complex element?

My formulation of the concept of ‘theme’ may appear overly complex. However, if we examine the role given to this element under this hypothesis, it cannot be said that such complexity is inappropriate. If the predicative communication function and, alongside it, the very possibility of syntax originally appear at the same time as, and as a consequence of, the appearance of theme, then we cannot be surprised that theme should be the product of high-level human capacities. The perception of false beliefs of others has no reason to entail a more complex or higher-level process than the one involved in the origins of syntactic and predicative language. It might be the opposite, instead. This would seem almost compulsory as soon as we take into account that the ability to perceive beliefs of others is not limited to third-person beliefs. Although typically neglected in most studies on ‘theory of mind’, second-person beliefs would be at the origins of such ability. This is what, in my opinion, children’s first predications show (see supra, 11.4, and also 13.2). Two aspects explain this original status of ‘second-person’ belief. On the one hand, as we have already said above, the conception of a radically different self, or, put another way, of a new centre in one’s own mind, is easier if this self is communicating with oneself. It is precisely the actions or looks from the second person, that is, the you that is addressing me, that would have caused this second mental centre to appear. It would have been, we hypothesised above, with the attribution to the eyes of others of a visual perception that includes me that the old resource of expectations ceased to be sufficient to understand conspecifics.
On the other hand, second-person belief does not need to be clarified in the language of whoever will answer or reply to that belief. It is this second aspect that will occupy us now. In order to analyse it properly, we must contrast it with what happens in third-person beliefs.

16.4.2 The hearer’s belief: Included, but not displayed, in the predication

Let us assume that, whether it be after receiving a linguistic message from Manolo (e.g. ‘I’m in a hurry. Where are the car keys?’), or whether it be after seeing Manolo act in such a way as to reveal his false belief, I perceive that Manolo believes incorrectly that the car is running well. What I will then have to say to Manolo is ‘The car isn’t
working’. To do so, we have hypothesised, I will be using as theme of my predication Manolo’s incorrect cognitive load about the car; that is, the car according to Manolo, the car as running well at that moment. I will have to correct and modify this cognitive load. On the other hand, what I absolutely do not have to do in my predicative communication to Manolo is to state clearly his incorrect belief. Stating this clearly to Manolo would be an absurd task. By contrast, things are very different if I want to tell another person about Manolo’s incorrect belief. Then, for me to achieve this goal, the fact that I use the designation ‘the car’ with the cognitive load that Manolo inappropriately has about the car will not be sufficient. The new hearer would have no reason to understand ‘the car’ in this way. Therefore, in reported speech (i.e. when I tell another person about Manolo’s false beliefs), I will have to turn Manolo’s belief into a predication, a subordinate predication, to be specific: “Manolo believes the car is in good condition”. What am I seeking with such obvious observations? I wish to show why we find it so difficult to accept that the simple designation ‘the car’ in the predication addressed to Manolo would involve a belief. The theme, or second-person belief, is no less belief than the subordinated predication that follows ‘Manolo believes’. Nevertheless, since the verb ‘to believe’ is rarely used for second-person beliefs, we are tempted to forget them. The error in this conclusion will be perfectly clear if we consider the verb ‘to say’. This verb is not at all necessary to speak, to say things about the world. It only became necessary (and only came into existence, I would suggest) when the attempt was made to tell what someone else had said. Before this, there was no verb ‘to say’, but, naturally, this did not prevent many things really being said. 
Equally, although the verb ‘to believe’ only appeared to tell one person about the beliefs of another person, of a third person, we do not therefore have to conclude that before this there were no beliefs.7

But we need to make the necessary clarifications. What I am seeking to show is not the obvious fact that there were beliefs before the origin of the verb ‘to believe’. As I understand the term belief, it would occur even in animals.8 Nor is it the hypothesis that there existed the exclusively human capacity of perceiving beliefs of others. We have already hypothesised above that such capacity existed before syntax originated. My only point here rests on the fact that, before the verb ‘to believe’ appeared, beliefs of others were already being handled in language. I would be handling my hearer’s beliefs in language without needing to state them. ‘The car’, in the message I send to Manolo (‘The car isn’t working’), would be loaded with Manolo’s belief. Of course, if we theorists (taking third-person perspective) wish to specify which belief Manolo has about the car, we will necessarily have to change it into the false predication, ‘The car is running fine’. But this does not authorise this predication to be placed either in the mind of Manolo himself or in the mind of the speaker who will put him right. Those of us wagering on a rich pre-linguistic (or, more specifically, pre-compositional) perception can practice an ascetic use of the explicit statement-type resources. Predication is only observed in the language of human beings. Therefore, to claim the structure of predication as underlying any content of perception, that is, to go beyond the limits of observation, requires good reasons to be put forward. As long as these are not put forward, the decision to reduce predication to human beings and to language might be the most prudent strategy. Here we are being guided by this prudence – in addition, clearly, to attempting to connect the hypothesis with evidence from other fields.

7. Benveniste (1958) observed that the verb ‘to believe’ behaves differently in the first person than in the other persons. In this same article, he observes that this difference also occurs in the verbs whose meanings imply ‘to say’ (more exactly, Benveniste focused on ‘I swear’/‘He swears’) and this was the first appearance of the question that in time would come to be known as performative verbs and about which so much ink has flowed. In contrast, the internal asymmetry of the verb ‘to believe’ has gone considerably less noticed. ‘Joe Bloggs thinks’ accepts subordinated sentences that are the content of Joe Bloggs’s immediate and primary perceptions or knowledge; the speaker needs only to have knowledge that there has been an elaborate plot to deceive Joe Bloggs. In contrast, my immediate and primary perception or knowledge cannot come after ‘I believe’. That is, with the first person the verb ‘to believe’ necessarily has to incorporate a certain distance or lack of immediacy between the believer and the beliefs. Likewise, and even more obviously, it is only with the first person that the subordinated sentence cannot be disbelieved by the speaker. These would be the semantic innovations that the use of the morphological possibility of complete conjugation would have meant for a verb which, like ‘to believe’, would have originated only to be used in the third person (see Bejarano [1994]).

8. Premack & Premack (2003) insist that to speak about beliefs in relation to the explanation of how an individual’s mind operates is an abuse of terminology. ‘Belief + desire = action’, the classic formulation of the ‘philosophy of mind’, is the target of their protests. In this formulation, ‘belief’ should be replaced by ‘perception, or memory, or inference’. The term ‘belief’ should only be used on the social level, when an individual perceives another’s belief. I am entirely in agreement with the Premacks, although, like them, I adopt the sacred terminology in the end. I wish to add that Davidson had already put forward a similar point when he proposed the term ‘to have the concept of belief’; see infra, 18.2.3.

16.5 Looking toward the next section

Let us now move on from all this. At this point we need to acknowledge how little we have achieved. Since we set out to explain the origin of syntax, we have to acknowledge that we have achieved almost nothing. With the theme/rheme hinge we have indeed come, as I suggested, as far as the compositionality of meaning, but we have not at all reached true grammatical syntax. In true syntax, a word is a noun forever, or a verb forever, etc. On the other hand, theme and rheme are roles within discourse that have no effect on the word outside the discourse under consideration. With regard to the origins of syntax, we find a general problem for all the theories and a particular problem for my hypothesis. Let us look at the general one. How, from the input, does the child become an expert in syntax and all its details? Do we or do we not have to postulate properly syntactic innate abilities? Those who adopt one or other of the possible responses will also run into a further task. Either they will have to try to explain how those capacities would have come to human genes: this is the task that comes straight at the innatists. Or they will have to build a model of brain processing that is sufficiently sophisticated as to allow learning of syntax to be derived simply from the linguistic input received: this is the unresolved question for those of us who reject generativist innatism. (In the words of Lieven [2010], an indisputably antigenerativist author: “In the usage-based approach, much remains to be explained. The development of abstraction and of the interaction between different parts of the grammar in arriving at the adult system is very under-theorised”.)

But it is not those further problems I am referring to when I talk about the particular problem for my hypothesis. What is the nature of this particular problem? I do not accept that the compositionality of meaning is the reflection of a pre-linguistic perceptual compositionality. The concepts of action, agent etc. would not be in any way attentional units before language was established. A perception embraces a multitude of details, but, as long as one remains inside that perception, no detail can be addressed in itself. We have already seen all this in previous chapters. Now let us bring out the consequence that interests us here. The historical origin of the categories of noun or verb, and other ‘parts of the sentence’, appears in our net of suggestions as a much harder problem than it appears in the theories that accept compositionality prior to language. As far as concerns the acquisition of these categories by children, my hypothesis only comes up against the problem common to all those who choose to reject generativism.
But as regards the historical origin, there is a question that appears just when one has decided to reject a true pre-linguistic compositionality. According to this latest rejection, there would have been no meaning intrinsically shaped for syntax either in the language of the Holophrastic Era or even in that of simple compositionality of theme/rheme. The roles of theme and rheme are discourse roles, that is, they are not at all intrinsic to meaning in itself. How then did the noun and verb arise, or any other part of the sentence, for that matter? It has to be acknowledged that my suggestion about the genesis of syntax and, more specifically, about the absence of compositionality in pre-linguistic perception, has extended rather more than shortened what had to be explained. My vision of ‘theme as a mental state of the hearer’ has solved, certainly, one problem, but this problem, apart from only having been solved in part, is a problem I had added. Things being as they are, it is my duty at least to face this problem, or more specifically, to face the part of it that is not yet solved.

In the next Section, we will address first how the learner’s meaning is turned into adult meaning. We shall take the view that word meaning, far from coinciding with what introspection offers, would include all the specific links within which the individual has received or produced the meaning in question. Each word is thus a highly complex network of links, an enormous heap of data. It is on this that the brain processing would operate, the brain processing which would succeed in extracting the
syntactic categories, or, put another way, the prefixed abstract links that define the nature of the verb, the noun or any other part of the sentence. As we can see, from the perspective of the first problem, these considerations are mere whistling in the darkness. We have to acknowledge that the unresolved question for the antigenerativists remains unresolved. Nevertheless, that whistling in the darkness might contribute something with regard to our second problem. It will allow us to reformulate the goal of the problem. The historical genesis of grammaticalised syntax would have begun with the historical genesis of a few prefixed abstract links. How could these abstract prefixed links have originated? It seems to me that they absolutely could not have originated merely from the theme and rheme hinge. The question we will have to ask ourselves will be the following: From what, historically, might these abstract links have begun to be formed?

section six

From original to present-day predication Links and grammatical syntax

chapter 17

Meaning and the different types of link

17.1 Opening out the contrast between word and symbol

The links that a word can maintain should not be understood as being a consequence or derivation of the meaning of a word. These links would be just one part of its meaning. More exactly, they would be just that part which is most specifically linguistic. Linguistic meaning, that is, each word, would essentially and primarily be a nexus of links with other words. Clearly, the symbolic ability is a crucial requirement of language. Nevertheless, in full language, meaning goes well beyond the symbolic ability. This has usually been underscored for words learned at school.1 The phenomenon has also been supported by the confirmation that even congenitally blind children can acquire and use a colour vocabulary. Landau and Gleitman (1985) and Landau (2000), having observed the child Killi, argue that much of the semantics of words can be acquired by language internal means, i.e. through distributional evidence and syntactic context. Certainly, in all these cases there is no doubt about the fundamental role played by the links in the meaning. However, in my view that crucial role can be observed in all words. As a result, the criteria for judging whether or not a sign is similar to the full-language meaning would have to be formulated in this way. There is no nexus of links present? Then we will not be looking at a linguistic meaning. There is? Then this will be an authentic linguistic meaning, even in cases where any evocative or referential semantic load is lacking.

There is a class of words that falls outside this definition of a linguistic meaning. I am referring, clearly, to interjections, which maintain no links with any other words. However, is an interjection really a word? In traditional grammars, they are classified as one of the seven parts of the sentence. Obviously, however, interjections are never integrated into sentences.
In addition, the communicative value of an interjection depends almost exclusively on a factor that is as old in phylogenesis as emotional intonation.2 As a result of all this, the fact that interjections do not fit this description of linguistic meaning only serves, in short, to support that description. Genuine linguistic meaning includes these links.

1. See the two different kinds of learning mentioned by Vygotsky (1934). School age children and adults generally acquire words via incidental learning situations, often involving reading: Jenkins et al. (1984), Nagy et al. (1985) and Sternberg (1987).

2. Now that we have noted that interjections are alien to syntax and semantics, we must remember the phonetic code. What is there of this aspect in interjection? On occasions, the description ‘shout that does not seem human’ is used to underline the enormous emotional force of a shout. This is inexact. We human beings are conditioned by the phonetic code of our native language, and we cannot move outside it. It is well known that even woof or miaow are different in different languages. Certainly, it has to be acknowledged that the inexact description of shouts of pain, the exaggeration so often used, for example, in romantic dramas, was based on an idea that was correct: the greater the emotional charge, the greater the independence as regards everything that constitutes the learned aspects of language. However, returning to the question this note has posed, it must be concluded that the peculiarity of interjections does not completely reach the phonetic aspect. (Not completely, it is true. But “Ouch! and other interjections such as Ah!, Ooh!, Wow! or Yuck!, are usually produced with sudden intakes of breath, which is the opposite of ordinary talk” – Yule, 2006.) As regards communication through interjections, it will be convenient to remember the paragraph in which Vygotsky, in Thought and Language, gives an example of the different meanings the same rude interjection can take depending on its intonation. Those meanings are laid out in a very attractive and convincing way. We are being offered there (we say to ourselves admiringly) the very reality of each one of these uses. Clearly, I partake of such praise, but we ought not to forget that in the drunk protagonists of these examples there were only intonations and a word removed from its original meaning.

What might we say about how children acquire these links? I would focus particularly on research such as that by Tomasello and his more or less direct associates (see, for example, Tomasello [1992] and [2003], Israel, Johnson & Brooks [2001], Abbot-Smith, Lieven & Tomasello [2004], Dabrowska & Lieven [2005], Lieven [2008], Lieven et al. [2009]). All these studies extend those by Nelson (Nelson [1985], for example). The ever more reliable conclusion of those studies on learning rests on the child not working, at first, with categories such as Verb or Transitive Verb, but with particular specific constructions which it would gradually generalise. The web of similarities woven by the brain would begin at a very specific level. Long before it is possible to achieve a link such as, for example, the link that requires verbs to have a subject, links have to have been detected between one specific word and another. Syntactic categories would originate in something extremely specific. “In ‘Kick X’, X is the thing one kicks”: this is the category at the beginning. At the outset, these “island constructions” are absolutely not taken as an abstract model. This is manifested in the inordinately long period of time in which children produce a certain type of syntactic combination only with one or two selected lexemes. Only gradually would it be noticed that word A is sometimes followed by word B and at other times by word C, and from there a similarity would come to be established between words B and C. Such distributional nexuses are the extremely humble beginnings shown to us by studies on language acquisition. In short, there is increasing evidence that children’s learning of syntactic constructions is extremely closely related to characteristics of the input (Lieven, Behrens, Speares, & Tomasello [2003]; Rowland [2007]). As Vogt & Lieven (2010, p. 23) have summarised: “There will initially be fully concrete chunks of speech. As development proceeds, these constructions become internally analyzed and related to each other in a network of constructions”. See also Langacker (2009). Ninio (2006) focuses on this same approach in a slightly different way. Earlier (Ninio [1999]), she stressed that the pathbreaking verbs (i.e., verbs that begin the acquisition of a novel syntactic rule) tend to be very frequent and very generic verbs: They are not fortuitously chosen. In addition, in Ninio (1994, p. 2), we find a crisp formulation of the big question: “how do children come to acquire linguistic signs with such paradoxical meaning – words possessing a hole in them?”

How does the brain go from these beginnings to being able to produce or understand combinations it has not experienced before? The only thing that might be said in this regard is that we must not underestimate the distributional and statistical processing of which the brain is capable. It is true that implicit or statistical learning (the one and the other probably label the same reality: Perruchet & Pacton [2006]) has been studied and modelled for years without the models hypothesised being able, until now, to account for the levels of complexity reached by the brain. As a result, there is, for now, an enormous unresolved question. Do we rely on future progress in the study of cerebral processing? Would it be preferable to postulate a properly syntactic biological capacity? This is, as we have already said, a still current crossroads where the first option is the one I prefer. But we must now focus more closely on adult word meaning.
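
The humble distributional step described in this section, in which word A is sometimes followed by word B and at other times by word C, so that a similarity comes to be established between B and C, can be given a toy illustration. The Python sketch below is offered only as such: it is not taken from any of the studies cited, and the function names, the invented utterances and the choice of a Jaccard overlap as the measure of similarity are all assumptions made for the sake of the example. It claims nothing about how the brain actually carries out this kind of processing.

```python
from collections import defaultdict


def left_contexts(utterances):
    """Map each word to the set of words observed immediately before it."""
    contexts = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.lower().split()
        for left, right in zip(tokens, tokens[1:]):
            contexts[right].add(left)
    return contexts


def context_overlap(contexts, word_b, word_c):
    """Jaccard overlap (0.0 to 1.0) between the left contexts of two words."""
    seen_b = contexts.get(word_b, set())
    seen_c = contexts.get(word_c, set())
    union = seen_b | seen_c
    if not union:
        return 0.0
    return len(seen_b & seen_c) / len(union)


# Hypothetical toy input, invented for illustration only.
toy_input = ["kick ball", "kick door", "push ball", "push door", "eat apple"]
contexts = left_contexts(toy_input)

# 'ball' and 'door' have both been heard after 'kick' and after 'push'.
ball_door = context_overlap(contexts, "ball", "door")
# 'ball' and 'apple' have never been heard after the same word.
ball_apple = context_overlap(contexts, "ball", "apple")
```

In this toy input ‘ball’ and ‘door’ come out as fully similar, and ‘ball’ and ‘apple’ as not similar at all, solely on the basis of the specific words each has followed. Nothing resembling a category such as Verb or Noun is supplied in advance, which is the only point the sketch is meant to make.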

17.2 The innumerable speech episodes and the brain

We have not yet opened out sufficiently the complexity of meaning. Let us take note of the trajectory of learning glimpsed in research such as that by Tomasello or Ninio. What does it suggest to us about adult word links? In the individual mind, the meaning of a word would include, as preactivations raised by it, all the specific words with which it has appeared. Every occasion where it is used in production or reception, every specific speech episode, would bring forward particular and specific links that appear later within the preactivations raised by the meaning in question. Oppenheim, Dell & Schwartz (2010, p. 228): “Our model reflects a recent trend in cognition to link psycholinguistics with theories of learning and memory by developing accounts of how experience changes language processing”. Let us put this in other words. It is clear that the countless episodes throughout which a meaning has been acquired contribute to this final result, even if only for frequency computation. What we have to ask is how this contribution is achieved. I would reply that it is because all the episodes are included in the final result.3 In this view, meaning would be a giant cerebral edifice inaccessible to any introspective effort.

3. A large cloud of tokens: Certainly, this seems close to ‘exemplar theory’ (Pierrehumbert [2001]), or to Goldinger (1998). However, I am focusing only on token links and not on properties of a token that reflect the speaker’s voice, mood, speaking rate, etc.

Certainly, since the 1960s different semantic network models have been hypothesised: the Q model, models by Collins & Loftus (1975), or the Latent Semantic Analysis (proposed by Landauer et al. [1997, and 1998], and adopted by Kintsch [1998]) and many others. Nowadays, this makes up a specific discipline: I have found something of an overview of this question in Steyvers & Tenenbaum (2005). Mintz (2003), and Monaghan & Christiansen (2008) showed that trigrams (or more generally, frames surrounding the word) provide distributional information that the child uses to determine the grammatical categories within the language. See also Andrews & Vigliocco (2010). Nevertheless, I do not intend to go into this field; I am merely pointing to a general, and well-known, idea about the role of links in the meaning of words.4 My point is that, contrary to what introspection would suggest, semantic memory (or more specifically, adult semantic memory) of a word would have very many more links than episodic memory of its use on a particular occasion.5 Certainly this seems a counterintuitive affirmation. We need only think of a specific word in a text we are examining, and how clearly the two or three uses it may have had within that text are evoked. On the other hand, when we step out of those episodic memories, that is, when we focus on it using our semantic memory, the word’s links seem to disappear. However, in my view, all the links (all the links within which the word in question has been received since one began to learn it) would be included as possible preactivations in semantic memory. When he talks about perceptual or cognitive abstraction, Fuster (2003) insists on an idea that is possibly similar. He rejects the ‘pyramidal’ notion of cerebral networks, according to which general concepts would be represented at a specific point of the higher cortex.
Quite the opposite, Fuster says, cerebral networks tend to occupy more extensive space in the cortex as they penetrate more elevated hierarchical levels. More links, more cortical space, concepts that are more generalised: this, clearly, is the explanation I am thinking of. But let us leave these risky analogies to one side and return to the question of links.

4. Janda & Solovyev (2009) and Partington (2009) offer some information about the long history of this idea.

5. Needless to say, episodic links are important. See Barclay et al. (1974): Participants studied a critical word (“piano”) in the sentence ‘The man lifted the piano’. It was observed that recall was better with related cues (‘heavy’) than with unrelated cues (‘nice sound’). In these data, the power of recent (or, said otherwise, episodic) links is shown. Let us also pay attention to Metzing & Brennan (2003), who say that recent links from a word to its speaker should be included. Certainly, this ‘episodic link to the speaker’ would be in most cases lost in the ever growing crowd of links different from that type. However, in some cases, a word remains (whether it is in one recipient’s experiences, or in many recipients’) confined to a speaker or to a well-defined group of speakers. That is probably how the register (colloquial, poetic, etc.) of a word originates. Obviously, this is an example of the general rule: Semantic memory originates from all episodic links.

Chapter 17. Meaning and the different types of link

17.3 Meaning as a giant unconscious cerebral edifice

As you can see, I am talking about meanings in the individual. Clearly, the decision to look past individuals and take as the object of study the treasure common to the whole group was very useful to Linguistics. However, the time has surely come to address this on the level of individuals and their interactions, of brains and their comprehension and production episodes. Although prolonged interaction would happily succeed in making the meaning the same for all the members of the group, this result would derive from real processes that in some measure would be different in each individual. We must consider another, on this occasion more important, question. As we were suggesting above, the immense amount of information that has come to make up meaning can neither be evoked nor brought to introspective consciousness. The most important part of the meaning of a word would thus be at the margins of introspection. This is highly acceptable. It is not only a question of meaning and introspection being uncoupled with the second Wittgenstein, for example (Wittgenstein [1953]). Rather, in addition, nowadays, the separation between consciousness and cognitive processes is no longer only an accepted practice, but also a statement about the reality of the mind that it is difficult to dispute. It might be useful to cite Jackendoff (1996) at this point (see also Jackendoff [2007]). This author contrasts thought, which would be unconscious, with linguistic form, which would be conscious. I will not follow this terminology. It is my view that we should begin by differentiating between different forms of thought. However, I believe there is relatively little distance between what was suggested above and Jackendoff’s thesis. According to what I said above, the part in the cerebral edifice of meaning which escapes introspection consists of links. 
The set of all the links in which a word has been heard or produced would be a very important cognitive resource (important not only in achieving the admirably fluid processes through which syntactic combinations are produced and understood, but perhaps also even in acting as a foundation for many of the creative problem-solving tasks). We could say, therefore, that the unconscious part of meaning coincides with the part we might call most similar to thought. This brings us close to Jackendoff’s formula. But we need to take a little longer over this question of the relationship between meaning and introspection. Within the vision I have suggested of meaning as a giant cerebral edifice, we can explain easily why introspection of many linguistic meanings is so disappointing. If we ask ourselves about the meaning of an element of the so-called (in medieval Logic, at least) syncategorematic elements, namely, a preposition or a conjunction, there is little we can say. The introspective load we can attach to such linguistic meaning is practically nil. However, this does not mean there is no linguistic meaning present, quite the opposite, in fact. These merely relational meanings are the jewel in the crown, the culmination of syntactic language (Hurford [2002]). Why then such introspective inanity? Let us begin by opening out the question. Links are very important cognitive resources on which the production and reception of language depend. In themselves,

however, they would remain outside introspection. Not only would it be impossible to raise this enormous bundle of data up as far as attention, but it would in fact be harmful were this to occur, since it would make the normal use of language impossible. A word’s links would only be perceivable by their fruit, that is, by the easy production or comprehension of the words following it. What then makes up the content that, in the case of some meanings, is offered to us by introspection? When a meaning contains a part that is accessible to introspection, that part consists of the evocation of the element common to the scenes with which the meaning was often associated. Certainly, even those nouns, or verbs or adjectives, which have a more perceivable meaning will often be used without such a scenario, for example, when denying the presence of this content in the environment, or when asking, or when they are used in metalinguistic statements... However, there is a decisive factor that favours that scenario. Children, in order to learn their first meanings, absolutely need to have received them in the context of a scene that includes the real-world element correlative of that meaning (supra, 9.3). It turns out, therefore, that for a specific type of meanings, the content offered to us by introspection of those meanings is made up of the evocation of this element. However, this evocable element does not occur, I repeat, in syncategorematic words. The situations in which a term of this type has been learned are so entirely different from each other from the point of view of perception that it is impossible to extract any evocable element from introspection. In addition, there are no habitual links for a syncategorematic term: its past links are so numerous, varied and scattered that, as a result of the so-called ‘fan effect’, any possibility of being conscious would be cancelled for each of them. 
Hence the vacuum offered us by introspection when it is applied to these meanings. In these last paragraphs, we have defended the enormous complexity in the meaning of syncategorematic terms. In so doing, we have opposed the view that introspection offers about these terms. In respect of them, therefore, introspection would not, in fact, be reliable. But what would it be with regard to the others? We should now address meanings we might consider to be privileged as regards their openness to introspection. With these meanings, introspection of the isolated term offers us, we have already said, an evocable content. But here, too, introspection would have to be distrusted. There would be no reason at all why the content offered to us by introspection when we savour an isolated term should coincide with the meaning accompanying the normal use of language. In short, I would say that evocation would be used, in normal usage, only for complete sentences, or, at least, complete blocks within the sentence. What use would there be for an evocation associated with the isolated term ‘dog’, if on some occasions at the end of sentences we have to evoke a dog barking furiously, and on others, one asleep? When we wonder about the meaning of ‘dog’, then it makes sense to evoke some ‘dog’ image. Any image will do. But, on the other hand, when we are the recipients of a complete sentence, it would be better to wait and see if the dog

is sleeping or running. Throwing oneself into evocation term by term would be a completely uneconomical strategy. We must, therefore, reject introspection if we wish to give an account of the normal use of meaning. This, it goes without saying, is a classic statement if ever there was one (see for example Alston [1964]), and one on which there is no need for us to insist (we will see a more concrete and perhaps more interesting example against introspection later, in 18.7.4). Thus, we must pay attention to links and not only to evocation. Andrews, Vigliocco & Vinson (2009) distinguish “distributional meaning” (that is, links) and “experiential meaning” (that is, evocation). In general, a meaning is more elaborate and sophisticated the more it rests on the links. Evocation and expressive force are far from being the truly linguistic nucleus of the word, that is, of the word as contrasted with what in previous chapters we simply called a symbol. Emptying or bleaching is always the key to linguistic progress. To a greater or lesser degree, bleaching is present in any meaning in syntactic language.

17.4 ‘Typical links’: How can these intervene in a unified explanation of different phenomena?

We have just suggested that the depth to which a word is processed results from the activation of its links, and not from the vividness or detail of the evocation. The evocation caused by the isolated term would in fact be inversely proportional to the appropriate linguistic processing; in short, it would be harmful to good linguistic comprehension. Linguistic meaning is, above all, a nexus of links, even though through introspection we may not see it like that. At this point, however, we must focus on the different types of links. From the entire set, which equates to the mass of all the occasions in which the word has been used throughout an individual’s life, we would come first, via a process of growing abstraction, to the typical links, and subsequently to the grammatically syntactic links, now unhooked from semantics. As I said above, the view of meaning as a nexus of links was hypothesised a long time ago, and does not exactly need to be defended. However, I would like to suggest that its explanatory potential has still not been fully utilised. To do so, we must begin by addressing what we have called typical links, and which might also be called habitual syntagmatic links. Some of a word’s particular and specific links are repeated on many of the occasions in which the word is used, and would thus have a special prevalence within the cerebral edifice of meaning. What I am suggesting is that these links (typical links) would be involved in the explanation of the following three phenomena – poetic metaphor, forgetting proper nouns, and the creative use of sayings and other ‘langue syntagms’. These allusions will be no more than allusions. I have no intention whatsoever of focusing on these questions in themselves, although I will at least indicate the direction of my thinking when I recommend a common explicative element for all of them.

17.4.1 Metaphors and tautologies

The most typical links of a word used metaphorically would be crucial for the resulting poetic metaphor. Therefore, when one invokes only the similarity between the reference of the meaning employed metaphorically and what would have been the literal meaning, the communicative function of the metaphor would remain unexplained. On the other hand, if we look to typical links to find an explanation, the typical links of the metaphor term (the ‘ruby’ term) counterbalance the typical links of what would have been the literal term (that is, the links which have no connection with splendour, and which are the typical links for ‘mouth’ and ‘lips’). As a result of this alone, the view of the reality in question that comes to be transmitted (of the loved-one’s lips, to continue with the well-worn example) can escape from the silencing effect which the literal term (‘mouth’ or ‘lips’, with its grey, splendourless links) threatens to impose on any predicate attempting to praise it (remember the similar silencing which ‘soup’ imposes on ‘cold’, ‘cardinal’ on ‘young’, or ‘elephant’ on ‘small’).6 According to this view, it would be a mistake to state that, when a speaker calls her job a prison, she wants to designate the superordinate that would include exhausting jobs and prisons. Far from this, what the metaphor seeks to apply to the ‘job’ in question are the specific links of ‘prison’. Let us address now what Bulhof & Gimbel (2001) call deep tautologies: ‘A mother is a mother’, ‘Tyranny is tyranny’, ‘Full is full’.7 What is achieved with these statements? The explanation I would suggest looks once more to typical links, or normal syntagmatic connections. The term operating here as predicate, as it is the repetition of the subject, no longer needs to contribute any evocation to the meaning of the complete sentence. 
Thus, unburdened of this part of its usual task, the term (the predicate term, I mean) can concentrate on the thorough activation of its typical links and, thus, on bringing to mind everything one can correctly and normally say about mothers or tyrannies or being full. In this sense, the predicate of deep tautologies is a rare case in which a model such as Landauer’s Latent Semantic Analysis would truly exhaust the meaning active at that point – that is, it would exhaust it without it mattering that the meaning might be of the type which could be evoked, as mother, or rose.

6. Many years ago, I analysed metaphor as a creative solution to the communicative-linguistic problem of avoiding the links of the literal term itself and replacing them with other more appropriate ones (see Bejarano [1991b]). That article is now out of date in its bibliographic references. I did not even consider the distinction between poetic metaphors, which were what interested me in that article (and here, also), and Lakoff’s conventional metaphors. However, with regard to the question occupying us at this point (that is, the question of links as an element included within meaning), that article, despite the utter ignorance in which I then found myself, invokes that same notion of meaning.
7. It would be possible to include here Gertrude Stein’s famous ‘a rose is a rose is a rose is a rose, etc...’ Incidentally, Eco (1993) pays considerable attention to this. However, I prefer examples which are less contaminated by purposes of theory.

The same explanation would serve for the exact opposite phenomenon, that is, for ‘the negation of tautology’. ‘(...) 1740. Around that time, Kant was not yet Kant’. The predicate ‘Kant’ urges exclusively, and thus very intensely, this name’s most frequent links to be brought to mind.

17.4.2 Why are proper nouns so difficult to remember?

We can also bring in here the fact that proper nouns are more difficult to remember than other types of words. The ‘I have it on the tip of my tongue’ phenomenon (TOT, or tip of the tongue) is maximally frequent with proper nouns. Luria (1980) gives an attractive description of this type of experience. Valentine et al. (1996) explain it as follows: a proper noun is the only signifier to name its referent; it has no substitutes as, in contrast, do common nouns, which normally have synonyms. This is what, in the view of Valentine et al., makes it more difficult to bring to mind. I do not think this explanation is adequate. However many synonyms there may be, in many languages, for a single referent, speakers do not choose freely from within a range of synonyms but make a very subtle and precise choice of the synonym most suited to the context. In contrast with that explanation, I believe the differentiating factor in proper nouns is the lack of typical links. Consider the number of typical links for ‘sun’, e.g. ‘in the light of the sun’, ‘what a lovely, sunny day’, ‘make hay while the sun shines’, ‘if it’s sunny, we’ll go for a walk’, etc. This explains why, even though it is a noun that might in certain senses be considered proper, ‘sun’ presents no difficulties of retrieval (these links of ‘sun’ are, it goes without saying, those projected onto Juliet in Shakespeare’s metaphor). On the other hand, for proper nouns which are people, there are no typical or previously-stipulated links. As a result, these nouns are the most difficult to recover. It should be noted that, when one knows by heart, for example, an alphabetic list of surnames, an old class list, one can resort to reciting that list to recover from memory the name of an almost forgotten classmate. Note, likewise, that nouns included in the Encyclopaedia are less difficult to remember. 
We also find here that these nouns have typical links (that is precisely why they can be used figuratively –‘This General is no Napoleon’). The habitual links would always be crucial.

17.4.3 Links in ‘langue syntagms’ and in speech episodes: Identifying degrees of the same phenomenon

This also explains the fact that the term at the very heart of a saying or of any other well-known text always bears some degree of adherence to the saying in its meaning. At times this is used by poets (and by speakers in everyday speech). By quoting a characteristic term of the saying, they seek to activate that saying in the mind of the hearer or reader. For example, when someone says ‘The horse didn’t drink’, this phrase may

suggest failure and possibly even threat, precisely because this message reveals and includes the saying ‘You can lead a horse to water, but you can’t make it drink’. The saying, thus activated, will overlap as an addition or as a contrast (it will overlap in an intersection of structures, to put it in the terminology used by Lotman [1978]) to the words really present. On most occasions, this resource is only used to shorten the form of expression. It can be used, however, by a creative speaker to successfully express an overload of meaning that could not be expressed any other way. We might say the same of many advertisements. This is an extremely well-known topic. What I am interested in highlighting, however, is that we should not allow ourselves to be blinded by the adherence of the links in the borderline case of a saying or ‘langue syntagm’. Any word incorporates the preactivations of its links. Certainly, there are many differences. (Vespignani et al. [2010, p. 1694] have shown that “the electrophysiological correlates of the processing of highly expected words in idioms, where predictability arises from our knowledge of idioms, differs from those underlying the processing of highly expected words in literal compositional sentences, where predictability is largely due to context- and sentence-level information”.) However, the key lies in that preactivations become varied, numerous and multiform when we move outside ‘langue syntagms’ or sayings. So much so that, when preactivations reach the higher level of abstraction, they simply demand a syntactic category. In this way, combinations never previously experienced are successfully produced and understood.8 We have just referred, on the one hand, to the close relationship between two key words in a saying, and, on the other hand, to the abstract and general link constituted by the requirement for verbs to have a subject. 
Typical links (which are involved in metaphors, tautologies and forgetting proper nouns) form the intermediate case. Typical links are also the connections that provide the characteristic flavour that differentiates one synonym from another. What leads us, in a given context, to prefer one word over another synonymous with it? This, being so difficult to grasp, has at times seemed a magical phenomenon. Formal or colloquial registers of words, on the other hand, have always appeared easier to explain. In the case of these, it is more obvious that the various speech episodes involving the term as it was being learned, and also subsequently, would have acted as the starting point from which those qualities were abstracted. However, we would have to seek the underlying cause for the ‘ungraspable flavour’ in a similar way – more concretely, in the type of links that are halfway between abstract syntactic links and the links internal to the saying.

8. But novelty can go beyond this. Let us think of the first time that a ‘complement coercion’ is used. This phenomenon is exemplified by the sentence “The man began the book” (Pustejovsky [1995]; Goldberg [1995]; Jackendoff [1997]). Verbs like ‘begin’, which semantically select for an activity, should be unable to take arguments denoting entities such as ‘book’. Nonetheless, we interpret such sentences as plausible. Thus, ‘begin the book’ is understood as ‘begin doing something with the book’.

We have suggested how the importance of the links – mainly the typical ones – within meaning has to be cited if we hope to give account of several particular phenomena. However, none of this should cloud the general function played by all these links, namely, to facilitate the production and comprehension of syntax. With each word, as we have said, its possible links on any level, whether typical or merely syntactic, would be preactivated. And it is within this network of unconscious preactivations where we witness the speaker’s choice of the most appropriate ones for the goal current at that point, and the hearer’s rapid comprehension of it (Conway et al. [2010]; remember also Plato’s aviarium).

17.4.4 The other side of the coin: Some unwanted secondary effects of links

The network of links can, on occasions, create undesired consequences. For some years now, studies have been carried out which signal what we might call the ‘negative effects of conceptualisation’. The field where these negative effects have been most studied is the field of analogical problem solving. “Participants who thought aloud were more likely to retrieve surface matches and less likely to retrieve true analogies”: Lane & Schooler (2004, p. 715). “Verbal understanding activates a small ‘semantic field’ of information closely related to the contextually biased interpretation. Although normally effective, this activation pattern makes the verbal understanding vulnerable to misdirecting features of insight problems”: Bowden et al. (2005, p. 325). Even on the level of strictly episodic links (or ‘recent links’), a term can continue to activate links when, in a new stage of the task, these have ceased to be appropriate. This would explain a mistake observed in small children. “In the dimensional change card-sorting task, the younger children will use the description they have used before in the pre-switch phase”: Kloo & Perner (2005, p. 47) (see also Pierce & Gholson [1994]). In short, we conclude that when a term has been acquired, the activation of this term would tend to lead us down the well-worn paths of its strongest links. This may, on occasions, lead us to an inappropriate understanding of a given message or text. Nevertheless, as we have already said, the advantages provided by meaning links are enormous. How would an optimal balance between such advantages and disadvantages be achieved? To date, we have seen no precise answers to this question, which is perhaps crucial to explaining creative problem solving. 
Even so, it seems the brain cannot allow the preactivated links to grow in number and significance if at the same time the strength with which it mobilises them for the goal current at that time does not also increase. There is a very interesting datum in respect of this in a specific type of frontal injury: patients demonstrate, on the one hand, great fluency in seeking words related to a word they are given but are, on the other, incapable of selecting an appropriate term to occupy the place of X in ‘Lion is to gazelle as cat is to X’ (cf. Luria [1976b, Chapter 9]). Word links would thus have negative, and not only positive, effects. But one is not comparable to the other in magnitude. Human cognition would never have reached its

present level if, through time (both individual and historic), it had not constructed the enormous edifices constituted by linguistic meanings.

17.5 Links and the perspectivist nature of meaning

It has always been noted that the univocity between referent and sign, or, more specifically, from referent to sign is absent, even within the same language and same register. Langacker has popularised the term ‘perspectival’ to refer to this aspect of meaning. In reality, Langacker has limited his surprise to a single aspect of what would warrant being called perspectivism of meaning. He may have been influenced in this by the experimental conclusion that children, as if following the rule that there cannot be two names for one thing, select only the new object when they are trying to understand a new meaning. Whether or not this is the cause, the fact is that Langacker highlights only that, depending on the communicative context, a single element can be called ‘dog’, or ‘animal’, or ‘pet’, and likewise, that a single action can be called ‘run’, ‘move’ or ‘escape’. However, we must bear in mind that it is also possible to describe a whole scene in two different ways. Active voice/passive voice: ‘A follows B’/‘B precedes A’, ‘A is bigger than B’/‘B is smaller than A’, ‘A is the father of B’/‘B is the son of A’, ‘A is below B’/‘B is above A’, etc. In all these cases, the key to choosing between one possibility and another lies in the context, and more specifically in the previous statements or in those statements whose arrival is projected as being immediate. This is well known. The point I wish to highlight, although perhaps just as well known, is more relevant for what concerns us here: it is only the difference between links, and not between the referents designated, that can account for these semantic choices. (It is true that a specific syntactic outline will have more chance of being selected than its alternatives when a speaker has used that outline immediately previously, whether it be in production or in comprehension. 
The key – this has been clear for some time – is that, given a range of felicitous syntactic structures to express a message, speakers simply choose a structure on the basis of the relative accessibility of the alternatives: see Bock [1986], or Luka & Barsalou [2005]. However, let us leave these secondary influences – this mere priming – to one side.) Focussing on the general case, we can say that a term is selected on the basis of its links. Thus, the links would be precisely the rationale underlying perspectivism. In any genuinely linguistic meaning, links have as much weight as real-world correlation, or more. But we should perhaps comment further on the perspectivism of meaning. The possibility of describing the same fact in many different ways, a possibility which is absent in the pointing gesture (in Cratylus’s finger) and which would only have flourished with the truly grammatical links of full language, is an enormously important cognitive resource. A specific description, by putting in action specific expectations about what will come next, is able to guide the subsequent thought – in both the hearer and the speaker herself. The rhetorical use of ‘half-full bottle’ (versus ‘half-empty

bottle’) or of ‘semi-skimmed milk’ (versus ‘half-fat milk’) is so well-known that we do not need to focus on it. We will be able to expand this point about the guiding function if we come to see redescriptions as being still further disconnected from the code and more determined by ‘episodic’ (i.e. recent) links. What am I referring to? The redescription would be, I suspect, a crucial resource in problem solving (I take the term and concept of redescription from Karmiloff-Smith & Inhelder [1974], and Clark & Thornton [1997]). In an arithmetic problem, the unknown goal will have to be reached via several successive redescriptions so that by the end, in the final redescription, we can connect with the numbers that appear in the formulation. Equally, the key to the algebraic solution to the problem is to name the same referent in two different ways. Here the links employed in a description would not be the same links a term has the length and breadth of language, but only those presented by the referent in question in the specific text acting as the formulation of the problem. This difference with respect to previous cases is very clear and must be acknowledged. But in the case of the position laid out here, such a difference does not prevent us establishing a parallel between the redescription that operates in mathematical problems and the semantic choice of a term. If, as we are inclined to think, the grammatical code links originate in the enormous number of speech episodes, each of these two types of link (the typical and the episodic) would not then have to be located completely separate from the other. There would be a gradual transition between both types of link, and also between both types of descriptive choice. But we should leave the question of problem solving, which at the end of the day is secondary to our purposes, and continue enquiring after grammatical links.

17.6 The peculiarity of properly grammatical or abstract links: Raising the question of their historical origin

In this chapter, we have hypothesised, firstly, that the cerebral meaning of a word would include as associated preactivations all those links within which that word has been addressed throughout the life of that individual. Secondly, we have suggested that among those thousands of links the brain would succeed in abstracting a semantic profile from the most repeated of those thousands of links. Thus, on this first level of abstraction we would reach what we have called typical links. As a third point, we have considered a higher abstraction which, taking the total number of links as its starting point, would go beyond the semantic level and would succeed in forging the syntactic categories themselves. Almost certainly, the reader will have found this way of addressing the difference between different types of links (the abstract and grammatical on the one hand, and the enormous mass of specific and episodic links on the other, with the typical links halfway between, as an intermediate type) to be an excessively frustrating and poor picture. Undoubtedly, the reader would wish for more specific focus on children’s

acquisition of abstract or grammatical links. What can I say about such wishes and desires? Only that I share them fully. This I can state categorically. We have already mentioned the investigations we consider to be most convincing regarding children’s acquisition of grammatical links. However, what happens when we explore the historical origins of these links? Let us assume language at a stage subsequent to the Holophrastic Era, but still limited to the compositionality of theme/rheme and still lacking, therefore, true grammaticalised syntax. Let us continue assuming that the links with which each meaning is used gradually accumulate in individuals’ brains. What can we achieve from this starting point? My own view is that it would be impossible to reach true syntactic categories or true grammatical links from here. We cannot call upon any process of abstraction. At the historical origin, according to my hypothesis, there were no syntactic categories. There was not even any perceptive compositionality: we must, even though it may be difficult, make an effort to understand what such gaps would be. How, then, could syntactic categories have arisen in history? (As I said above, the present chapter, which is undoubtedly quite marginal for the hypothesis, has been included in the book with a single purpose, which is to reformulate this question.) The historical genesis of grammaticalised syntax would have begun with the historical genesis of a few prefixed links. Thus, we put the following question: How could prefixed, syntactic links have arisen in history? The answer that will be suggested here sees them as related to expressive or egocentric speech. But we will see all this in the next chapter.

Chapter 18

Expressive speech and syntactic links

A hypothesis on the historic origins of those links, and on some other questions, along the way

18.1 General overview of the chapter

The problem we face now is to explain how it is that properly grammatical links could have originated when there were still no syntactic categories. In children, abstraction from real links might be capable of reaching syntactic categories; at least, it would not seem totally unlikely. In contrast, however, real links would not correspond to any intrinsic word category in the historic origins, and hence abstraction would not, in this case, suffice. My suggestion as regards historic origins will be based on emotional discharge speech. Clearly, articulatory-phonetic signs would not have arisen originally merely for an emotional discharge function. Other options were available in the evolutionary past for this function, and precisely as a result of this evolutionary past were easier and more economic. The idea that learned articulatory-phonetic signs might have originated for an emotional discharge function must be categorically rejected. However, this rejection does not prevent us hypothesising that, when articulatory-phonetic signs gradually became more common and less costly, they would come to be used with the mere emotional discharge function. What would have happened then? I would answer that, although such use was at first at the margins of deliberate communication and also of syntax, it may have given rise to proper syntactic links. We must then set this suggestion out clearly and explicitly. But to do so we will put forward two different versions, one more extended than the other. In the extended version, we will go much further than simply offering a solution to the problem of the historic origin of grammatical or abstract links. Here we are interested in studying a particular type of memory, and the hypothesis we will make in respect of it would not be strictly necessary to solve the links question. We will only connect up with this question at the end of the chapter (18.10). 
This final step of the long version is identical to what we have called the ‘short version’. Why am I not restricting myself to the short version? Why undertake a rather long digression that stands as an independent hypothesis on memory? I begin by acknowledging that any extended version of this type is more vulnerable than its short alternative. It is so by definition, since the vulnerability of all the steps leading to the final one


is added to it. In addition, speaking subjectively, I have less confidence in this area I am calling the long version than in the hypotheses of previous chapters. There are, it is true, points in it that I think are solid, but I would not put my neck on the line for it as a whole as happily as I did on other occasions. There, I have made my confession. Nevertheless, my hypothesis on memory fits with my general vision of the two mental centres, and has very close links to what we were suggesting about evocation in Chapter 7. Put another way, by confining the evocation of memories to the second, or exclusively human, mental centre, I am completing and giving full coherence to the idea of the two mental centres. In this sense, the question of memory is interesting in itself. This is the reasoning behind my decision to set out the extended version. We shall begin with a point that appears entirely unrelated to syntactic links. It takes us right back to the Theory of Mind. We shall now address the question, which we tried to steer clear of in earlier chapters, of ‘own false past beliefs’. Certainly, we will move, at first, within the usual field of the Theory of Mind, but we shall soon step outside that framework.

18.2 The ‘own past beliefs’ addressed by the Theory of Mind ‘Theory of Mind’ studies distinguish between two types of ‘false belief ’. Beliefs different to one’s own current belief may be either beliefs of others or own past beliefs. In previous chapters, we hypothesised a means of easy access to beliefs of others – that linguistic messages of others would reveal their producer’s belief to us. On the other hand, with regard to own past beliefs, there does not seem to be an analogous means of lowering the age of success in ‘theory of mind’ experiments. But let us not get ahead of ourselves. We need first to analyse these experiments, and we must also enquire about the concept of own past false belief itself.

18.2.1 The child forgets its own past beliefs: An absurd result? The child does not pass the deceptive box test until it is over four years old. This test demonstrates the fact that the child forgets its own past belief only moments after having said it. This result appeared incredible to many researchers, even in recent times. It is well known that a three-year-old child is able to remember for several months where the chocolate biscuits are stored in its aunt’s house. How, then, is it possible for a child to forget its own belief of a few minutes earlier? This distrust led researchers examining the results of these experiments to suggest interpretations that did not include the fact the own belief was forgotten. Would the child be ashamed of its error, and hide it for that reason? Experiments were performed to explore whether or not the child forgot, as they went along, unfortunate behaviour such as breaking a glass or painting the wrong colour on a drawing. The

Chapter 18. Expressive speech and syntactic links

result showed that errors of this type were not forgotten by the child, unlike its own past belief. In addition, it was demonstrated that children correctly understood past-tense questions. Another possible alternative explanation thus had to be rejected. Other authors suggested that the child was bending its reply towards what it thought the adult wanted to hear. However, once again, it was demonstrated that the child’s incorrect answer only appeared in relation to its own past belief. No one has seen a way to make this uncomfortable datum disappear. As a result, we are all beginning to think it might be better to change our objective. Now, we need to see the datum as logical and expectable. An evolutionary approach, paying attention to adaptive utility, might help us. We should ask whether or not remembering own false past beliefs is useful for any organism. As soon as we place the question on this footing, the tables are turned, and what turns out to require explanation is not that past false beliefs are forgotten, but that humans over four years of age can come to possess this memory ability. The mission of animals’ beliefs and knowledge is to be an appropriate guide for conduct. Updating, which is dominant in perception, is fundamental for this. If an animal’s conduct were guided by a belief that has already been shown to conflict with reality, that animal would not last long in the world. Out-of-date beliefs are not useful for guiding conduct in any way. It is therefore completely logical for them not to be remembered. Remembering a now out-of-date belief would only serve to create confusion and use up cognitive resources. There is a version of the own past beliefs experiment that has been redesigned in such a way that remembering, rather than being the memory of a false belief, has become the memory of a fact still corresponding to reality. 
We have already spoken about this in a previous chapter, when we addressed the attempts that have been taking place to lower the age of perception of beliefs different to one’s own current belief. This modified design (Mitchell & Lacohee [1991]) was the only way of successfully lowering the age children pass the deceptive box test. After the children reply ‘Chocolates’ when asked what is in the box, they are asked to choose, from several drawings offered to them, the drawing that corresponds to their reply, and to put this drawing in a letterbox. When, after they have seen that the box only contains staples, the children are asked a second question, namely, what it is they said was in the box, the children reply correctly –‘Chocolates’. This is the only case where, at three years of age, children pass the test perfectly. It is clear why this occurs. The belief that the box contains chocolates is not a simple out-of-date belief, but continues to correspond to a real fact, to the real and current content in the letterbox. Forgetting a belief is not determined by the time that has passed but by its falseness, or, following the clarification we made above, its lack of utility. As a result, what is surprising and needing explanation is our adult ability to remember beliefs that no longer serve to guide us. We shall attempt an explanation. First, however, we shall place the contrast between false beliefs of others and past false


beliefs on a more specific footing. It is precisely for this reason that I have been talking here about the experiment we had already mentioned in 11.1.1.

18.2.2 Remembering one’s own linguistic message versus remembering the linguistic message of others Specifically, we have to ask why the production of one’s own linguistic message (the child’s verbal response, ‘Chocolates’, in the original ‘deceptive box’ test) does not have the same consequence as the message of another individual. We should remember that a much younger child perceived its mother’s false belief that there were still building blocks left when it heard her asking for ‘More (blocks)’. Why, in contrast, does the child’s own linguistic response ‘Chocolates’ not succeed in recording in its mind the false belief at that moment? Why is there this contrast between the consequences of each linguistic message? The mother’s message, however much it does not correspond to the reality of the building blocks, is backed up by a current true reality. At that moment in time, the false belief is in the mother – this is the current supporting reality. This would be analogous to what we saw in the modification designed by Mitchell & Lacohee. The presence of the picture card in the letterbox is, when the second question is asked, a current fact that is true and known by the child. The child therefore is well aware of it. If we focus now on the false beliefs revealed by the linguistic messages of others, we could say that the minds of others play a role similar to that of the letterbox. In contrast, there is no support for one’s own past beliefs. As a result, the child does not maintain them. In the original deceptive box test, the child’s reply will immediately fall victim to the update in perception.

18.2.3 Surprise, Davidson and the ‘concept of belief ’ Theory of mind studies offer us the clear conclusion that tests on remembering own past beliefs are absolutely not solved at a younger age than what we have called ‘classic tests’ of false beliefs of others. Our analysis of the beginnings of predication in children signals to us that there is an easier way of perceiving false beliefs of others, namely, the ‘second-person and linguistic-format’ perception which occurs when a linguistic message showing the speaker’s false belief is received. One possible conclusion from these premises is that remembering own past beliefs is a more difficult and more complex task than the easy type of perception of beliefs of others. This conclusion is highly plausible. But we must at least address a very different hypothesis. Let us read the analysis by Davidson (1982) on surprise: “Suppose I believe there is a coin in my pocket. I empty my pocket and find no coin. I am surprised. (...) Surprise requires that I be aware of a contrast between what I did believe and what I come to believe”. In the very instant that I notice that my pocket is empty, my previous


thought that I had a coin in my pocket appears as a belief, and no longer as reality: this is the statement in Davidson that interests us. In other words, as this statement sets things out, own past belief could be the original belief among false beliefs that are perceived as false. This raises for me the following question: is a subject’s frustration perhaps proof that it has maintained some optimistic beliefs in its mind after these have been shown to be false? Of course, if these beliefs remain in the mind, they will come to be evaluated as simply thoughts with no correlation in reality. However, the question is whether the condition that these beliefs are maintained in the mind is necessarily fulfilled in every frustrating experience. I would suggest that we ask whether or not animals are able to feel frustration. We need to be aware that, if we accept both Davidson’s analysis and animal frustration, we would then be obliged to ascribe to animals the memory of their own past beliefs.1 As a result, I, wagering as I do without hesitation on ascribing frustration to animals, have to distrust any attempt to generalise the link between frustration and ‘concept of belief’. The analysis Davidson makes of the human adult surprise experience, as sharp and convincing as it may be, is not relevant for the question of origins. We are asking if the ‘concept of belief’ (in Davidson’s terminology, or ‘false belief’, in ‘theory of mind’ terminology) originally appears with beliefs of others or, on the contrary, with own past beliefs. The situation analysed by Davidson is only useful in order to pose the question, and not because it offers any argument in favour of the ‘concept of belief’ originating from intrapersonal surprise in children or in history.2 In small children, thus, the alternative interpretation to Davidson’s may still be proposed, and it is the one I opt for. Apart from rejecting the apparent counter-example that Davidson was putting forward, we must ask if there are arguments to support the interpersonal origin of the perception of false beliefs. Or, rather than speaking about arguments, we should speak about encouraging but not conclusive evidence. Are there any? The main one rests, it is clear, on children’s initial predications, which, as we hypothesised, would show very early perception of false beliefs of others. But before we continue our search for evidence of such original primacy, it will help to address a much more important question. What should we understand by own past belief? There are clearly two different things which it is possible to call own past belief.

1. There is a further question involved here. Davidson denies mental states for animals. I am not referring to the ‘concept of belief’. Denying this concept for animals is a widely-shared opinion, which I of course share. Where I am opposed to Davidson is when he aligns mental state and linguistic expression, and (since he does not postulate any Fodorian mentalese) decides, therefore, that animals would lack any mental state. I have always thought this ‘logical behaviourism’ or antimentalism fits very badly both with evolution and with child development. I am certainly very interested in the differences between animals and humans. However, the enormous separation between the one and the other postulated by this antimentalism seems completely unrealistic to me. What about animal consciousness? For now, the issue of animal consciousness seems to me even more unapproachable than that of the particularities of exclusively human consciousness (See Bejarano [1997] and [1989]). In any case, it is convenient that we pay attention to the fact that animal perception entails distal localisation. Sensory data are always inside the animal’s skin, so to speak. Despite this, animals with brains perceive these data as distal stimuli; as being located at a particular distance from the organism. (But note that this distality is absent in the sense of smell and taste, which, as Krifka, in press, pointed out for linguistic categories of the sense of smell, are the only senses whose linguistic expression seems to involve an intrinsic hedonic bias. What about the sense of touch? Touch is certainly excluded from ‘distance-senses’, but it is also much more complex than the hedonically-biased ones.) This is not the case with plant tropism, or even with insect vision. (It is true that insects move towards the relevant stimulus as the sensory ganglion activates the appropriate motor ganglion. However, there is no need to assume that insects are supplied with any distal information by their compound eyes, and indeed this information is not necessary to them. A lot of distant flowers are equal to a few flowers close by. The balance between cost and benefit is identical.) By contrast, animals with brains manage to calculate the distance at which the stimulus is located and, given that distance is always relative to a centre of reference, an animal’s awareness of itself surely arises jointly with the perception of distance. For this distance to be detected there must be some degree of awareness of the body, even if at the beginning this is only as the centre of reference. It is possible for the animal to perceive external objects because it has some kind of awareness of what is not external – that is, an awareness of itself – and, by the same token, the animal has an awareness of itself because it perceives what is exterior to it. Needless to say, I agree that motivation would be the key and the function of ‘experiencing’. “There is no point in being aware of one’s internal states if one cannot do something about it (...) The evolution of experiencing altered animal evolution. It became dominated and guided by learning and by future goals.” (Ginsburg & Jablonka, 2009). See also the suggestive Feinberg, in press.

2. These reflections about Davidson’s ‘concept of belief’ appear in earlier articles of mine (Bejarano [1991a] and [1992]), in which I had not yet made a connection with the ‘theory of mind’.

18.3 False belief and out-of-date true belief: Taking the question beyond the Theory of Mind Two hours ago, it was raining, and my belief at that moment, insofar as it was raining, was true. But it is not raining now, and my previous belief is thus out of date, is thus no longer valid. Of course, if we use the linguistic resource of the past tense, the linguistic expression of the belief will continue to be true. But that linguistic resource may be similar to ‘John believes that’ or ‘In the film, Robin Hood meets Sherlock Holmes’ (‘What happens in the film is that Robin Hood meets Sherlock Holmes’):3 cf. Recanati (2000), or also Prior (1968) (quoted in Bermúdez [2003]). Let us pause over this point.

3. Certainly the past tense is more frequently used: Its use is almost necessary in narrations and causal explications. Thus, the past tense is also more grammaticalised. However, its similarity with ‘John believes that’ can be easily defended. In the early stages of untutored second language acquisition, the shortage of grammaticalisation can make that similarity more obvious.


Why can we say it is true that an episode occurred? We can only express it in this way because we have used the past tense in ‘occurred’. The past verb tense is what allows us to ascribe a current truth to the sentences that narrate memories. Of course, in my example, just as in the cases of ‘John believes that’ or ‘In the film, Robin Hood meets Sherlock Holmes’, equally across all of these, in fact, some truth is responsible for the content which does not correspond to real circumstances having become interesting. John is believing the false belief, and the past episode was present in its own time. But the question is that all this lies outside what in the first instance is the real environment. As we saw above, if the minds of those who surround us are taken as included in the environment, then the false belief of others is also included in the real environment. We do not, on the other hand, find any current support for past belief, even in this model. And this occurs not only for own past beliefs that were false from the beginning, but also for those we have just placed under the spotlight, that is, own past beliefs which, although true in their moment, are now out-of-date. What would the perspective of adaptive utility, through which we were able to understand the child’s apparently incredible forgetting of own past false beliefs, suggest we should say in this new case? Is the memory of out-of-date perceptions useful in the first instance, or not? The reply will depend, clearly, on the type of memory to which we are referring. An animal visits an area of its territory and finds ripe fruit. A little later, the ripe fruit is finished, and the animal leaves. That perception of fruit on a specific tree is out of date. But its influence will make itself felt at subsequent points in that animal’s life. A few weeks later, it will return to that area and look for fruit in that tree in question. That memory is, thus, highly useful to the animal. 
However, is it necessary for that memory to consist of an evocation of the past perception? Or, on the contrary, is expectation, with a precise and specific but empty profile, sufficient to explain the animal’s return? In Chapter 7, we said that, for now, it is still possible to place wagers on this dilemma. Mine, as the reader already knows, is on expectations. What can be deduced from such an option? In this case it is clear that the usefulness of really maintaining out-of-date perceptions vanishes (their usefulness in the first instance, I mean). In our wager, the criterion of adaptive utility tells us that out-of-date perceptions would be assimilated to own past false beliefs. “The stress hormone cortisol is known to substantially impair memory retrieval (it suppresses false memories in parallel with correct memories). By contrast, unlike retrieval, encoding can even be enhanced by cortisol” (Diekelmann et al. [2010, p. 1–2]). Thus, since stress selectively enhances the most urgent needs, we could conclude that retrieval is a relatively non-urgent need. But, it should be remembered, all this applies only to biological (or ‘in the first instance’) utility. It says nothing about humans; more specifically, it says nothing about what language can give us or about what we can achieve with what language has given us. And it is not only that we cannot confirm the lack of usefulness for humans


of both types of own past beliefs. Rather, we cannot even conclude that both types are equally difficult in children. Children nowadays very often receive not only statements but also questions about real past facts, that is, about what we have called out-of-date true beliefs. Already at an early age, these questions awaken in the child the memory of its out-of-date beliefs (in a subsequent chapter we will study the prodigious cognitive mechanism entailed by questions).4 In contrast, the situation is more difficult for own past false, and not merely out-of-date, beliefs. Firstly, because the child is less used to receiving and replying to these questions. Outside the laboratory, the child will only rarely receive questions about its own past false beliefs. Secondly, because for this type of beliefs questions would take the form of subordinated sentences characteristic of ‘believe’ or ‘say’ sentences. This complex syntax makes these questions much more difficult to answer than questions on out-of-date facts.5 We might say that questions on own past false beliefs present the same form and complexity as some particularly difficult versions of the classic Maxi’s false belief tests. ‘Where does Maxi think the marble is?’ and ‘What did you think the box contained?’ would be very similar. Contrary to this, questions of others about out-of-date facts as well as messages which reveal their producer’s false belief would both constitute easy access to beliefs different to one’s current own belief. These questions (‘What was your little cousin doing with his food?’) and messages (‘More blocks!’) are, thus, similar. But it should be noted that a difference is maintained alongside this similarity. While the messages revealing their producer’s false belief might belong to a language that is still very simple, the questions would be impossible without an advanced language. 
Therefore, although we might, in relation to children, speak of easy access in the case of the questions, this ease would disappear as soon as we tried to project it on to the level of historical origins. As the reader will have realised, we are now outside the framework of the Theory of Mind. The ‘out-of-date own true belief ’ is not contemplated in Theory of Mind studies. Certainly, Riggs & Simpson (2005) have focussed on the perception of true past beliefs. Russell (2005) might also be considered in this regard.6 However, we must clarify that what they study is the perception of true past beliefs of others, and not own memories. In addition, Perner (1995, p. 258) came out very explicitly against assimilating out-of-date true facts and false beliefs (when he questioned the similarity

4. There is a general agreement that the role of adults is vital for the child to remember a past scene. See, for example, Sawyer & Greeno, 2009: “At first, the parent may provide most of the detail, with the child filling in only occasionally”.

5. The past tense is more grammaticalised. See above, p. 276, note 3.

6. From this latter article, I would like to point out that it brings Davidson in to address the question of ‘false belief ’ in the Theory of Mind, but unlike we did earlier, Russell cites, in particular, Davidson’s argument that social language (or, more precisely, social signs) cannot be acquired without a ‘concept of belief ’.


between the false belief task and the ‘Polaroid task’: “the photo is a true depiction of the past state of affairs”, his emphasis.). In reality, we have not been driven to focus on the question of own out-of-date perceptions because of anything we have seen in any Theory of Mind work. It was, in fact, on the one hand, our insistence on the criterion of ‘usefulness in the first instance’. As soon as one gives any attention to this criterion, doubts arise about whether remembering such perceptions is really a primary and prelinguistic ability. On the other hand, I would point out that deciding whether a particular own past belief is a false or, on the contrary, merely out-of-date belief may sometimes be a rather arduous task for a solitary subject. (I thought I saw the study light on earlier. It is now clearly switched off. Did the cleaner come and has she switched it off? Or did I get it wrong in my earlier perception?) As soon as we try to set to one side the theorist’s eternal temptation for omniscient logic, the border between each type of own past belief may sometimes begin to blur. Clearly, as we have said, the child finds it much easier to reply to questions about out-of-date events (‘What was your little cousin doing with his food?’) than to questions about its own past false beliefs (‘Earlier on, what was it you thought was in the box?’). But we are presented with a controversial issue when faced with this greater ease: is it perhaps caused by the frequent ‘questions about out-of-date events’ the child receives? Or is it, on the contrary, independent of language? We can formulate the problem differently. When the child replies to these wh-questions that cause it to manifest its memory of out-of-date events, these questions, I repeat, require grammaticalised syntax and advanced linguistic resources. What would happen in the absence of these linguistic communications? Would this memory ability then occur? This is not at all clear. 
However, I will point out some evidence in favour of denying that this ability can then occur. I shall first review some old experiments that may indicate the absence of this ability in chimpanzees. I shall then have to explain how adults succeed in remembering own past beliefs, out-of-date as well as false, without needing to be asked questions about them.

18.4 Out-of-date perceptions and chimpanzees: Interpreting the results of some old experiments We have said above that, if we accept the concept of expectation, then we do not need out-of-date perceptions to be maintained in order to explain animal conduct. This weak formulation, which does not even come up as far as the hypothetical-deductive level, will clearly remain pitifully weak. However, we have to ask whether it would be possible to find any evidence for the absence of this maintenance in animals. We shall bring up two old experiments with chimpanzees, firstly, one by Premack with Sarah, and another, earlier, experiment by Jarvik.


Premack (1971) showed that his chimpanzee was unable to solve the following task. In a series of attempts the reward was alternately a small cake and a little slice of banana. In each case Sarah had to choose between a red and a blue counter. The red counter was the one that gave the reward when there was cake, and the blue one when there was a slice of banana. Why did the chimpanzee, which differentiated perfectly between colours (it obeyed, for example, a ‘Give me red’ command when there was a large number of counters of each colour), not, however, learn that task?7 I would say that after each attempt that turned out to be successful, the chimpanzee connected its success to the counter it had chosen in that attempt. But the lesson it took away for the next attempt did not include the maintenance of the now out-of-date perceptions of the immediately previous reward. Let us look now at Jarvik. Jarvik (1953 and 1956) showed that chimpanzees need hundreds and hundreds of attempts to learn that the peanut will always be a few steps beyond where, at each moment, there is a red surface, and never, on the other hand, beyond where there is a blue surface. On the other hand, as soon as Jarvik stuck the peanut to the back of the red surface, chimpanzees mastered the task after only one attempt. Adele Diamond (2006) has been working with these old data. She uses them to highlight how the physical connection between stimuli can ease the task of discovering a rule. The physical connection would be the decisive variable not only in the experiments with children carried out by this author, but also in chimpanzees. This reading of the two versions of Jarvik’s task is indisputable. I accept this reading completely, but I suggest that, by relating it to Premack’s experiment, we could comment on it further. Let us see. In the simple version of the task, the animal learns that the objective is to reach the coloured surface, and then look on its reverse side. 
In contrast, in the other version, the objective is the peanut alone. The coloured surface is always merely an obstacle it has to move to one side. As such, success in this version of the task would require the animal to hold the colour of the obstacle in its mind after it has already overcome this obstacle, and, consequently, to hold an out-of-date perception. One might object that the sound of the bell, or the shape of the key pressed, has completely faded when the animal is eating. However, this objection does not take into account the key element of Jarvik’s difficult task, which we should explore. The chimpanzee’s task is to choose between two locations. (These locations are situated by the animal in the absolute, not-egocentric space, i.e., independently of the route chosen. We must escape the temptation to look down on animal perception). When it is successful, the location it had chosen in that attempt is recorded as correct. It is clear, however, that this rule will certainly fail in its subsequent attempts. What we have to explain is why it does not associate success with ‘the location, whichever it may be, indicated by the colour red’. The reason is that in order to associate the colour red with success, it is necessary to remember the presence of this colour in the obstacle that has been left far behind. In a word, Jarvik’s difficult task is assimilable to easy conditioning only when the task has been solved. In short, in Premack’s and in Jarvik’s tests the chimpanzees would fail because they are unable to hold an out-of-date perception. This is either a perception of the reward from the previous episode (Premack) or a perception of the obstacle they had left behind when they reached the reward (Jarvik). What is important is that in those two surprising failures, in both tasks equally, there is, I hypothesise, an out-of-date perception that would have to have been maintained.

7. I wish to recall here an influence that was highly important in my personal journey. Sánchez de Zavala stressed Premack’s chimpanzee’s failure on several occasions. He was convinced that there we would be touching on an extremely important frontier between chimpanzees and humans (Sánchez de Zavala [1978]). And his conviction was infectious. For many years, the question mark about Sarah’s failure has been continually at the back of my mind.

18.5 Do adults always maintain their own past belief? Recycling an argument from Dennett In children, I have already said above, we do not know if the memories of out-of-date perceptions might be limited to those occasions on which the child receives well-profiled questions about those out-of-date perceptive scenes. However, it is a fact that adults achieve those memories without needing to receive questions. It is on this that we will concentrate for the moment. Does this fact force us to renounce our conjecture that out-of-date true beliefs are far from being primary and immediate mental states? Or, on the contrary, do we have to consider this adult ability to be an astonishing fact that has to be explained? In this second disjunctive, we would have to ask how adults manage to save the out-of-date perceptive contents from disappearing, or, in other words, how they rescue these contents from the incessant guillotine of perceptive updating. But, before we explore the second disjunctive, we would first have to ask if it is true that normal adults maintain the memory of the immediately previous perception in all situations. We might turn here to a small piece of experimental data, to which Dennett (1991, Chapter 5) gives close attention. This author, clearly, uses it to support his theory of the ‘Multiple drafts of consciousness’ and of ‘Rewriting, either Orwellian or Stalinist, of past episodes which would be performed by consciousness’. We, on the other hand, will employ it for our point about the maintenance of out-of-date beliefs. This is an old Gestalt experiment, by Wertheimer, to be specific.8

8. See also Christie & Barresi (2002) and Suchow & Alvarez (2011).

The subjects, who are, it should be noted, adults, are looking at a black screen. A luminous red dot appears for an instant near one of the edges of the screen. Then, after a very short lapse of time when the screen is completely black, a luminous blue dot appears at the opposite edge. The result of the experiment is that all the subjects insist that they have seen the luminous dot cross the screen and change colour halfway across. What makes the result interesting is that the intermediate black-screen time lapse lasts as long as or longer than the appearances of the luminous points. However, despite it in no way lasting less than the well-remembered appearances, the black screen lapse is forgotten. Of course, if we do an identical experiment with the exception that it is interrupted just when the luminous blue dot would have to appear, subjects participating in this experiment say that the screen went black at the end. This assures us that the intermediate lapse is perceived. However, I repeat, in the cases in which it is followed by the blue dot, the lapse is not retained by memory. Why is the black screen lapse forgotten? Clearly this is because, by dropping the memory of this intermediate perception, a more coherent perceptive sequence is obtained (the experiment emerged, as we have said, in the Gestalt School). But, however much we have thus pointed out an essential factor, we must keep looking for a way to explain the fact that the immediately previous perception has been forgotten. My point is that the enabling characteristic of this forgetting might lie in the perception’s out-of-date status. This perception is forgotten simply because it stopped being the latest news about the screen. It is very true that in this experiment there are many differences with the own past belief memory tests. The deceptive box test, or also the one with the stone-like sponge by Flavell et al. (1983), in which the small child forgets its first wrong categorisation, and, generally, all theory of mind tests, last much longer and also ask for the subject to respond at each step. Clearly, the distance separating Wertheimer’s experiment from these cannot be ignored. However, we must also remember the other undeniable difference. 
The luminous dots experiment is done with adults. What I am insinuating is that any process underlying the usual adult ability to remember its own past beliefs (false or simply out-of-date) would perhaps be unable to operate in the very short time the Gestalt experiment lasts. What is this process that would succeed in mobilising the previously-established duplication of archives for the new function of remembering own past beliefs? As you can see, we have come back to our question. In it lies what really matters to us here. I am not overly confident in the relationship between Wertheimer’s experiment and the forgetting of perceptions or own outdated beliefs (i.e. the forgetting that, according to my wager, would occur in human beings under four years of age, if they do not receive any questions about the past episodes). I would not be too surprised if in the end a different mechanism were discovered to intervene in this experiment. In other words, I am not betting personally on this point, but I am allowing myself to be led by Dennett. He situated explanation on a specific level, and, by sketching an alternative to this explanation, I am approving its situation at this level, and not at another lower level. However, I repeat, this point does not belong to my own framework of suggestions. In contrast, I am fully convinced that we should pay attention to our question. What is the process underlying the ability to

remember own past beliefs? Does inner speech have something to do with the correct memory of out-of-date beliefs?

18.6 Halting temporarily this chapter’s progress: How much have we achieved and how much is still to be done?

We have broken out of the framework of the ‘Theory of the Mind’ and have assimilated out-dated true memories to false beliefs. This has brought us to ask how the ability to remember specific past scenes would be acquired. This is where we have reached. But we have to remember that what we were pursuing was the question of grammatical links. How might grammatical links (in other words, the grammatical syntactic resources, and not merely discourse syntax) have originated historically? Our immediate question thus will be to ask if there is any relationship between the ability to remember past scenes (or, in other words, own out-of-date beliefs) and syntactic links. How could such a relationship appear in present adult language and how might it have originated historically? It is this question which gives rise, as we saw earlier, to the long version of our hypothesis on the origin of syntactic links. The causal relationship between beliefs and syntax would be practically the reverse here to what we hypothesised about false beliefs of others and the articulation of theme and rheme. That is, it is not at all a question of own out-of-date beliefs giving rise to syntactic links. But we should not get ahead of ourselves. The rest of this chapter will face those questions. We will approach them bit by bit.

18.7 Expressive inner speech

18.7.1 Inner speech, emotional reactions and sudden changes of belief

We already mentioned inner speech in 15.2.2, and we opposed Vygotsky’s hypothesis that inner speech is the same as a predicate. In that section, I defended that inner speech (at least in Vygotsky’s examples – see above, p. 234, Chapter 15, note 5) would have an expressive and not a predicative function. Predicative communication implies, according to our general hypothesis, the transformation of a mental content different to what the speaker has about the matter. On some occasions, the speaker designates this false or insufficient mental content, and on other occasions, in contrast, she trusts that the communicative context will be sufficient to present it. In every case, however, the predicate, or, more exactly, the rheme, is selected precisely in order to transform this mental content. In contrast, there would be no transformation in the expressive or emotional discharge function. The expressive function, whether it occurs in open speech or is inhibited and confined to inner speech, would only be designating a complete scene with a single word. Therefore, we were insisting, terms that could never

have been changed for a specific predicative communication would be interchangeable for this expressive use. What we must ask ourselves now is: what might inner speech have to do with the maintenance of past beliefs? How could that expressive function, or almost presyntactic type of exclamation, which we have located in inner speech, favour this maintenance? At first glance no clear connection appears. There is even a piece of evidence that argues against the suggestion we are considering. The three, or even four, year old child who cannot succeed in remembering its own past false belief (its belief, specifically, that there were chocolates in the box) has expressed its belief verbally during the first step of the test. If this explicit message does not make the child successful in the following step of the test, why would the adult’s inner speech influence the memory of out-of-date beliefs? (As can be seen, I am asking a question different to the one in 18.2.2, even though it bears certain similarity to it). This would seem to have to dissuade us from looking for the solution to our question in the field of language. But the only thing that really is clear is that those specific verbal productions of a child under four years of age are not useful for this task. We might, therefore, continue our search. To do so, we should look again at what the make-up would be of the loss or forgetting which – we are presuming – occurs in small children as regards beliefs dismantled by subsequent perceptions. As was said in 18.2.3, this loss or immediate forgetting would absolutely not imply any absence of emotional reaction to the change (in this sense, I would say that unexpected perceptions are perceived as unexpected and surprising, even in animals). However, those surprise emotions would not need the maintenance of beliefs dismantled by the facts. 
This is, therefore, the background on which the process, whichever one it turns out to be, which rescues past beliefs from disappearing, would have to operate. We shall now try to get inner speech operating against that background. What will our suggestion be? Expressive function inner speech, that is, the speech of Vygotsky’s presumed pure predicate, would always be caused by an emotional load. On many occasions, this is due to a sudden reaction of frustration or relief. These reactions would be the same ones that we said occur in animals. The only difference lies in that the constant habit of language leads adult humans to activate in inner speech a term for that activation to channel the emotional discharge. Therefore, there will be inner speech episodes that respond to the longed-for disappearance of something or someone, or also to the surprising lack of fulfilment of an expectation. Let us read Damasio (2000, p. 59): Frustration and relief are “induced indirectly”. “Sometimes, the inducer can produce its result in a somewhat negative fashion, by blocking the progress of an ongoing emotion”. Thus, the energy that had been available for this blocked emotion will be used in expressive emotional release. These inner speech episodes (an ‘It has gone!’ or ‘It isn’t here!’) are the ones which may, as we will see later in more detail, maintain perception or belief once this had

become out-of-date. It should be noted that this is completely different to when the child speaks out loud its belief about the contents of the box. At that point, the child’s beliefs and expectations were stable and were not accompanied, therefore, by any sudden emotional change. If the child said ‘Chocolates’ it was only because the researcher was asking him about the contents of the box. The child had no opportunity at that point for any emotional discharge. On the other hand, the inner speech episodes alluded to accompany feelings of surprise or frustration or relief. The ‘It has gone!’ is not contemporary to the moment the danger was perceived, but contemporary to a moment when the danger is out-of-date. Can this speech enable those own out-of-date beliefs to be recovered, however? In our suggestion, expressive inner speech would be, I repeat, the cause, and not the consequence, of this recovery. We have already said that the human emotional reaction would not, in principle, have to differ from that of animals. Maintaining the out-of-date belief would not be present in the origins of expressive speech, but could be its result. How would that result be attained?

18.7.2 With no syntactic process, but with syntactic links

The presumed pure predicate of inner speech would really be, we said, the reaction that, in a brain accustomed to language, would accompany the emotional reactions of frustration or relief. The definition of the expressive function of language would be perfectly fulfilled here: the emotional discharge that in humans is channelled along the hackneyed and inexpensive route of the production of a verbal term. This is just what we would have here. How is this term chosen? On the one hand, the term will have to be related to the situation causing the emotional response. On the other, the criterion of minimal mental cost, or, put another way, the criterion of maximum accessibility at that moment, would be responsible for the specific choice. In addition, I repeat what I said above (15.2.2). This term would stand alone, that is, it would not be transforming or complementing any other terms. Whatever the meaning stipulated for it by the code, the term would designate the complete perception at that moment. There would be no ellipsis here. When I am startled by the appearance of a snake, and my inner speech exclaims ‘A snake!’, this term expresses in itself that the snake has appeared. Equally, when, accompanied by a sigh of relief, my inner speech exclaims ‘It has gone!’, this term expresses in itself the relief of the whole situation. We might, therefore, view this expressive inner speech as being similar to interjections. Just as interjections are intrinsically alien to syntax, these inner speech discharges would not imply any predication process. This inner speech would not have activated, I repeat, any process of syntactic production. However, it would employ the terms of full language. There would not have been any syntactic process, but the terms employed come from a language that is entirely shaped by syntax.
I have already mentioned this opposition between real syntactic process and historically-involved syntax on several

occasions. On the one hand, the semantics of language is indissoluble from syntax, since there is no word that is not placed within some syntactic category or other. On the other hand, however, it is also true that syntax only in fact occurs when words are combined with each other. We encountered this opposition when we addressed children’s holophrases, where, faced with what may have happened in a possible original Holophrastic Era, the words used are the words of our syntactic language. Clearly, the holophrastic child is not producing any syntax, but the words he uses do not for this reason cease to be shaped by syntax. But let us return to adult expressive inner speech. The production of this speech would not have meant any syntactic process. But language as a historical product might be offering syntactic links. The process of syntactic production required by that ‘It has gone!’ in inner speech would have been just the same as the one required by the sigh of relief it accompanied, which is to say, none. However, unlike a sigh, ‘It has gone!’ is a verb, and a verb is above all the type of meaning that requires a noun. Nouns, verbs, adjectives, etc. are defined above all by their links. Thus, when ‘It has gone!’ activates, simply by virtue of its configuration, the syntactic hole for the role of ‘who has gone’, it will succeed in activating the correct word for this role. This ‘It has gone!’, which was simultaneous to the disappearance of the previous scene (i.e. to the disappearance, at the hands of the implacable perceptive updating, of the perception of the snake), would thus suffice, nevertheless, to recover that perception. This is the suggestion I wished to contribute. Evidently, I am picking up the function which Luria has Vygotsky’s concept of inner speech play (cf. 15.2.2). What exactly are my differences with Luria’s hypothesis? 
One difference lies in that I have located the syntax of the theme/rheme articulation as being previous to the grammaticalised syntax involved in each code word. Another difference lies in that I reject that the predicates in adult predication communication have to be supplied by inner – and, consequently, egocentric – speech (It is not necessary to reject this idea with regard to the predicate in children’s early predicative communication. In children, there is no inner speech). Clearly, these differences are marked. Nevertheless, it is clear that I highly value and am attempting to benefit from that hypothesis from the Vygotskyan School.9

9. Reflecting on Vygotsky and Luria’s hypothesis is a task that, down through the years, has always been present on my task list. In reality, a simplifying but effective way to recount my journey as a researcher would be to enumerate my successive criticisms and clarifications of Vygotsky’s theories on internal speech. The same could only be said of two other questions. One is the concept of decentration, which in the beginning, being as I was still too closely linked to Piaget, I divided between spatial (the ‘three mountains’ test, e.g.) and mental (the “meaning according to the other” on which Piaget [1962] calls in his Comments on Vygotsky’s critical remarks), and which I later saw would be better divided between contents that would be possible in the subject and contents which the subject could never own. The other question is the idea that movement is the perceptible interiority in the relationship between humans – it is the body as a bridge and no longer as a wall. However, why am I telling you all this when the relationships between these authors and my work should already be clear to the reader by now?

18.7.3 Syntactic links and symbolic evocation: How are these related?

At this point, however, we must give voice to a question that we can suppress no longer. Would the maintenance of out-of-date perceptions not have to fall within the concept of evocation? And, if it did, will our search not have been pointless striving when a solution was already available beforehand? Let us proceed slowly. We have insinuated that the inner speech ‘It has gone’, by activating its syntactic hole for the role of ‘who has gone’, would end up recovering the now out-of-date perception of the snake. This hole (the one occupying the role of subject for this specific ‘It has gone’) would constitute a well-profiled vacuum that would allow us to see what we want to evoke. As we have already seen in the chapter on evocation, the selection of a specific evoking symbol has to be guided by the knowledge of what it is we wish to evoke (see 8.9.2 above). There would be no vicious circle involved. The concept of expectation or profiled vacuum will be useful to us in this question as it has been in others. The instrument, therefore, to which we would owe, once and for all, the recovery of the out-of-date perception would be the linguistic term ‘snake’. But the expectation which has led precisely that specific evoking symbol or instrument to be chosen would have been the expectation of content that was appropriate for the syntactic subject role of that specific ‘it has gone’. What is the difference between this and what would be achieved with symbolic-play pantomimes? The inner speech syntactic links correspond to a fully-formed language. Such a language has resources of which symbolic-play pantomime cannot even dream. This much is clear. However, our question is more specific. What is it that evocation unmediated by language lacks when compared to the maintenance of own past beliefs? Let us address evocation that is unmediated by language. It is impossible to include this evocation in the current environment, and, consequently, according to my hypothesis, it would be true that the scenes evoked belong to the second mental line. Apart from this, however, the scenes evoked do not involve any location in time. Are they felt as the memory of the model scene, or as a present scene? I would say they are felt as both at one and the same time, and, consequently, properly as neither. (As you can see, here I am picking up the question which had occupied me in 11.2). In contrast, when the symbolic ability operates in conjunction with other resources, these limitations will be overcome. Let us combine the ‘It has gone’ of inner speech with the expectation of the content that has a syntactic subject function, and with the linguistic term able to satisfy this expectation. If we combine, I repeat, these three elements, we would not only reach an evocation of snake, but we would succeed in rescuing the out-of-date perception precisely as out-of-date. Evocation is certainly necessary for this recovery, but it would not be at all sufficient. This mental maintenance of the past scene would only become possible with the activation of syntactic links in ‘It has gone!’

Nowadays, there is a considerable literature (Clayton and colleagues) on episodic memory and also foresight in animals, mainly birds (Griffiths et al. [1999] or Clayton et al. [2008]). This has provoked a heated debate about episodic (i.e., retrieval) memory and mental time travel (see, for example, Cosentino [in press]). Would these abilities be an exclusively human peculiarity? Suddendorf & Corballis (1997), Suddendorf & Busby (2003) and also Markowitsch & Staniloiu (in press) argue in favour of this exclusivity. This is, in my view, an extremely important issue. Corballis (2009, p. 25) adds: “The evolution of episodic memory and mental time travel created pressure for the system to grammaticalize, involving the increased vocabulary necessary to refer to episodes separated in time and place from the present, constructions such as tense to refer to time itself.” As the reader already knows, not only am I in favour of this last opinion, but I have also proposed (see above, 7.2.2) extending the question beyond episodic memory to try to find out if the mere evocation of a current goal is exclusively human. But let us attend to the relationship between language and retrieval memory. I certainly agree with the relationship suggested by Corballis. Mental time travel would have provoked the appearance of linguistic resources to refer to time itself. However, I would add another suggestion in the opposite direction. Mental time travel would be more demanding than mere evocation: while mere evocation would only require symbols, mental time travel is, in my view, made possible by language. (I do not want to be misinterpreted on this point. I do not mean that our past memories consist only of verbal phrases, but that we could relive those memories by the combination of symbols and syntactic links.)

18.7.4 Which is really the important characteristic: Fully-constituted meanings or inner speech?

At this point, we must address a question we have left unresolved. We have entrusted the job of rescuing own past beliefs to syntactic links. Why then are we presenting this whole issue as though inner speech were the important element? Do we need to go back and erase everything we have been saying in the last few subsections? No, not exactly that, but we do need to proceed carefully. The activation of syntactic links would occur only when the child has achieved a good command of language. It would not matter for this activation whether there has been external speech or if, on the contrary, this has only been internal. However, command of language would indeed be essential. The link between command of language and inner speech is obvious. Inner speech is, as we have already said above, a late achievement. Therefore, when there is inner speech, command of language is without doubt also present. We would hold, thus, that the adult ‘It has gone’, both in internal and in expressive external speech, would successfully preserve the out-of-date scene in the mind, in

adults. It is impossible, however, to enquire about what inner speech achieves in children under six years of age: in them, there is as yet no inner speech. As a result, what still has to be set out is whether the small child’s external speech would succeed in rescuing the out-of-date scene. Gopnik & Meltzoff (1997) (within their general hypothesis about the ‘scientific child’) have related the creation of neologisms to how children learn new words. In the coining of a neologism and also in a child learning of a new word, in both cases, these authors hypothesise, a concept is being discovered. Does the learning of ‘Gone’ by the child perhaps signify the conquest of the ability to preserve already out-of-date perceptions in the attention? (In Bejarano [1999a] I wrote a review of Gopnik & Meltzoff [1997], in which I pointed out this idea). See Theakston et al. (2002, p. 797): “The form ‘gone’ is mainly used by children to encode disappearance”. Remember Ninio’s (1999) ‘pathbreaking verbs’ (above, p. 259). The first thing I would suggest is that the emotional discharge can only be channelled linguistically when the language habit is deeply rooted. Suppose a small child (less than 18-months old, let us say) sees a cat nearby. A moment later, when the cat suddenly leaves, the child is surprised and frustrated. But she will certainly not say ‘Gone!’ on her own account (In Bruner’s [1983] example, the child began to comment ‘Gone!’ at 14 months, but this word was limited to the particular and very frequently repeated game context). Instead, it is very likely in this situation that the adult will say, ‘It has gone!’, ‘The cat has gone!’. Here I would like to repeat a point I made in relation to the protodeclarative (in 10.4). The protodeclarative belongs not only to the child, but to adults too. 
Analogously, the expressive speech episodes which may be found in the three- or four-year-old’s egocentric language copy those which were found in the adult who addressed the child at an earlier age. There is a further point we might emphasise. The adult model that is used for learning ‘It has gone’ would have to accompany an emotional reaction in the child – a frustration which would have been awakened in the child when the cat left, for example. But let us continue with our thread. It is specifically with this type of adult message that the child would have succeeded in learning the meaning of ‘it has gone’. By understanding ‘It has gone’, the child would be discovering the role this meaning plays as a bridge between current perception and past perception. Such a bridge is, it goes without saying, what we have been looking for throughout the previous subsections. Clearly, this bridging function of ‘It has gone’ might not appear in the introspection of this meaning. However, we could consider this an example of the general deceit in which introspection traps us (remember 17.3). The meaning of the term can only be grasped by introspection if it is isolated. Therefore, when introspectively we want to find an evoked content in that verb, we see its subject walking. But this evocation does not really correspond to ‘it has gone’. The ‘has gone’ in ‘The cat has gone’ does not correlate to any perceptive content, because its meaning, I am suggesting, is just that of a bridge between perceptions – between the scene without the cat and the scene with the cat.

However, with the above, the maintenance of the past perception is still not under the child’s control. Although the child is soon led to this maintenance by the adult message, it will not, however, be until later that the task becomes intrapersonal. In this respect, the journey described by Vygotsky’s Principle, is travelled slowly by the child. Anyway, ‘Gone’ is soon acquired: it is a holophrastic ‘function word’ in Lois Bloom’s (1973) terms, a ‘first verb’, in Tomasello (1992), or a holophrastic ‘dynamic event word’ in McCune (2006). What is needed for the child to succeed in causing itself to maintain its out-of-date belief? By this time, the child’s linguistic meaning will have to have already incorporated the links or expectations of syntactic roles. In addition, the emotional discharge will also have to be channellable by language. Frustration, for example, will have to give rise in the slightly older child to a spontaneous ‘It has gone!’ This expressive-function speech may sometimes be external, sometimes internal. This alternation will occur not only in children at the transitional stage from six to seven years of age, but also in adults. As Vygotsky saw, expressive speech would be first external and then would be internalised. More and more often, it would be internal, as the expressive function episodes become ever more frequent, with the growing ‘naturalisation’, we might say, of the child’s linguistic habits. In spite of this, the possibility of external, uninhibited, expressive speech episodes is preserved in adults, I repeat. Whether the speech episodes are internal or external is not the question. The only thing that matters to us at this point is that we are dealing with expressive speech or an emotional discharge. Let us summarise the question of requisites. What is it a child needs in order to cause itself to maintain its out-of-date perception? 
That child needs to have incorporated the syntactic links into the meaning and also to have expressive speech. Both requisites, it should be noted, depend, in turn, on a specific degree of command of language having been reached. As regards expressive speech, it is clear that, for language to be aligned with phylogenetically ancient ways of expressing emotions, the habit of some linguistic elements has to be highly rooted. On the other hand, meaning, with its links, would be the culmination of linguistic learning.

18.8 From simply a secondary effect to a useful resource: The relationship between own past beliefs and tracks or numbers

The activation of links would recover the past belief. But we will be able to express this less imprecisely if we look back to what we saw in previous chapters. The first possibility condition for remembering own past beliefs is the double mental archive. This duality of archives, which is available for each object, would have been established through the perception of false beliefs of others. Therefore, when we speak about what causes own past beliefs to be remembered, we are referring, in reality, to what causes the dual archive resource to be mobilised towards its new function. It is this specific causal role that we have assigned to the activation of syntactic links.

(Clearly, we have to acknowledge at this point that we have still not made any hypothesis regarding our question of the historic origins of syntactic links. However, what matters to us here is that historically crystallised syntax – i.e., the syntax involved in all linguistic terms even though they may be isolated – would cause the activation of the syntactic links of each term used, and thus would succeed in causing out-of-date beliefs to be maintained. We are – remember – right at the middle of the ‘long version’ and there is still some way to go before we reach its final stretch). In principle, maintaining these beliefs served no purpose at all. As we said above, false beliefs of others have a nucleus of reality supporting them. It is true that John’s belief that I have just perceived will be false. Even so, it is a current fact that this belief is John’s. This belief is really in John in the same way that the picture card was really inside the letterbox in Mitchell & Lacohee’s experiment. For own past beliefs, in contrast, there is, in principle, no current reality that supports them. In short, the maintenance of past beliefs appeared as a simple secondary effect. However, even though this may have been so, past beliefs would then finally begin to be combined with other human capacities and would play varied useful functions. (Markowitsch & Staniloiu, in press, try to specify this utility. Certainly their suggestion is different from mine. However, they ask: “Does indeed the episodic memory through its intrinsic feature of mental time traveling play a main function in the survival, as it has lately been emphasized repeatedly?”.) We shall attempt to specify some of these functions. Could the maintenance of out-of-date perceptions underlie the comprehension of tracks? Vervet monkeys, according to Cheney & Seyfarth (1992), show no reaction to the eye-catching tracks left by predators. 
It is my view that this information fits, that it makes sense, within animal behaviour. Evolution has concentrated, we would say, on preparing animals, at every moment, for the following moment. The stimuli that announce a predator’s imminent appearance are perfectly perceived by potential prey. But a track is reliable evidence, not of an imminent appearance, but only of a past moment. As a result, it is not surprising that tracks should be an irrelevant stimulus for animals. As a result, it is logical that the ostensible track left on the sand by a snake, or similarly the bloody remains left by a tiger, should not awaken any expectation of a predator in the vervet. The associations between stimuli external to its own body never operate retroactively. To put it another way, if the bell were to ring after the meat, there would be no association. But imagine we accepted that, through a similar process (let us leave it at that, for the moment, a similar process) to the one we saw above (i.e. to a relieved ‘It has gone!’, plus the subsequent activation of the subject syntactic link of that ‘it has gone’, plus the selection of the ‘snake’ term), human beings came to remember the predator after it had left. From this, understanding of the tracks could then begin. More generally, a new ability to explain facts that do not depend on own conduct could thus begin. The absence of a connection with own movements, and the retroactive direction of the
nexus are the two characteristics that, together, would cause the type of association that animals cannot make. Let us look at this. On the one hand, it is known that the retroactive causal explanation of any result of own movements, as irrelevant as such a result may be, would be immediately accessible for animals. In very small children too, this forms the protocausality Piaget (1954) spoke about.10 Let us see what is on the other hand. On this other hand, the stimuli that have preceded an unconditioned stimulus (i.e. a tempting or feared stimulus in itself) may be located in the ancient resource of expectation (see above, 7.2.2). The proactive association toward the unconditioned stimulus between this unconditioned stimulus and external stimuli would equate to conditioning. The bell ringing becomes associated with the meat. Separately, thus, each of the two features is accessible to animals, both the retroactivity and the nexus between external stimuli. In contrast, the retroactivity in the association of external stimuli, that is, an association where the biologically relevant stimulus precedes the conditioned stimulus, would be a very difficult type of causal understanding. My point is that this type of causal understanding would require the maintenance of out-of-date perceptions, and would be an exclusively human type of causal understanding. Against Povinelli (2004), I believe that the key to specifically human causal understanding would not have to be sought in such sophisticated ground as causal explanations expressed in terms of gravity, or similar concepts (“unobservable theoretical entities”). Clearly, these linguistic and academic explanations are necessarily exclusively human. However, the exclusively human understanding of causality might be found in tasks similar to animal conditioning. 
If a causal understanding task requires perceptions to be maintained which have already been condemned by the implacable updating of perception, then this task would be inaccessible to animals. Even though it introduces the question of syntax of subordination, which we will only address in our final chapter, I would like to make reference here to how in “The streets are wet because it rained” a parallel can be found for what we said in 18.7.2 about ‘It has gone’. The meaning of ‘it has gone’ offers (this is what we suggested) a bridge to successfully recover the past perception, namely, the scene with the cat, from the current perception, namely, the scene without the cat. In a similar way, the ‘because’ would allow us to move from the current consequence (or resultant state) to the now out-of-date perception. Moeschler (2006) highlights how the order preferred in the linguistic expression of causal relations is reverse-chronological. First the consequence or resultant state is stated, and then the cause. This point made by Moeschler seems to me to fall within an essential nucleus of the relationships between cognitive abilities and syntax. The ‘consequence – cause’ order would reflect the cognitive journey which the meaning ‘because’ would be supporting. Thanks to the ‘because’, the out-of-date rain perception comes to play a role in the linguistic description of the current perception. In short, thanks to this type of meaning (‘it has gone’ or ‘because’, for example) out-of-date perceptions acquire in some measure what they entirely lacked before. Before, they had no adaptive usefulness nor did they connect in any way with the current reality. We have already said above that, in these aspects, past beliefs were much worse than false beliefs of others. Nevertheless, thanks to meanings such as ‘it has gone’, out-of-date perceptions, I repeat, would achieve their passport, first to survival and later to useful functions. We have just thought about the possibility that causal understanding of tracks is one usefulness of the recovery of out-of-date perceptions. Peter Carruthers (2002), citing Liebenberg (1990), emphasises the cognitive complexity of interpreting tracks. This complexity would have nothing to do with the simple fact of perceiving from above the walking of an animal which leaves a sequence of footprints in the snow. However, despite Carruthers’ emphasis, a serious obstacle to my suggestion seems to rise up from the field of historical origins: would a syntactic link be consolidated, or, at least, available in some way, before the ability to work back from the current track to the out-of-date scene which caused the track? In order to find our bearings in this question, we would have to set out some conjecture, at least, regarding the historical genesis of syntactic links. Precisely such a conjecture is the ultimate goal of this chapter. However, we shall see this later. For now, we must continue with our present train of thought.

10. In addition, in a fascinating experiment, Call (2004) shows that retroactive explanation produced in own prior experience (namely, that when a container with something inside is shaken, a noise will be produced) may be extended in chimpanzees to the results of a movement (of somebody else) which is just the one the chimpanzee wished to make at that moment. The capacity for attribution in chimpanzees that we addressed in Chapter 2 would have an elaborate derivation here.
In addition to understanding tracks, the maintenance of out-of-date perceptions might have other useful functions. Think of the numerical ability nucleus, properly speaking. This nucleus is perhaps related to the maintenance of past perception. How do human beings become able to break through the ceiling of the subitisation processes? It does not matter here if this ceiling is the ‘magic number seven’, or whether it is four or five. What matters is that such a limit disappears as soon as a non-subitisable set is formulated as the modification of another set which is indeed subitisable. As soon as a set has been successfully formulated as ‘the maximum subitisable plus one’, the seed of human numerical ability would be achieved. Clearly, it would have been necessary after this to arbitrate ordered symbols to designate sets (Saxe [1991]; Nunes, for example, in Nunes & Bryant [1996], and numerous other authors). Using fingers would be evidence that a symbol had been found. In addition, in order to generate a system that is indefinitely productive and, at the same time and nevertheless, accessible to the limitations of learning, the Packing Strategy is necessary (see Hurford [2007a]). However, the nuclear key itself of the human number would lie, I repeat, in the transformation of one set into another through the modifier ‘+1’. But, what has this to do with maintaining out-of-date perceptions?

Take the example of a sequential accumulation of elements one by one. In such a context, the maintenance of the immediately prior perception would be underlying true numerical ability. (See supra, 13.4.) This maintenance would also be necessary for success in Buytendijk’s task. (Luria [1979] insisted on the importance of this experiment, although I find his comment too vague.) In Buytendijk’s experiment, animals, chimpanzees for example, are placed in front of a row of containers. Then they see that a tempting piece of food is being placed in the first container. Immediately after this, without the animal seeing, the food is moved into the second container. When, eventually, the animal is allowed to move forward, it will run, of course, to the first container and will be frustrated. Then it is shown that the food is in the second container. When the episode is repeated, the animal will, of course, look in the second container. But, this time it is in the third. This will be repeated several times, from the third to the fourth, from the fourth to the fifth, and so on. The result of the experiment was that no animal was able to discover the rule that needs to be followed to find the food. We humans, from the age of three and a half, grasp the trick after only three or four attempts. ‘You have to look in the next container’ (or, more exactly, in the unopened container which is right beside the one where the food was the last time): in this way we might state explicitly what is obvious to us human beings. Why, in contrast, are chimpanzees unable to grasp this? I would say that animals are unable to explain the datum which surprises them and which frustrates their predictions.11 In other words, they are unable to formulate a specific container (the relevant container at a given moment) in terms of the belief they were holding the previous moment. ‘Container now = Previous container + 1’. 
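The contrast just drawn can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not a model taken from Buytendijk or Luria: the function names `last_seen`, `previous_plus_one` and `run_task` are hypothetical, containers are simply numbered from 1, and the three or four attempts that human children need before grasping the rule are left out.

```python
def last_seen(previous):
    # Assumed stand-in for the animal-type prediction:
    # search in the container where the food was last perceived.
    return previous


def previous_plus_one(previous):
    # The rule stated in the text: 'Container now = Previous container + 1'.
    return previous + 1


def run_task(strategy, n_containers=6):
    """Run the displacement task and return, per trial, whether the food was found."""
    previous = 1  # the subject sees the food being placed in container 1
    outcomes = []
    for _ in range(n_containers - 1):
        food = previous + 1  # unseen, the food is moved one container along
        outcomes.append(strategy(previous) == food)
        previous = food  # the subject is then shown where the food really was
    return outcomes
```

Under these assumptions `run_task(last_seen)` fails on every trial, whereas `run_task(previous_plus_one)` succeeds on every trial. The only difference between the two strategies is whether the out-of-date location is kept and modified or merely replaced by the latest update.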
This description connects with the human numerical ability.12 In this regard, it is helpful to stress that children’s success in Buytendijk’s task is simultaneous with the first hint that they understand numbers. But we also want to note here that this description connects equally with the task of interpreting tracks. According to my hypothesis, in human beings, and only in human beings, past perception or belief does not disappear but is recovered and may, via the opportune addition (the ‘gone’, the ‘because’, the ‘+1’) succeed in describing or explaining the current perception or belief. I acknowledge, of course, that this succession of hurried allusions is an unacceptable way of dealing with these questions. However, my purpose was simply to suggest how the maintenance of out-of-date own beliefs, despite originating as a mere secondary effect, could in the end have come to play a useful function. Let us summarise all this. How would past belief reach these useful functions? We have seen the same script repeated in the different situations analysed. Out-of-date perception becomes useful because, after a modifier is received, it becomes able to adequately reflect the current reality. The modifiers (the ‘because’, the ‘+1’) are what recover out-of-date beliefs from their original uselessness. Returning to the ‘it has gone’ (a more primary modifier), it is not correct to say that ‘out-of-date perception receives a modifier’. False beliefs of others, that is, the theme, would receive a modifier. The false belief of others is obtained beforehand, and can thus function as a starting point from which the appropriate modifier is chosen. In contrast, out-of-date perceptions, or, more generally, own previous beliefs, are recovered thanks to the modifier and in its wake. Let us open out this point.

11. The failure of predictions is in itself not enough for progress if the subject does not manage to explain the failure. In an interesting micro-genetic experiment, Amsterlaw and Wellman (2006) have shown that the best training children can receive in order to succeed in Maxi’s false belief test (or, as I prefer to say, to perceive the false belief of others via the difficult inferential route) consists of the researcher asking them to explain the character’s wrong (and, for these children, surprising) behaviour. The explanation would be the crucial step, and the one that separates animal-type predictions from exclusively human-type ones.

12. The trick, thus, lies in paying attention to the final whole, and also to the past stages. Engravings from 60,000 years ago found in South Africa (Middle Stone Age engravings) might be a practical and germinal exercising of this ability. Those who discovered those engraved stones assure us that the series of lines in one direction were completed before the series of lines in the other direction was begun (Henshilwood et al. 2002). Thus, the following step in every moment of the second series would be represented (represented in a line in the first series) before it is actually realised. In this way, the two series of lines would favour the great requirement of the numeric ability, that is, the maintenance of the former set into the following, or, said otherwise, the description of a set as n + 1. Note how this is a graphic representation which, unlike ours, shares the key characteristics of fingers. The fourth finger, for instance, is present before it is activated as a symbol; it is certainly unbent unlike the other three fingers that have already been bent, but it never stops being present and perceptible. Nowadays, it has become unstoppably fashionable to conjecture about those engravings. And as you can see, I too have fallen into the temptation!

18.9 Past beliefs and the composition of predications not based in theme and rheme

This contrast between the two types of false belief fits with the other difference we saw above which separates both types. False beliefs of others have belonged, we were saying, to the current reality since the moment it became possible to perceive the mind of others. In contrast, in the current reality there is no support for out-of-date perceptions or for previous beliefs. Consistent with this, we note now that the primary modifier has to be prior to the out-of-date perceptions. It is only when, thanks to this modifier or bridge, the out-of-date perceptions connect with the current reality that the out-of-date perceptions will be able to appear in the mind. In contrast, the order is the reverse for false beliefs of others. The modifier is selected after them. First, one has the theme, and only then is the rheme or modifier selected. This later selection of the modifier, I repeat, cannot occur for past beliefs that are one’s own.

Let us return to where we were a few paragraphs earlier. An ‘It is not here!’ or ‘It has gone!’ is emitted with no syntactic process and no predicative function. However, by activating the corresponding syntactic links, it could end up being a combination of the verb with its corresponding subject, e.g. ‘The snake has gone’. The combination between verb and subject would equate to a formulation of the current and no longer dangerous environment, in terms of the out-of-date perception of danger. The current environment, the environment where the snake is absent, has been described as the erasure in the previous perception. Here we would already have a predication that, instead of seeking to transform false beliefs of others, would be reflecting the transition from the latest perceptual updating to own past belief. We might thus say that the small nucleus of reality (and, on occasions, subsequent usefulness) of out-of-date beliefs originates after transformation of such beliefs through the modifier, or syntactic transformation in the wide sense. Contrastingly, the corresponding nucleus (of current reality) of false beliefs of others originates before syntax. Such contrast between beliefs of others and own past beliefs only confirms what we had suggested earlier. Where we have now described the reality or usefulness of the belief of others as previous to syntax, it had earlier been hypothesised as the original cause of syntax itself. As for own past belief, it is no surprise that we should have now located its usefulness as derived from syntax. Above, we had specifically opted for having its recovery depend on the activation of syntactic links. The first result of such links would clearly be the intrapersonal recovery of own past belief. However, a communicative use would soon be given to this combination between own past belief and modifier. 
Thus, instead of being based compulsorily on theme and rheme, predicative communication might state explicitly in a completely different way what is current at the moment of speech. Now, thanks to syntactic links, the speaker will be able to communicate the current reality through a combination of, on the one hand, her own past belief and, on the other, the modifier thanks to which the belief was recovered. It will no longer be necessary for the speaker to make use of the platform offered by the audience. Now, as a result of the consolidated syntactic links, the speaker herself will have been able to recover the content (the content that is modified) after the modifier and precisely as a result of it. The past belief is the content which one accesses in accordance with the expectations raised by the modifier’s syntactic links. Now that we have reached this point, the question with which we opened this chapter appears ever more pressing. We have been addressing syntactic links as something already constructed beforehand. In our current language, this is in fact the case. However, how would they have originated historically? It is time to tackle this advertised final step where we would converge with what we called the ‘short version’.

Chapter 18.â•‡ Expressive speech and syntactic links 

18.10 Expressive speech and disoriented recipients: The point of historical origin of grammatical links, at last

Clearly, speech with a merely expressive function is a type of speech that underuses linguistic resources. In this speech, the very condition of being learned and socially shared, which is characteristic of linguistic resources, is totally underused and is used in vain. It does not matter if the expressive speech is deployed muscularly or if, on the contrary, it is internalised. In both cases, it constitutes a type of egocentric speech, or ‘speech radically for oneself’. However, it is also clear that when it is not internal, this speech can have recipients, however much it lacks addressees. Imagine now a moment from the Post-Holophrastic Era when there may still not have been truly grammatical syntax, but only the simple theme/rheme composition; there would already exist, however, sufficient ‘naturalisation’ of linguistic signs to allow them to be used expressively. In this scenario, a sigh of relief or a gesture of frustration would already have totally automated or ‘naturalised’ linguistic alternatives. What would happen when this emotional relief was heard by recipients who had not witnessed the previous scene, that is, the scene whose sudden modification provoked the emotion of relief or frustration? It should be noted that in this case it is highly possible that the ‘recipient’s contextual inferences’ will not be at all sufficient. By definition, the scene relevant to the appropriate interpretation is here out of date at the moment the recipient would need to turn to it. As a result, hearers would show reactions of incomprehension. These reactions of incomprehension would normally follow these merely expressive uses of a word. As a result, the prediction of such reactions would have been associated with such a word in the mind of the speaker.13 This would be the seed of properly grammatical or abstract links.
In the same way, we would hold that those reactions of incomprehension and subsequently also the crystallised grammatical links would be the seed of the ability to remember out-of-date perceptions. The nexus between the two issues (the issue of links and the issue of memories) has given rise to what, above, we called the long version of our hypothesis. If we focus only on the journey from expressive speech to links, then we are looking at the short version. But let us continue with the point we were making. Where does the difference lie between the prediction of possible questions and the other links or predictions that may also be associated with a word? Or, put another way, why are pre-grammatical (theme/rheme) combinations of words not enough to establish grammatical links? I think there are good reasons to focus on expressive speech and the consequent reactions of incomprehension. The prediction of information requests would be applicable for all the episodes of expressive speech that use a particular word. Thus (unlike in theme/rheme combinations, where the closeness of two particular words may be linked to a specific single situation), the request for information would follow a word every time that word was used in expressive speech. This constant repetition might end up being crystallised historically in that particular type of link that could be aligned with a syntactic rule. We are hypothesising this, I repeat, for historical origins. But it should be noted that it might also apply to the acquisition by children. Reactions of incomprehension could occur both in the adult faced with expressive speech produced by the child and also in the child faced with the adult’s expressive speech. This would be a relief for the statistical processing through which the brain would succeed in abstracting grammatical links, that is, those links which give shape to the syntactic categories of verb and noun, for example. In short, towards those links there would be not only the statistical processing route but also this shortcut or alternative route. We now begin to recognise the peculiarity of the abstract and grammatical links that, in the picture we painted in the previous chapter, were too lost within the great cerebral edifice of meaning. However, now that we have reached this point, you may ask why I am concentrating only on information requests that follow expressive speech. Do I think hearers cannot react with incomprehension in any type of communication? Do I think it will only be in expressive speech when the hearer needs and asks for new informative clarifications? Certainly, I agree that this can occur in any speech no matter how deliberately the speaker takes his addressee into account. However, expressive speech would be particularly favourable to provoking reactions of incomprehension, both in the historic origins and during the child’s learning.

13. This would be a particular, primitive example of a much wider phenomenon. See Steels (2003): The speaker reenters the utterance to predict the effect it might have on the hearer.
Truly communicative speech that attends to the hearer, that is, speech that is not merely expressive, is characterised precisely by its attempt to, in some way, avoid such reactions of incomprehension.14 But there are also other reasons why expressive speech is ideal for causing precisely the reactions of incomprehension that interest us. Firstly, because, as we said above, this speech is linked to sudden alterations in the speaker’s beliefs, and, as a result, it cannot be adequately understood without grasping the speaker’s out-of-date belief, a grasping which, it goes without saying, is precisely the least likely to be reached through attention paid to the environment. Secondly, because the information demands are more abstract here than demands about details on how to carry out an order can be, for example. Thus, given this greater abstraction, requests for information would almost always be the same, and would be, as a result, ideal to succeed in crystallising in properly syntactic links. But perhaps we have not yet faced the main objection that can be made to the proposal of this chapter. I have demanded episodes of egocentric speech to initiate both the process which would lead to syntactic links and the one which would take us to the remembrance of outdated beliefs. “But – the objector would say – the episodes of speech are only necessary to give way to syntactic links. On the other hand, in order to put into action the process which would reactivate the outdated perceptions and beliefs, a natural sign of frustration or relief – a gesture, a sigh... – would be enough. Any of these natural signs could provoke the curiosity and questions of disoriented recipients. But the misguided insistence on explaining syntactic links and memories at the same time would have weakened the proposal.” What is my answer to the accusation of this objector? I would reply with another question: Is it possible for the recipient of a natural, phylogenetically old sign to repeat this message immediately as a message that was produced by his interlocutor and not as a mere production of his own (that is, of the one we are calling the recipient)? That repetition would be vital to make clear the signs of astonishment and lack of understanding which accompany it. But, I insist, could such a repetition appear for natural signs foreign to phonetic articulation and learning? ‘Quoting’ (see above, 13.2) could support the metarepresentation (that is, the grasping of others’ beliefs – or, more concretely, the grasping of the belief of the first speaker) thanks to the fact that the first message had been received in production-format and consequently in the second mental centre of the recipient.

14. The experiment by Olson (1970) is cited on occasion in order to characterise this attention to the hearer. This experiment demonstrated that the linguistic designation of a single object varies according to the situation or, in Olson’s words, ‘mental alternatives’. The object is chosen as a child watches (or an adult, it matters little), and the child has to tell a friend who has not seen the scene which one is the ‘correct’ object. The object is always the same. Only the other objects present vary in their shape and colour. However, the object in question will be described differently each time. This is, I repeat, often cited as a product of the attention to the hearer. I do not agree. My view is that the characteristic differentiating one thing from everything similar which surrounds it would already be stressed on the level of pre-linguistic perception and at the edges of communication. Think back (18.4) to what constituted the task where the chimpanzees were successful with the colour terms.
On the other hand, for natural signs there could not be true quoting, nor could there be found the crucial questions of disoriented recipients. As you can see, following my critique in 15.2.2 (and my clarification in 18.7.2) of the Vygotskyan treatment of inner speech, I am now picking up his ideas again. I still reject the idea that the selection of the nuclear element of communicative predications takes place in inner speech. However, I suggest that ‘egocentric’ speech (which is, as Vygotsky discovered, the ontogenetic precursor to inner speech) would have been involved in the historical genesis of syntactic links (apart from assisting learning of such links by children today). The merely expressive egocentric speech episode would be followed by requests for information on the part of the hearers. In this way, grammaticalised syntax or syntax crystallised in the code would have had its origins in the interpersonal or in dialogue, and Vygotskyan ‘interpersonal origin’ would thus be fulfilled on the historical level also. Starting from this origin, the progression towards the full establishment of syntactic links would have been a long route through history. At this point, we must remember that, when from syntactic links we derived the ability to interpret tracks, we found it implausible that this ability should have to wait until a fully grammaticalised syntax crystallised. Now, with the interpersonal and progressive appearance of syntactic links, this problem has been resolved.

We should review the overall panorama of this chapter. What interested us was the historical genesis of links. In the short version, that is, which has been set out in the present subsection, the hypothesised sequence is ‘expressive speech – reactions of incomprehension by hearers – association of these reactions to the term used in the expressive speech’. The sum of the term and the prediction of these reactions would give rise to grammatical links. In contrast, in the long version, that is, the version that has taken up most of the chapter, we have combined the question of links with that of the memory of out-of-date perceptions. The links as well as the type of situation where they have originated historically would be the cause of memory.15 It may be useful to highlight the difference between the two historical geneses suggested – the one we have just proposed for the recovery of own past beliefs and the one we proposed in previous chapters for the easy, non-inferential, perception of false beliefs of others. In the suggestion relating to false beliefs of others (and to theme/rheme pregrammatical syntax) only two interpersonal activity turns were necessary. Remember, although it takes us onto the level of ontogenesis, the example of the request for more blocks: the mother makes the request and the child perceives the mother’s false belief. In contrast, for the recovery of own past belief (and grammatical only-rheme syntactic composition), the activity would be spread over three turns: merely expressive or emotional discharge speech, incomprehension on the part of the hearers, and recovery by the first speaker of the perception or belief which he had experienced before the sudden change which caused his expressive speech.16 Here, with this contrast, I am focusing on the question that Casielles & Progovac (2010) or Progovac (2010) examine, but my answer is different from theirs. (See above, p. 244, Chapter 16, note 3.) 
These authors point out that these complex rheme-only structures (also called ‘comment-only structures’ or ‘wide-focus structures’), which in Spanish present the order Verb-subject, are better candidates (better than ‘theme-rheme structures’) to be “linguistic fossils” in Jackendoff’s (2003) words. Certainly in our fully grammaticalised language, a sentence such as ‘Llegó Juan’ (John arrived) is a more unitary and intonationally simple structure. However, I think that at the historical origin, these structures would have been later than theme-rheme structures. For children, as well as for people concentrated on another task or exhausted, it is easier to start with an echolalic theme and then add the rheme. Firstly, because producing a syntactic composition within a uni-member intonation is a task which is further away from the holophrase than when it is done in a bi-member intonation. Secondly, because the echolalic theme implies the distribution of tasks amongst speakers.

15. McCormack & Hoerl (2005) and Hoerl & McCormack (2005) suggest that parent and child together construct a temporally structured narrative that explains the influence of the past on the present. Here, we are making a similar suggestion (a suggestion of interpersonal origin), but focusing on historic and not ontogenetic genesis.

16. Cf. the ‘other-initiated self-repair’ that Forrester & Cherington (2009) study in children.

Let us leave the secondary question of own out-of-date beliefs. Our goal is fully linguistic meaning. How would the historic development have reached full language? We have a label for this historic process to which we are referring, the ‘primitive grammaticalisation of syntax’. Unfortunately, however, under such a label we can do no more than raise an enormous number of questions. Nevertheless, we shall devote the next chapter to giving thought to the questions to which our probably idle curiosity is leading us.

chapter 19

Historical grammaticalisation

The answers are lacking, but the questions are good

19.1 Theme/rheme syntax, and grammaticalised syntax: Suggesting two historic stages

We have made suggestions here and there in previous chapters regarding the historic development of language. Repeating those here would clearly be tedious. However, it may be helpful to sketch out a picture of where we might place them. Let us turn back to Section Five. There are two parts in the ideas set out there between which it will be helpful to differentiate. One is my reformulation of the concept of theme. For the speaker, we have suggested, theme is a belief different to her own, or, put differently, a second-order mental state. This false, or insufficient, or not up-to-date, belief is modified by the speaker through the addition of the rheme. At the moment, I am alone, I believe, in proposing this reformulation. Alongside this, however, I have also defended the theme and rheme structure as being the original structure of predication, and consequently, the original structure of all syntax. This idea about initial syntax is defended nowadays by many authors: Aitchison (1998), Tomasello (1999), Hurford (2006), and, more importantly, a generativist such as Jackendoff (2002). (See also Krifka [2008] who proposes a suggestive similarity between bimanual coordination and topic/comment structure). Let us address now this second and widely-shared suggestion. If we assume it is correct, the question immediately arises: how did we get from there to conventionally understood syntax? Clearly, theme and rheme structuring exists still in all languages. In this sense, it is a syntactic universal that can be observed (heard, as intonation) and not merely postulated. However, an enormous and highly varied set of resources specifying the syntactic role of the terms within each sentence has developed in each language alongside this structuring. As a result, the suggestion in Section Five, we have to acknowledge, dealt with only a small part of what would be the genesis of syntax.
What about the rest? The slow historic process that would have led to the highly varied syntactic resources of the different languages has become known as ‘grammaticalisation’: see Heine (2003), or Heine & Kuteva (2002), or Traugott (2008). There we have studies of many languages, but the syntactic resources appear already formed and constituted in every language which can be studied. Certainly, there are beautiful analyses of the


historic origin of specific elements: see, for example, the bleaching process that would underlie assorted future forms (‘I go and eat’, for example, for our ‘I am going to eat’; or ‘I get and eat’, for its counterpart in Chinese). However, the language taken there as a starting point is always a language which is already syntactic. Of course, one may ask if the mechanisms of grammaticalisation we see at work in historical linguistics would suffice for language origins as well as language change. In addition, we must pay attention to Jackendoff (2002) and his extremely interesting ‘fossil’ analysis: previous stages of human language would be present in the grammar of modern language itself. Based on evidence from child language, aphasia, early stages of untutored second language acquisition and pidgin languages, this author has proposed various protolinguistic “fossils”. There is the possibility that research will succeed, in the future, in making steps along these routes. But now, should I give up, and finish this point here? It may be helpful to specify what the genesis we are defending for predicative communication would allow us to deduce. Original predication, initial theme and rheme, would have arisen as a response to a prior linguistic message from the interlocutor: this statement, as the reader already knows, I find reasonably trustworthy. Beliefs of others would, in the beginning, always have to be embodied as speech acts. Rheme, then, at that time, would always correct or complete the interlocutor’s prior speech. This explains why in the beginning it was not necessary to specify syntactic roles. The belief of the hearer would be corrected or completed in the most obvious and relevant direction in each case. This direction, we can assume, the hearer was always able to guess. But then, and returning to our question, why would the resources specifying syntactic role have been created? 
All those who defend the grammaticalisation hypothesis are clear on one point. The process of grammaticalisation would have begun only when the need to specify the different syntactic roles was felt. What is not clear is why such a need would ever have begun to be felt. It is certainly possible to state that the need to specify the syntactic roles would have begun to be felt as the hearers became unable to guess the direction of completion. Unfortunately, however, this is almost a tautology. (Infra, in 20.3, we will return to this issue.) On this question of the genesis of grammaticalised syntax, it is possible also to bring in something similar to what has always been called thematic progression. Having come to the predication of response and to theme and rheme syntax, one of the interlocutors might take the whole predication as an element that it was useful to transform or complete. In this case, the old predication would have come to constitute a kind of prefabricated block, whose internal cohesion would be greater than the cohesion between theme and rheme.1 This prefabricated block would be available in

1. Undoubtedly, in all the theories of linguistics there are concepts similar to what I have called ‘prefabricated block’. We could perhaps bring in here the distinction established by some generativists between ‘articulators of the event’ and ‘participants’. “A non-referential reading is possible for the ‘articulators of the event’, but not for the ‘participants’. Note how in ‘When I saw


memory, not only in immediate memory, but also longer term, and could thus, in the future, be used directly as an element in a new composition. In this new composition, the specifying resources of the syntactic roles would become more urgent.

19.2 The influence of cultural learning on the cognitive abilities themselves

Before moving on to the more specific questions that we will examine in the following subsections, I would like to make an additional comment. Where in other chapters we spoke about a Holophrastic Era which would have been followed by theme/rheme syntax, we would now have to add a new achievement, the achievement of established syntactic links or grammaticalised syntax which all known languages present. Admittedly, we have no way of knowing when syntactic links became part of meaning. However, it is clear that this historic achievement must have been very important. It is worthwhile – this is my comment – highlighting that we would then have a striking new example of a principle on which we have been insisting. Cultural learning would be woven into not only the fruits of cognitive abilities, but also the very origin of those abilities. We have already made this suggestion with respect to the original perception of false beliefs of others, which, through the protodeclarative and disambiguation, would depend in the last instance on signs being learned culturally. Here, analogously, we are suggesting that learning syntactic links would be at the root of all the varied cognitive consequences that might be attributed to links as well as to the rapid intrapersonal recovery of one’s own past beliefs. Even though these applications of the principle are new, the principle itself has often been defended. It is, of course, authors with culturalist or anti-innatist leanings, and more specifically, those who favour the ‘extended mind’ à la Andy Clark (1997, or 2001, e.g.), who stress this principle. Culturally-learned signs would succeed in forming a cognitive niche. Donald (1991), in the field of writing and external memory, might fall within this trend. I repeat my endorsement of the general principle, although perhaps not some specific instances of the trend.
An argument that has been doing the rounds for years within this current of thinking is the argument about how symbols could increase the abilities of chimpanzees. Given that animals have been taught symbols to express the ‘It is the same’ or ‘It is different’ relationships between any pair of objects, they will be able to express the relationship of similarity or difference between these two relationships, that is, they will be able to master the second-order relationship, e.g. ((A ≠ B) = (M ≠ N)) or ((A = B) ≠ (M ≠ N)) (Thompson et al. [1997]). I agree with the importance of cultural resources

you, I saw my mother in your smile’, the ‘my mother’ element needs to be interpreted as a simple meaning or denotation” (Uriagereka, conference paper on clitic doubling, 2006). In this example, ‘I saw my mother’ would be acting along the lines of what, above, we have called a prefabricated block.


and symbols, but I have my reservations about this argument. We do not know how a chimpanzee understands each of the symbols that are related in the second-order operation. It is by no means clear that the chimpanzee is taking the final response as expressing the relationship between the previous relationships, and not merely between the intermediary responses. In short, the process could be quite different even with the same task. But let us go back to what is important: cultural resources are a prosthesis (let us use the old Vygotskyan phrase) which expands human abilities.

19.3 Evolutionary precursors for links?

Would there be any evolutionary precursor for this network of links in long-term memory? In the bibliography we find an idea that we might relate to this question. For any social primate, episodes linked to each of the other individuals in the clan constitute important information. These episodes would be associated with the pattern of recognition (in my terminology, they would constitute a network of expectations which would be activated as soon as the individual was perceived). Thus, it would be possible to believe that this mechanism in non-human primates readapted itself to the new purpose of storing the links within which words have been received. We find two slightly different versions of this presumed ancient mechanism in the bibliography. How are we to understand what I have called ‘episodes about each of the other individuals in the clan’? This is the point that separates one version from the other. On the one hand, in King & Figueredo (1997), it is suggested that the ‘representation’ of an individual would be connected to the ‘representation’ of its most typical behavioural responses. This connection would allow this individual to be characterised temperamentally, which is required in order to attempt to predict its conduct2 (the need for this requirement is strikingly apparent from a reading of the criticisms which Goldie [2002] has made against the predictive use of simulationism). On the other hand, it is possible to sketch another version of the same possible mechanism starting from the fact that “primates understand such things as the kinship and dominance relations that third parties have with one another” (Tomasello [1999, p. 17]). This understanding, which allows primates to make transitive inferences about allies or enemies, would involve a specific network of links associated with a particular mental content. As you can see, in this second version, the mechanism is more similar to general outlines.
Beyond the preference for one version or the other, my questions and doubts are more general. I ask myself if these primate expectations are, in fact, closer (closer, I stress, to meaning links) than other types of animal expectations. With this question,

2. Cf. Harris (2009, p. 30): “Children construct separate working models for each of the important people in their lives, and expectations developed in one relationship are not carried over to other relationships”. Cf. also Bem & Allen (1974).


it is not that I cannot decide on an answer; rather, I am unable even to wager one way or the other. However, I am placing on record my doubts regarding this possible evolutionary precursor.

19.4 Conative holophrases and verbal imperatives: Arthur Diamond’s hurried identification

Clearly, hypotheses about meanings and their formation in children are still a relatively vague, nebulous area. Even so, they do constitute a field of study, whereas we can say that there is simply no field of study for the historic process. Can we add anything more? I think it will at least be possible to refute the hypotheses made by some authors. To do so, it may be helpful to begin by repeating what we said about what presyntactic language would have been like after the protodeclarative. After the protodeclarative use of, for example, ‘mummy’, the shout calling for mother would no longer be confused with a request for maternal care. It is as a result of such disambiguation, we said, that it would have become possible to perceive beliefs of others for the first time. But this now disambiguated call to mother, for example, would not yet be a noun or a verb. As was made very clear by Structuralism, there is no noun (or verb or any other syntactic category or part of the sentence) if the other categories or parts of the sentence are missing.3 This is what we said at that point. Now we might ask about a new nuance of the disambiguated meaning of the holophrase. Is this meaning primarily about action, or not? We already know that, at the beginning, the only immediately and directly useful communication was conative, that is, calls, requests or orders (the protodeclarative use of words, as we know, would not be immediately and directly useful, but would have the function of facilitating linguistic learning). Conative communication is the type of communication that attempts to get hearers to perform an action and shape the world more closely to the speaker’s desires. The hearer will be asked to come, to do, to give, to listen, to go away... The nucleus common to any conative message consists of exactly this, of asking the hearer for an action.
As a result, we may think, the meaning of a conative holophrase would be primarily a meaning of action. However, I believe the correct conclusion to be precisely the opposite of this. For the precise reason that these holophrases had always to request some action or other, for the precise reason that this was guaranteed, I repeat, its meaning did not have to be a meaning of action. Meanings would be concentrated on clarifying where

3. More recently: Heine & Kuteva (2007, p. 300): “...an initial stage of nouns only. This does not mean that the first language had nouns the way we know them. The reason for that is simple. As Maggie Tallerman and Jim Hurford (p. c.) point out to us, one cannot talk of any distinct category until there is another category to contrast it with”.


the hearer would have to go (‘There!’, ‘Out!’), or what it was she had to give (‘Water!’), or who the addressee is (‘Mum!’), or how the action had to be done (‘Quickly!’), or what relationship the action requested had with what she, the hearer, was doing (‘More!’). The request itself for action would be implicit, or, more exactly, would be given through intonation. Let us bring in here some very well-known data. Unlike nouns, which have only their syntactic properties and not their semantic nucleus in common, “verbs can express only actions and states” (cf., e.g., Jackendoff [2003, p. 257]), or also the bridge between current perception and the past perception (supra, 18.7.4). The characteristic about nouns signalled there would be a consequence of the fact that the nominalisation process can be applied to any sentence that has just been used.4 But I am interested in what is pointed out about verbs. What do we derive from this? We derive that holophrastic meanings, if we accept what was suggested above, would not have culminated in verbs. When these meanings began to become germinal parts of the sentence, none of them would have opted for the category of verb. This would perhaps be consistent with the consideration of “infant nasalized demand vocalizations (of the kind noted by Goldman [2001]) as originally serving to name the recipient of those vocalizations” (MacNeilage & Davis [2004]; see also Falk [2004]). (Jakobson [1960] suggests an explanation for the prevalence of nasals in words for the female parent. See also MacNeilage [2008].) Clearly, in this context of questions about historic origins, invoking data about children always runs the risk of being as absurd a strategy as looking for keys under a lamppost even though they had been dropped somewhere else. Nevertheless, in this specific case, the lamppost and the keys might not be too far away from one another.
At least, it might be possible to think this if we heed Falk (2004) and MacNeilage & Davis (2004). At this point, we can delay no further in referring to Arthur Diamond (1959). Everything we have suggested in this subsection is in sharp opposition to this author’s hypothesis. At the beginning, language would be limited, Diamond says, to imperatives. As a result, he argues, imperatives are formed by the bare root of verbs. And for the same reason, the roots of the most widely-used verbs would be the elements underlying and common to all languages. There are two criticisms I would make of Diamond. The first, and less important, relates to the structuralist statement that we have repeated on several occasions in earlier chapters. There are no verbs until other parts of the sentence exist. Similarly, there would be no imperative mode until other verbal modes existed. I have said that this criticism is less important in that it would only require the terminology to be rearranged in order to save Diamond’s hypothesis. This is the shape things would take with

4. I am thinking of anaphora that refers to the previous utterance, and also of ‘nominalisation in the strict sense’. (Probably ‘nominalisation in the strict sense’ was a very late achievement: cf. Deutscher [2005, p. 248–251].) These elements would constitute a grammaticalised resource for what in 19.1 we call ‘thematic progression’.


such a superficial rearrangement. We should not at all say that there were verbs or imperatives in the Holophrastic Era. Until the grammaticalisation process develops, no words warrant these labels. However, after these clarifications, it would still be accepted that the meanings used in conative holophrastic messages would be action meanings. In contrast, with the second criticism this would no longer be the case. As the reader already knows, this subsection has been written precisely to formulate the second criticism. But in that case we have a task to face. What about the argument put forward by Diamond that the imperative is the bare root of verbs? Of course, Diamond’s linguistic erudition, as admirable and enormous as it undoubtedly is, comes up short for his goal of discovering universal semantic roots. Nevertheless, the data he contributes is more than enough to assure us that the identity between imperative and verb root is very frequent. As a result, we absolutely cannot pass over this fact. How, then, do I explain it? In order to account for this, I would look to the fact that languages have to make it easy for children to learn them (this is, I am taking the opportunity to mention it, an extremely important point5). Children today, surrounded by full language, can learn verbs quickly. At first, they will use these verbs to communicate conatively. In children, therefore, the imperative will be the beginning of their use of verbs. As a result, it is helpful for that verb form to be particularly easy to learn. That is why the imperative normally restricts itself to the bare root of the verb. However, this would not support Diamond’s suggestion about the historic primacy of the verb.

19.5 How, then, did verbs originate?

19.5.1 What can we say about verbs?

How, then, did verbs originate? We have just suggested that the original holophrastic meanings would not have been action meanings, but only meanings that clarified the action the speaker was requesting or ordering. After the Holophrastic Era, we have postulated a stage with response predications. We shall now explore those responses. Was it for the first response rhemes that action meanings would have originated? I am not overly convinced by this. The original holophrastic meanings might have continued being used with this new function. Why do I tend toward this conjecture? We should remember two points developed in earlier chapters. Firstly, in 15.1.2, we tried to bring together predications and orders. Predication, or, more specifically, theme and rheme syntax, would come to be an order that has to be obeyed in the mind of the hearer, not in the world. Secondly, at the beginning of predication it is highly plausible that speakers would not have succeeded in cleanly uncoupling the mental state of the

5. Christiansen and Chater (2008, p. 489): “Language has been shaped to fit the human brain, rather than vice versa”.


hearer and the message that has just been heard from the hearer. In the ‘Más, no’ with which the child replied to the request for more blocks, the child’s ‘más’ represented its mother’s speech act and her wrong belief together and at the same time (see 13.5). What can we deduce if we bring these two points to our question? In a predication of response similar to the one in the example, the action the hearer is asked to perform has to be performed by him at the same time on his mind and on his speech act. Let us go one step further. Given that the hearer’s speech act has been reproduced by the speaker on the articulatory-phonetic and semantic levels, we believe that the speaker’s predication will be almost necessarily understood as an order that must be performed on this speech act, the reproduction of which acts as the first part of the predication. We come thus to our suspicion. In these initial response predications it would not have been necessary to create any new resources to ensure that hearers understood the correct field in which the speaker has to be obeyed. As a result of the first part of the response predication, hearers will understand clearly that the field where they must obey the speaker is not the world. In short, the old holophrastic meanings would continue to be sufficient in the response predications stage. In order to say that someone has left, or has fallen, or that something is not present, it would be enough to add to the first part (that is, to the repetition of the message of the interlocutor) some one of the (conative) meanings inherited from the holophrastic stage. ‘There!’, ‘On the ground!’, ‘Out!’: these primitive orders would be successfully recycled as predicates as soon as they were said as the second part of a response predication. What, then, would the origin of verbs have been? We should remember what we said about the recovery of one’s own past beliefs. The sequence leading to that recovery would have been the following.
One, thanks to some deep-rooted linguistic habit, expressive speech would be produced in reaction to sudden and emotionally impacting changes. Two, incomprehension on the part of hearers. Three, the need to clarify to hearers what the expressive speech was referring to and, to achieve this, the speaker recovers her own previous belief. A further consequence flows from all this. The specific word which had been used as expressive speech will, from that moment on, be linked to a very specific prediction, namely, the prediction that hearers, to try to understand this expressive speech episode, would ask for more information. This type of prediction would become the original matrix of the meaning syntactic links. What has all this to do with the origins of verbs? The suggestion is that a meaning becomes verb meaning when it demands syntactic links such as who, or what. The form of the verb, the form it takes in each particular language, could only have originated when a content with such links appeared. According to this view, verbs would have originated primarily as the modifier able to convert past beliefs into current reality. We said above, in 18.7.4, that we should not trust the content evoked by the introspection of meaning in the case of ‘it has gone’. The same could be said of other verbs of momentary action such as kill, fill, break, get angry, clean, manufacture, etc. (Ryle (1949, p. 124) calls them ‘achievement verbs’. See Ninio [2008, p. 1]: “Children’s earliest concept of a linguistically encoded past event


appears to be an occurrence that happened just a moment ago but does not happen any longer. Such events are, for example, objects falling, breaking or blowing up, liquids spilling, actions finishing, or entities leaving the scene”. It is true that nowadays many verbs such as manufacturing or eating add to meanings of this type another that forces us to pause our attention on the process itself –‘I am manufacturing’, ‘I am eating’–. However, it seems plausible that communication with still limited means would concentrate on results and changes before focussing on the interior of a maintained situation. In short, the idea is that the meaning of the original verbs would be better identified as the bridge between the current and out-of-date perceptions. More than an action or state, the meaning of a very broad type of verbs is still today a causal relationship, which, like all typically human comprehension of causality, enables us to reach from the consequence to the recovery of a different and prior, which is to say, out-of-date, situation. In ‘The cat left’, the meaning ‘left’ is the bridge between the cat’s current absence and its past presence. Clearly, biological actions such as running (moving in a self-propelled manner, in short), eating and fighting are perceptive contents which are perfectly attainable for an animal, even when it perceives them outside its own species. Premack & Premack (2003) tell how a crow and a monkey reacted angrily when each one saw that the other was being fed first. Nevertheless, none of this impedes our point. The meaning of the verb “eat” is in no way identical to the content that animals succeed in recognising in a perception. On most occasions, the meaning of this verb is much more complex and much less perceptive: it provides the bridge between the moment the eater and the food are together, and the moment when the food will have disappeared. More in general, Gleitman & Gillette (1995, p. 
415) write: “The uses of verbs are often asynchronous with corresponding actions”. According to this view, the original verbs and words of action would not properly be evoking symbols. While nouns, whether proper or common, are instruments which allow absent objects to be brought into memory, verbs, on the other hand, would not have any real perceptive correlate. This, although it clashes, as I have already said, with semantic introspection, fits very well with the data that several studies, such as Goodglass et al. (1966) or Damasio & Tranel (1993), have managed to outline from the old dichotomy between Broca’s and Wernicke’s aphasias. The case is that verbs and words of action are generally represented in the frontal cortex, while proper and object nouns are represented in the posterior associative (and probably also evocative) cortex. This status as the transforming modifier of the past scene would be inherent to verbs. But at the origins, there were still (we cannot forget this) no meanings which had the bridging function which we awarded to a ‘has gone’, or, on a more complex level, to a ‘because’. The first step on the route that will eventually lead to the creation of such meanings would have been precisely the association of some holophrastic emotional meanings with the prediction of requests for information by hearers. Syntactic valency, the necessary expectation of some content that plays the subject role, for example, is undoubtedly a defining characteristic of verbs. What we are now


asking ourselves is if this would not also be perhaps the origin of the type of word that is the verb. Where would we have to look for the original matrix of verb meaning? Introspection says that it is in the production and comprehension of actions and movements. However, not trusting this report, perhaps we should look back to those other defining characteristics of the verb – their function as a causal bridge, and their associated expectations of questions.6

19.5.2 What linguistic signs would be chosen for this egocentric use? An unavoidable issue which must be dealt with

But if we admit that these words (those which would eventually become verbs) at first did not have any meaning other than that of expressing frustration, or relief or surprise, then we will have to explain why they were learnt articulatory-phonetic signs and not mere innate emotional signals (see above, 18.10, p. 299). Certainly, in expressive speech nowadays we can use verbs and these will get disorientated hearers to ask “who” or “what”. However, in historical origins, there were no such verbs that could be turned to a non-social expressive use. Are there any solutions to this problem? What linguistic signs and, consequently, primarily social ones, would be chosen for this egocentric and, consequently, non-social use? Perhaps the signs used to communicate solidarity with the success or misfortune of others. Although derived from emotional shouts, these signs would have acquired a learned articulatory-phonetic form. Otherwise, they could have been misleading for the hearers, who could have thought that it was the speaker himself who was successful or miserable. We must ask ourselves here: what communicative function would these signs have had in their primitive, that is, social use? An emotional-cooperative function, I would say. As soon as the pointing gesture and the four-hand-actions encouraged both cooperative perception and cooperative action, cooperative emotions must inevitably have emerged. Although in a highly institutionalized and sophisticated way, triumphant and plaintive chants are still remnants of that communicative function. But let us return to our argument. In my view, two possibilities must be distinguished. The first one focuses on the use of those signs to express one’s own emotions in egocentric speech. It is important to note the relapse or step-back that that use would involve.
I have just said that the articulatory-phonetic form of those signs emerged to indicate that the happy or unhappy event did not happen to the producer.

6. Let us focus on serial verb constructions, which occur in many languages, creoles or otherwise. (According to Tallerman [1998], the serial verb construction is not totally unfamiliar to speakers of English; see also the very suggestive Hopper [2008].) In colloquial Spanish, in narratives, it is very frequent that an apparently superfluous verb (e.g., ‘agarró/cogió’; English, ‘got’) is placed before ‘y se fue’ (‘and left’), or ‘y rompió the vase’ (‘and broke the vase’). These uses would probably make the process of transformation easier to understand by prompting the hearer to imagine the previous scene whose transformation the main verb describes.


Afterwards, however, with the growing ‘naturalisation’ of linguistic habits, those conventional signs would serve to express one’s own emotions in egocentric speech. Contiguity (with emotional situations) prevails over the origin (which is different from the direct emotional expression). This relapse phenomenon is in no way exceptional. At a different level, we can find it in euphemisms. Originally created to avoid a taboo expression, they end up becoming themselves taboo expressions. The second possibility is to allow the merely expressive episode to be of co-operative emotion (that is, that the articulatory-phonetic sign used in the episode works there with its original function). According to this possibility, we would not have to wait for any ‘naturalisation’ of this sign. The path towards abstract syntactic links (and also towards the keeping in mind of one’s own past perceptions or beliefs) could have started earlier than suggested in Chapter 18.

19.6 Conjunctions and relatives: Repeating the classic suggestion that they originated as a result of deictics

The label ‘grammaticalization’ tends to be identified with the genesis of meanings which could be described as grammatical (or not contentive).7 I have said nothing on this issue so far and there is little I can say. However, since I always have in my mind the specialness of deictics and how different they are from other meanings, even at the very moment of their being learned (remember 10.5.1), I feel tempted to repeat the classic suggestion that ascribes the origin of conjunctions or relatives to deictics (see Davidson [1968] for the conjunction ‘that’).8 The bridge could be cataphora, which, like anaphora, involves the transition of demonstratives from a spatial level to a symbolic level (that is, to the text as a space). In present-day languages cataphoras are very infrequent as compared with anaphoras, probably because the former have undergone a deeper transformation and become elements which nowadays are described differently, either as relatives or conjunctions. This is related to the only question that I intend to raise here: why would cataphora be more vulnerable than anaphora to transformation into grammatical (not contentive) meanings? In my view, the hearer’s prediction or the speaker’s intention shortens the experience of the interval between the cataphora and the later moment. When an expected effect (cf. 18.7.3) follows a voluntary action, the experience of the interval between these events is compressed in time, a phenomenon known as ‘intentional binding’. In

7. ‘Grammatical’ versus ‘contentive’, or also ‘lexical’ vs. ‘functional’, or ‘open-class’ vs. ‘closed-class’, or ‘categorematic’ vs. ‘syncategorematic’. (A commentary about these terminological options can be found in Tallerman [2009, p. 138].)

8. This paragraph would not have been written without the encouragement of Joaquín Romero.
Apart from this particular contribution, I want to thank him for his sound and to-the point comments during our long chats on linguistic issues.

 Becoming Human

addition, in the context of shared actions, intentional binding has been observed for other-generated actions (Strother et al. [2010]): This is further evidence to extend the suggestion to comprehension, not only to linguistic production. By contrast, that projection into the future is missing in an anaphora and not surprisingly anaphora has undergone a less dramatic transformation. But let us leave this issue and go on to the next one.

19.7 Heralding a return to firmer ground All these paragraphs appearing under the label of ‘historic grammaticalisation’ have not involved a single hypothesis. Everything has been reduced to questions, criticisms of some authors, groping around in the darkness. As I earlier announced, I have done no more than think about some questions that I was not able to take the sensible decision of passing by. This chapter could have had no other outcome, because the only real hypotheses I can suggest with regard to grammaticalisation relate to a stage which we have not addressed before now. This stage would be the stage of the genesis of the syntax of subordination. In the final chapter, we will attend to the question of how this type of syntax would have come from the hand of ‘reported speech’. However, before we do this, we must finish with the question of simple syntax. We have still not seen the other communicative function that, as with predication, cannot do without syntax. The following chapter will address, therefore, interrogative communication.

section seven

Syntax beyond predication

chapter 20

Interrogative communication In an earlier chapter, I connected syntax to the communicative function of predication. Syntax is clearly a requirement sine qua non for this function, and thus I have hypothesised that syntax would have originated to serve it. However, present in all language is another communicative function that similarly requires syntax: asking questions.

20.1 Characterising the interrogative communicative function 20.1.1 The successive definitions of the interrogative The interrogative function has its own linguistic form in every language. In spite of this, interrogative communication has frequently been forgotten or concealed by communicative function theorists. Bühler (1934), Cohen (1929) and Jakobson (1960) are good examples. Questions do not appear in Jakobson’s well-known classification. (Certainly, they are mentioned in the presentations that have been made of this classification in almost every textbook, but within conative messages.) Bühler, too, opted to assimilate questions to requests. Cohen, in contrast, preferred to define questions as a judgement in which one of the arguments has been substituted by an unknown element (a wh-question like ‘Who wrote Hamlet?’ is synonymous with an open sentence ‘x wrote Hamlet’). In short, none of these authors award questioning a standalone identity.1 Curiously, although each of them integrates it into a different function, Bühler and Cohen would each be signalling a real aspect of questioning. However, precisely because both aspects and both integrations do justice to questioning, we have to reject the concealment made by both Bühler and Cohen of questioning as an independent function. I suspect, as the reader will already have imagined, that the theoretical mistreatment to which interrogative communication has been subjected is the result, merely, of an improper understanding of predicative communication.

1. It is only fair to mention some authors, such as Meyer (1985) and (1988), who did point out the importance and peculiarity of the interrogative communicative function.

20.1.2 Questioning, predication and syntax There is an obvious relationship between predication and questions. All questions request a predicative response. Clearly, the converse rule ‘If there is predication, there was a question’ cannot be upheld. It is obvious that not every predication follows a question. Nevertheless, according to the model of predication I have suggested above, a predicative-function communication would always treat the addressee as if she were in a situation where she would have found it helpful to ask a question. As a result, predication and questioning appear to us like two sides of the same coin. What, then, about the primacy of predication in the genesis of syntax? Would this primacy have to be shared between the two functions? In order to attempt a response, we shall establish a separation within questioning, and leave to one side what are called echo questions.2 This type, as we shall see, is extremely suggestive, and will be studied later. For now, we shall concentrate on total and partial questioning. Total and partial questions need predicative syntax. It is clear that a total question, ‘Have they arrived?’, for example, asks for a ‘Yes’ or ‘No’ precisely about a predication –‘They have arrived’. It is equally clear that, in partial questioning, the unknown is integrated into a syntactic combination. But why should this necessarily be the case? I will try to show the reasons that would force interrogative communications to depend on syntax, or, in other words, the reasons for the primacy of predication in the origins of syntax.

20.2 Animal curiosity and human questioning Think of an animal which has found a refuge but which has still not entered it. It will go in slowly, paying attention to all the clues that might indicate whether or not there is any danger inside. The term ‘curiosity’ may appear excessive to many readers. This term has, in fact, too many connotations. I think, however, it would be unfair to deny to that animal an attitude of searching or inquiry. Let us compare our animal’s mental state with the inquiry a human being might perform in a similar situation. In principle, there would be no reason why the mental state of the animal and the mental state of the human being should necessarily be different. Both would be looking for some perceptive clue that would provide them with greater knowledge about the refuge’s interior. Where would the difference between the animal and the human being in this situation originally begin? My hypothesis will be that it begins with exclusively human communication. But let us proceed slowly.

2. From this mention of echo questions, we must remember the importance the ‘echoic’ concept takes on in Sperber & Wilson (1986) (this was the background to Origgi’s participation to which we alluded in 13.2). See especially Wilson (2000) and also Noh (2000).

Let us assume that a second animal, a cub of the first, for example, remains outside the refuge, and does not go in. It is clear that this second animal will flee if it sees its mother fleeing from the refuge, or even earlier, when it hears the cry of alarm. What exactly does the cub’s waiting have to do with questioning, and the mother’s cry of alarm with a predicative communication? The question is all the thornier as there are reasons to believe that chimpanzees detect when a fellow chimpanzee is or is not seeing a specific visual field (remember 2.2). Chimpanzees would indeed have this ability, as I was already inclined to accept. However, what does this ability lead to? Chimpanzees are not observed making any pointing gesture to communicate the direction in which the uninformed chimpanzee should look. We addressed this in Chapter 4. If we now raise another aspect of this question, namely, whether it is or is not the case that the chimpanzee which has remained outside the refuge would be able to request information from the other chimpanzee, we meet the same issue once more, namely, we find a lack in chimpanzees similar to the one we were looking at in Chapter 4. In my view, such communicative interaction would be impossible as long as the second centre in one’s own mind has not been implanted. The interiority of others is detected by the chimpanzee, but without becoming radically and intrinsically different from its own interiority. Thus, since the interiority detected is not truly different from one’s own interiority, the subject cannot communicate with it, and, likewise, all communicative signals will be signals in which the recipient will not see any of the producer’s mental states. In short, according to the hypothesis, for interrogative communication to arise, perceiving an interiority (kinaesthetic or visual, or any other type) that is radically different to one’s own interiority will be an essential requirement.
The basic human capacity which we have been sketching out from the very outset would, thus, be involved in that communication, just as in predicative communication. The mental double line, which unfolds on different levels, would be the nucleus which makes it possible. We had already seen all this in previous chapters. As you will have noted, I have restricted myself to repeating it half-heartedly and with an excessive number of omissions. Why, then, have I raised the question of the mental requirements of questioning? Because something more is needed for the perception of the difference between my ignorance and the knowledge of others to become an interrogative communication. This second requirement will move a little away from previous chapters.

20.3 The need to be able to say what one does not know I know which aspect or thing I want to know, but inevitably I lack the appropriate word to designate the thing or aspect in question. Neither proper nouns, nor properly contextualised common nouns can resolve this problem. I cannot give a name to the thing
I do not know. But if I do not succeed in designating the element I do not know, I will be unable to make my interlocutor aware of what I want him to tell me. Plato’s old paradox emerges here. His attempted solution, the anamnesis that he used as an argument in favour of his Theory of Ideas, is inadmissible here. Clearly, in a different field (specifically, the field of innate consummatory patterns), we accept an analogue of Plato’s solution. Even though, instead of speaking about ‘memories of a previous life in the higher world of ideas’, we have to refer, of course, to ‘inherited patterns which phylogenesis would have been progressively constructing’, Platonic anamnesis undoubtedly continues to be of use in this field (remember 7.2.1 and also 1.4.1). However, we must look to a different solution to Plato’s for the seeking that will be expressed in interrogative communication. Such a solution is apparent in equal measure in partial questioning and in the description (relatively similar to ‘attributive, or non-referential’ description, in Donnellan’s terminology – Donnellan [1966]) that would be used to nominalise what is unknown to it. With both the focus of the partial question ‘What has the child eaten?’ and the attributive description ‘what the child has eaten’ the speaker is able to designate perfectly the object which she does not know (we might remember, here, the paraphrase which many logicians used to make of questioning –‘The child has eaten x’. Although the request ingredient is concealed, this aspect is, in contrast, successfully illuminated). Using another example, the partial question ‘Where is the conference being held?’ and the attributive description ‘The place where the conference is being held’ are both based on the same resource, and both involve the same solution. The point we are interested in is that this solution necessarily has to be supported in syntax. 
The unknown element can only be designated through the description of the specific syntactic role it occupies. The x in ‘The child has eaten x’ there attains a univocal designation (Undoubtedly, it is univocal only within a specific context, with a specific child and a specific time, but this is not important. Absolute univocality almost never occurs in language). Thus, what is unknown has been, I repeat, successfully named. In contrast, how would it be possible to perform such a task having only simple designators? If there were no syntax, the task the question must perform would be an impossible and self-contradicting task – to say what one does not know how to say. But if we have syntax, then the specific syntactic role of the unknown element can be described and we will be able to perform the impossible task perfectly.3 (Thus, the need for more varied resources specifying the syntactic roles, i.e. the need for more grammaticalisation, would have been provoked by partial questions. Let’s think about how inefficient ‘Hunt x’ could be in a pregrammatical language. The recipient would not know whether to answer ‘John’, or ‘rabbit’ or ‘spear’: see supra, 19.1.) In this way, syntax must already be established before interrogative communication can appear. While in earlier chapters we suggested that the appearance of predication and of syntax were the same thing, questioning, on the other hand, must be considered as being secondary and subsequent to that double and simultaneous appearance. But we should give voice to an objection at this point. ‘The price of that coat’, ‘the height of that building’, ‘your home’: all these designations could be used perfectly to nominalise an element unknown to the speaker (i.e. could be used in an ‘attributive’, ‘non-referential’ way). However, contrary to what we have been saying, these nominalisations of unknowns involve no syntax – or, at least, no predicative syntax. We must note that these designations could even be the object of a request or conative message. ‘Give me your date of birth and contact number, please’, ‘Your home, please’, follow the same pattern as ‘Give me a coffee, please’, ‘A light, please’. What happens then to the involvement of syntax that we considered an indispensable requirement for a question? I think this objection can be refuted. Note the nouns used in these simple nominalisations of something unknown. These nouns are extremely special. In the following chapter, we shall attempt to trace their origin, and will make a suggestion about how concepts ‘of dimension’ may have been attained. For now, it will be enough to place the Spanish verb ‘saber’ (know) in front of them. In Spanish, the verb ‘saber’ has different links from the verb ‘conocer’ (know). ‘Conocer’, which suggests more intimate personal knowledge or experience of something, can be used in examples such as ‘I know Manolo’, ‘I know Madrid’, ‘I know your car’ [‘Yo conozco a Manolo’ ‘yo conozco Madrid’, ‘yo conozco tu coche’], while ‘saber’, which shows knowledge of facts, cannot [*‘yo sé a Manolo’, *‘yo sé París’, *‘Yo sé tu coche’]. However, there are also some nouns that are grammatical as the object of ‘saber’. ‘Yo sé el precio de este abrigo, la altura de ese edificio, la edad de Mercedes’ [I know the price of this coat, the height of this building, Mercedes’ age]. Similarly, only these nouns are correct as the object of ‘decir’ [‘say’ or ‘tell’]. Remember the requests to which we alluded earlier. Without any difficulty, we can place the imperative form of ‘decir’ in front of these. Think also of texts of arithmetic problems. ‘Say the height of the intermediate building’ or ‘Work out the number of cakes bought’ may very easily constitute the end of such texts. It is clear that these nouns form an extremely special class of nouns, and it is possible to suspect that they appeared late in history and derived from syntax (infra, 21.4.3). Therefore, the lack of syntax in those nominalisations of unknowns would have no reason to oppose our suggestion. The idea that questioning necessarily requires syntax can remain in place.

3. I wish to point out here something I will not develop at any point in this book. It relates to this astonishing resource that enables precisely what is not known to be designated. We have addressed its role in interpersonal communication. But this may be only the first of its uses. ‘How many cakes were bought?’ or ‘Work out the number of cakes that were bought’ is clearly the way the statement of a problem communicates the object or goal of the problem to its readers. However, ‘the number of cakes’ will not be the only description of this type to intervene here – in the problem-solving process, I mean. It is clear that in the intrapersonal process of solving any problem of this type one description will have to be substituted by another at each step (‘The money which had been earned less transport costs’, for example) until it can reach a description which finally connects with the data and can, consequently, cease being simply attributive and become a referential designation. At each jump, new links, which will be chosen from among the episodic links this context has assigned to the elements, will come into play (“Flexible thinkers use more abstract representations than perseverators”: Kharitonova et al. [2009]). All this, I repeat, is an easily observable fact (at least, in arithmetic problems of the kind we have seen, where the context is closed). The question that would have to be asked is to what extent the general characteristic of the resource optimised for interpersonal questioning might be related to the intrapersonal processes of creative problem solving. Let us make some comment on this. In animal conduct, expectations themselves attract and select the means by which they will successfully be satisfied. Properly speaking, there is no searching, just the simple appeal to a chain prefabricated by experience (i.e., by conditioned associations). In contrast, finding for oneself the means of satisfying the empty and expectant profile posed by attributive descriptions is a very different and immensely more difficult thing. Now (in creative problem solving, that is), there is no previous experience of satisfaction. How then does the human brain succeed in finding the means that are appropriate at each moment? Might the process of producing an attributive description for interpersonal use be illuminating for those difficult and creative intrapersonal processes? (It is to this question that I alluded above, p. 234, Chapter 15, note 5, and also in 17.5, when I stressed the value which semantic or episodic ‘links’ might have for high-level cognitive processes). Might the interpersonal process have been the origin of the others?

20.4 Predication, questioning and ‘Theory of Mind’ 20.4.1 Merely insufficient, but not false, beliefs From the point of view of the ‘Theory of Mind’ or perception of beliefs of others, we can establish a relationship of inverse similarity between predication and questioning. In communications of both types the speaker needs to pay attention to the recipient’s level of knowledge. However, while typical predicative communication is made when the level of the knowledge of the interlocutor is found to be insufficient, questioning, in contrast, requires the speaker to conjecture a level of knowledge in the recipient that is superior to her own.4 From the one who knows to the one who doesn’t, on the one hand; from the one who doesn’t to the one who knows, on the other: this is the inverse similarity. But we need to nuance this inversion of questioning in relation to predication. In predication, the party that doesn’t know (always from the speaker’s perspective) will on some occasions simply not know, but will be wrong on others. The belief of others involved in the ‘thematic element’ (or theme) of the predication may be either insufficient or erroneous.

4. Exam questions are a well-known exception. Originally, however, only children would have been asked these questions. (With these questions, adults imitate the questions that the child could formulate. We should remember the similar cases of the protodeclarative and quasi-expressive speech episodes made by adults to children: see above, 18.7.4.) In Vygotsky’s study (see Luria [1976a]) on deductive reasoning where illiterate communities were compared with formally educated ones, we see examples of how the test questions (exam questions, in short) surprised the subjects in the communities unused to any form of schooling. Returning to the point that we were exploring, I think that the interrogative linguistic form would not have originated either for exam questions or for ‘indirect requests’.

In contrast, the questioner can never conceive the unaware party to be wrong, since they are one and the same. The own current beliefs can be felt as insufficient but never as wrong as long as they continue to be current. There is a matter it is useful to insist on here. The questioner is conceiving her belief merely as an insufficient belief. Does this depend on the ability of metarepresentation? Or, on the contrary, is this similar to the hesitation and search for new information we attributed to animals? This is certainly a controversial issue: Smith (2005) and Call (in press). But I would tend towards replying as follows. Animals, of course, look for new perceptions to guide them. However, in my view, there is no reason why this should mean that they are judging their current perceptions as being insufficient. They simply look for new realities to perceive. As far as they are concerned, their current perceptions have to be supplemented not because they are mental states different to reality, but because they relate only to a section of the environment (see 15.2.1 above). In the questioning speaker, in contrast, something very different occurs. As a result of syntax (i.e. as a result of what is unknown being designated in terms of data), own current belief appears as a mental state that does not completely cover the reality to which it points. In a perception, any details not present are simply not present. It is only with the establishment of syntax that what is not included in a scene or perceptive content can appear in that same scene as an empty hole. Turning to Piaget will help us in commenting on another aspect of the relationship between questioning and the ‘Theory of Mind’. Although he does not deal with language in the majority of his work, he makes, however, the valuable and subtle observation that children mistake adult questions for requests. I will give an observation of my own as an example. 
‘Where are your boots?’, a mother says after putting on her child’s socks. The child (this child, let us be specific, will come out of the holophrastic stage within a month) can see the boots perfectly well from where he is and, as soon as he hears his mother, fights to stand up and go over to them. The mother, who does not want her child to walk barefoot, tries in vain to stop him. The child picks up the boots and brings them to her. What makes this intriguing is that, at this stage, the child was perfectly able to point using his finger to make a pointing gesture, both as an imperative and as a protodeclarative. Why did he not reply with this gesture? Piaget was probably right. The child was not understanding the question as a question, but as a conative message. Of course, linguistic understanding was still very limited in that child. The interrogative adverb ‘where’ was obviously too large for this level of the child’s understanding. While I repeat that all this is true, we should not reject, however, the possibility that the problem lay in the interrogative function itself. It should be noted that, at the end of the day, both the routine of the operation as well as the mother’s searching looks might have enabled the understanding of the term ‘where’. Therefore, the problem may have lain in the type of communicative function. In order to comment on this possibility, let us compare the results of two adult messages (the error-revealing request for more blocks – see 13.2– and the question about the boots) on the children involved. The mother’s ‘More’ revealed the mother’s

false belief to her child. In contrast, the question ‘Where are your boots?’ does not succeed in revealing the mother’s insufficient knowledge. The child, instead of perceiving in the question its mother’s mental state as a mental state, interprets it as a simple request. Certainly, the ‘more’ was easier to understand than the ‘where’. We have seen this already. However, did the difficulty lie in the words of the mother’s question or in the interrogative function itself? I would say that, after the clash between the box without blocks and the mother’s request, the rupture, or splitting in two, of the mental archive was easier – easier, it should be understood, than after the reception of the question about the boots. At the beginning of perception of the mind of others, the clash or frontal opposition would afford the best results. The tiny difference between the two levels of knowledge would be unable to set this type of process in motion for the first time. (We had already referred to this idea in 13.2.3, where we related it to a question raised by Reboul. But we are using it here to examine the relationship between predicative and interrogative communication). We might say that this point suggests a second route by which to reach the anteriority or primacy of predication over questioning. The first route consisted in highlighting how the question, as it says what is unknown, could not have arisen if it did not have, first, an already established syntax. Something unknown can only be communicated through the description of a specific syntactic role. The second route, in contrast, relates to the situations which make it easier for the mental archive to break (in other words, to split in two) and which would be present, therefore, at the origins of predicative syntax. This rupture would be easier after the clash with the false belief of the interlocutor than after the reception of a question. 
Certainly, in this question, the interlocutor would have explicitly recognised the insufficiency of her knowledge. However, this is not as efficient as that clash in producing the rupture. It is clear that the two routes fit well together. If in the first we were suggesting that the production of questions depends on syntax, now, in the second route, we are considering that the rupture that gave rise to syntax could not have been caused initially by the reception of questions. Both routes would lead, I repeat, to the same result. Should we leave it at this, then? I would prefer us to go a little deeper into why things would have to be like this in the relationship between predications and questions.

20.4.2 The required contrast with the longed-for sufficiency: Going deeper into the difficulty of partial questioning In previous chapters, we saw that, in comparison with insufficient beliefs, false beliefs have less need for syntax when they are communicated to the person who will judge them. At its limits, a person’s false belief, we suggested, may be detected in a conative message from that person, as long as the disambiguation (e.g. between request and call) was already established in that message. In contrast, for an insufficient belief of others to be detected, such austerity of linguistic wrapping would not suffice. We wish
to focus at this point on the greater syntactic demands of insufficient beliefs and, at the same time, on their lesser ability to cause the rupture of the mental archive. What is it that underlies this? The possibility we shall explore in this subsection is that insufficient beliefs of others could only cause the rupture when they (i.e. the insufficient beliefs) finally come to be seen as false. At first glance, this may appear to contradict what we have been saying. Had we not been emphatic in establishing the distinction between perception of false beliefs and perception of insufficient beliefs? How is this compatible with our stated suggestion? Let us proceed slowly. Let us think of some insufficient beliefs of one’s own: I know there is a conference today, but I don’t know what time. As has already been said, one’s own current beliefs cannot involve any element one knows to be false. But if I, the speaker, wish to ask, then I will have to contrast my beliefs which were insufficient with the synthesis which we might describe as ‘the beliefs themselves with an addition’ (even if I still do not know what that addition should be). My point is that, alongside that synthesis, the insufficient belief will immediately appear as the belief that is no good for guiding conduct; in short, it will appear as the false belief within the pair. If, on the other hand, this synthesis is not achieved, then no such contrast will occur and, as a final consequence, there will be no rupture, or splitting in two, of the mental archive. As you can see, we are insisting here on an idea we already put forward just before in 20.4.1. An animal’s perception or current belief, we hypothesised, may be more or less encompassing, more or less detailed, but in itself it will never appear to the animal to be an incorrect reflection of reality.
This absence of metarepresentation regarding own perceptions would also occur in humans until the intervention of syntax. Only when through the working of syntax I succeed in making explicit what I have not noticed within my perceptive field, only after my cognitive lacunae have been made explicit, will my own perception appear to me as a mere mental state. This is what we had said in the previous subsection. Now we have added that if I succeed in judging one of my perceptions or beliefs as insufficient, this will be because I will have contrasted it with the correct sufficiency, even if I still do not know what this sufficient knowledge consists of. This contrast with the correct belief is what characterises false beliefs. As a result, we can say that when the questioner presents his insufficient beliefs in his question, at that moment such insufficient beliefs would have become false. However, this only occurs, I repeat, because syntax is able to build, even though only as an empty profile, the correct sufficiency to oppose those beliefs. We can set out the above more precisely. Let us ask what causes the child not to understand the ‘Where are your boots?’. Or, more precisely, let us ask what causes the child to be unable to surmise the meaning of ‘where’, while it can, on the other hand, guess other meanings. We would have two replies to this. Here is the first. In order to guess the meaning of ‘where’, it would be necessary to have the expectation of an

addition that specifies the whereabouts of the boots.5 But the child does not yet have any expectation of such an addition. Let us move on to the second reply. The child is not able to surmise the meaning of the ‘where’ because this would mean the understanding of the ‘insufficient belief of the interlocutor’ – the understanding of the ‘boots with unknown whereabouts’. We can now set out our point once more. Both replies would coincide deep down. In other words, only when the child can conceive the real boots as the synthesis of ‘boots plus the addition of the place where they are’ does it succeed in rejecting, and placing in a different archive, the idea of the ‘boots with unknown whereabouts’. The expectation of the addition is the key both for learning the meaning ‘where’ and for perceiving the insufficiency of the mother’s knowledge about the boots. The addition, the synthesis, in short, syntax, is what would sustain the understanding of the interrogative function (not only production but also understanding, which is what interests us at this point). Let us return to the comparison of the mother’s false belief about the blocks. Why was this false belief easier to perceive than the insufficient belief about the boots? The mother’s request clashed, without requiring any mastery of syntax, with the reality the child knew. In contrast, the rupture with the mother’s insufficient belief about the boots would have required greater effort from the child – it would have required him to conceive the synthesis ‘boots plus place’. However, (we might object of ourselves) why will the understanding of the mother’s request for blocks not have the same requirements? Are not two syntheses (‘blocks, yes’ and ‘blocks, no’) present there also? Contrary to what this objection assumes, I believe the child would not have to make any synthesis about the blocks. Let us begin with the denial of the blocks. Perception is what it is, and it does not need to deny what is not present. 
An absence can impact emotionally on an animal, but this does not authorise us to attribute to this animal the denial of the expected presence. We have already hypothesised this on several occasions (above, 18.2.3). What about the ‘blocks, yes’ synthesis that the child would have to attribute to its mother? I would like to make a comment here on how incorrect the logicians’ point of view can be if it is applied to this question. “A request does not state the presence nearby of the element requested; therefore this presence will have to be inferred through an additional effort of reasoning”: this is what Logic would report. But this report can only be applied where it has been stipulated that nothing not explicit should be accepted, or, more specifically, where inferring the presence and inferring the predication stating it have been identified. Therefore, given that small children, and normally adults too, operate almost completely on the margins of explicitations, we must distrust such a report. With a view to the child succeeding in splitting in two its mental archive about a reality, the mother’s request would be in reality much more efficient and facilitating than either explicit questions or even predications. The child simply imagines (as a result of Saussurean parity) the appropriate situation to request, and thus attains the false belief contrasted with reality. In contrast, in the case of the boots, the detail that for the mother they are somewhere unknown is, I repeat, more difficult to perceive. This detail will be impossible to perceive unless the syntactic synthesis ability is already present.

Let us sum up. Questions, both in their production and reception, have high requisites. Syntax has to be established beforehand, and so too (this is a more concrete description of the same requisite) does the idea of ‘insufficient belief’ or, in other words, the idea of the contrast between a belief and the synthesis formed by ‘that belief plus its correct completion’. Questioning is, thus, demanding. But what I wish to add is that its demands are well worthwhile. Think of how children quickly succeed in replying to partial questions concerning already out-of-date facts (we have suggested some cognitive consequences of this retrieval: see above, 18.8). The reception of these questions would cause a highly-profiled expectation that would guide the child in choosing the correct instrument of evocation. We have already spoken about this in 8.9.2. The new development that is of interest here is the sophisticated construction of expectation. Expectation does not arise in the child; instead it is received linguistically via the sophisticated syntax used by partial questions. However, once activated, this expectation would behave like any other expectation and would operate as motor and selector of the appropriate conduct.

Thus far we have been stressing the link between questioning and syntax. It is now the moment to acknowledge that there are questions for which this does not hold. It is time to address echo questions.

5. It would be necessary to have this expectation. “Ahead”: See Fernald & Hurtado, 2007. This article describes the long historical trajectory of this idea, and cites relevant bibliography.

Chapter 20. Interrogative communication

20.5 Echo questions

20.5.1 The lack of requirements in echo questions: From echo questions to predication

In these questions the speaker echoes the message (the complete message, or part of it) she has just received. The echo is articulatory-phonetic, but not intonational. The intonation will always be, of course, interrogative, independently of the intonation of the original message. The questioning itself may relate to any of the levels of receptive processing. If the recipient has not heard the message clearly, then her echo question will seek to ensure the articulatory-phonetic reception. If, in contrast, it is a case of the recipient being unaware of the stipulated meaning for this articulatory-phonetic pattern, the echo question will have an effect on this level. Lastly, it may be the case that the difficulty lies in assigning contextual reference. But we should address a characteristic that is general for all echo questions. They all suppress any intonation and communicative force present in the model message.

An order repeated as an echo question will cease being an order, a call will no longer call, an exclamation will cease being an expressive relief. García Calvo (1983) places much importance on this fact.6 We have already mentioned Sapir’s ‘rule’ (the less the force of each term, the greater the linguistic product that can be achieved). The ‘bleaching’ label is equally appropriate. In echo questions, the term of the interlocutor is cleaned and stripped of all the communicative force with which it had been emitted. Emptying or bleaching occurs even if the original message was itself a question. What the echo questioner is then asking is a metaquestion; it is in no way the same thing as the original message was asking. The communicative force of the model message always evaporates in echo questioning. At the beginning of this chapter we saw the strong requirements of partial interrogation. Addressing echo interrogation now, we note the enormous difference. No syntax is now required beforehand. But we will be able to take a further step in this direction if we go back to a specific case of echo questioning. We were catching sight of this idea in earlier chapters when we viewed the most primitive predication. At one and the same time, this predication would reject false beliefs of others and speech acts of others. But this rejection would begin by being a repetition (echo repetition) of that speech act. In the specific echo questions in which an interlocutor’s error-revealing message is repeated, the content of that message would not only be losing its original communicative force. It would also be becoming a false belief of the interlocutor. These cases are certainly very interesting for our web of suggestions. However, I think it will be helpful to study echo questions more generally.

6. It was precisely when I read this author that I began to get interested in this type of questioning. In general (I will take this opportunity to state this) García Calvo’s two volumes Del lenguaje [On Language] were extremely important in my early training, and even in my vocation, I would say, thinking back to my early days (for something on this, see Bejarano [1993], a review of an anthology of García Calvo’s work). Having started to talk about my training, I will have to mention the factor that was essential and decisive far above any other. My reading of Sánchez de Zavala’s books was not just one ingredient, it was everything. I did not know him then. Quite simply, I came across Indagaciones praxiológicas [Praxiologic Inquiries] wandering through a bookshop. Before that, I had, indeed, always been passionately curious about how the human had originated. But this curiosity was at the margins of my future projects. Studies in Prehistory were not at all what I wanted. Bones and hatchets would not connect with my interests, no matter how much I read Darwin or Piaget. And as regards speculations à la Leroi-Gourhan, I was very clear that this would never be my route. As a result, I had decided to devote myself to History of Philosophy and leave my old passion to the side. But then I came across Sánchez de Zavala’s book. Language is not limited to communicating thought, but transforms it. This transformation (from unarticulated to articulated thought) would be, thus, the essential human cognitive characteristic, and, at the same time, would be something whose origins could be sought in language. How did language begin to be articulated or compositional? At last my old question acquired a concrete form. This is the principal debt I owe Sánchez de Zavala. However, the second debt is nothing less than all the bibliographic help I needed to get beyond the absolute ignorance in which I found myself at that time. The Sobre el lenguaje de los antropoides anthology [On the language of anthropoids], the two volumes of Semántica y Sintaxis [Semantics and Syntax] on generative grammar: what would I have done without these? The least of my problems was the lack of recent foreign books in the libraries. What was really serious is that without Zavala I would have had no idea what to ask for. But it was not just the anthologies. He also put me on the trail of Bühler, and of Vygotsky and Luria. And, even more profoundly, his books set before my eyes a concrete example of how to do science. Before this I had the unwavering ideal of wanting to do science, but this was in the abstract. It was with Sánchez de Zavala’s style that this took on a working style. Whenever I reflect on this I realise more and more how much I owe him. I would like to toast his memory. Of course, there are many reasons why I want this work of mine to be good. But one of them has to do with this. Given that we are talking about the 1970s, I realise that I owe the reader a certain clarification. Why did I say that this book is based on the advances of the last twenty years when I have just now acknowledged that reading I did thirty-odd years ago was crucial for me? These statements do not contradict each other. In the last twenty years, I have found some marvellously useful data. But earlier, in my far-off youth, I had found the reformulation of the questions that motivate me still.

20.5.2 Echo questions, choral holophrases, Saussurean signs: Three different modes of imitation

In every type of echo question, the repetition of the message received would cause the content to be devalued. What in the model message was request, order, call, exclamation, is now nothing of the sort. What is the cause of this general result? Why does the repetition of the interlocutor’s message always have this consequence? In an echo question, a model is imitated openly. But the key cannot lie in this. Any word pronounced is always an imitation of the social model. Simple imitation, therefore, does not help us. The type of imitation that interests us here has a specific feature, namely that it imitates a given speech act as such. Even with this, we still have not sufficiently narrowed down our description. We should remember choral holophrases. They, too, are the imitation of a speech act that has just been heard. Nevertheless, in the choral holophrase the force of the model message is preserved intact. In them, the child really is calling the cat or saying good-bye to her grandmother. What is the difference between the imitation involved in the choral holophrase and the imitation involved in echo questioning? The adult message which the choral holophrase repeats is never addressed to the child, but, as we have seen, to the cat or the grandmother, that is, to elements which are different from the choral speakers. (In 12.3, we saw why it would be impossible for children at this age to address to the adult the repetition of the speech act which the adult had addressed to her. The child, who is still too early in her articulatory-phonetic learning, would not yet have reached vocal ‘Saussurean parity’.) In contrast, echo questions are addressed just to the speaker of the model message. For the producer of the echo question, unlike the child of the
choral holophrase, the original message is a content radically and intrinsically different from his own contents – a content that is addressed to him. It is precisely for this reason that the devaluation we noted in all echo questioning takes place. In the child’s choral holophrase, the model’s communicative force is not lost at all. But for the production of contents radically different from one’s own contents, things are very different. I do not order; the other was ordering. I do not call; the other was calling. This devaluation (or loss of communicative force from the model message) occurs in any echo question. It is true that the content radically different from one’s own contents which appears there will be the ideal place for the perception of false beliefs of others to first appear. However, this perception would be, I repeat, exclusive to a particular type of echo questions. On the other hand, the loss of communicative force from the original message is a more general feature of echo questions. This loss would be simply the other side of the coin of the perception of a radically and intrinsically different self – of the self which the speaker of the echo question perceives in the model she is copying. It could be argued that when an ‘Out!, Go!’ is answered with another ‘Out!, Go!’, there would also be an echoing and reciprocal act of speech and there would be, however, communicative strength. But I deny that there is indeed an echoing act of speech. It is the vocal Saussurean parity which is intervening there, which allows ‘reception in production-format’ to nevertheless be a true reception and therefore very different from one’s own productions which on other occasions (or, as here, immediately afterwards) will take place: See above 6.2 and 6.3.3; see also 9.1.3 or 12.2.
Thus, in the second ‘Out’ its producer does not intend at all to repeat in the shape of an echo the message received, but is instead choosing, amongst the linguistic meanings that he already possesses, the one which is most appropriate for his conative communicative need. On the other hand, with the echoing interrogation, the speaker intends to repeat the message received. See above 13.4 and Chapter 13, note 6, p. 211, and also an ‘objection-and-reply’ (“the main objection”) included in 18.10. Let’s sum up. Certainly, echoing interrogation and the initial acquisition of Saussurean parity would rest on the same resource, that is, the second mental centre which succeeds in receiving in production-format the messages distally heard. However, once language has been acquired, the use of words no longer has to consciously resort to messages from other speakers in which the word in question is used.

20.5.3 Echo questions on the auditory and real-world levels

Repetition of the speech act of the interlocutor, and loss of original communicative force: thus we have characterised echo questioning. But echo questioning, one might say, does not consist only of a repetition of the message of the interlocutor, but is also, as its name suggests, a question. This is true. What relationship is there between echo questioning and other questions? This is what I shall explore in this subsection.

Let us start by observing that in every question there must occur an interpersonal contrast between two contents and a distancing from one of them. A piece of wisdom contrasts with a cognitive resource which is lacking, and faced with this contrast the speaker adopts the wise knowledge as a standard, and distances herself, therefore, from the cognitive resources lacking. These features that characterise the attitude of questioning are also present in echo questioning. Let us study how these features occur there. What is the content from which one distances oneself in echo questions? Is it one’s own or the content of the interlocutor? Or, inverting the question, which of these contents is set up as the standard? This criterion allows us to trace a frontier within the set of echo questions. When an echo question concerns the auditory level, then the object of the distancing is the questioner’s own content. The standard to which one must keep is in this case the message of the interlocutor. In contrast, when the questioning impacts on the level of signified realities, then the content of the interlocutor is the object of the distancing and which is placed in question. The standard in this case is the world (the reality) itself, which for the speaker, that is, for the producer of the echo question, cannot be separated from his own perceptions and beliefs. Therefore, if the speaker of the echo question adds something further, that is, if she adds to her message a second part that no longer repeats any model, then this will indicate that the level involved in this echo question is not the auditory level. On the auditory level, the speaker of the echo question has no authority, and has to wait for further information. As a result, the method of signalling which level the questioning focuses on in each case, this most useful of communicative resources, would have been very easy to find. 
We have said that, if there is a second part, then the echo questioning concerns real-world realities. In this type of echo questioning, the standard comes from the producer of the questioning. The producer is here wiser than his interlocutor. As you can see, we are heading along the road towards predicative communication. Yet, we had already suggested this, the junction from echo questioning to predication (see 20.5.1).7 Now we shall search for something else: can normal questions perhaps be derived from echo questioning?

7. Regarding the other type (auditory level, distrust of the speaker’s own content), it is worth underlining that in Chapter 18 we saw a particularly interesting case of this type of echo interrogation – the repetition as an echo of speech episodes of the interlocutor that were merely expressive or emotional discharge.

20.5.4 From echo questioning to normal questions

We have seen that normal questions require syntax, and a particularly sophisticated syntax when they are partial questions. It is clear, in contrast, that echo questioning could have appeared already in a holophrastic language. In light of such a contrast, the
idea of that derivation becomes more and more attractive. However, is that derivation possible, I ask once more, or is it not? Let us try to see generally, in the set of all questions, where the distancing from content of the interlocutor, or, in other words, the adoption of own content as the standard, would be found. Firstly, in echo questions which address the level of real-world realities (within this type of question, the simplest, and undoubtedly most primitive, cases, as we saw earlier, would be the ones which repeat a speech act of request or call). Secondly, it would be found specifically in those total questions that, if they do not already have the form of an echo question, could adopt it. After hearing ‘They have arrived’, a questioner may say ‘Are you sure?’, or may even echo ‘They’ve arrived?’; in either form she will be distancing herself from, and distrusting, the content of the interlocutor. On the other hand, in partial questions, we saw that the speaker always distances herself from her insufficient own content and prefers the presumably wiser content of the interlocutor. Thus, we can point out that, while the form of the echo is adopted on occasions for total questions, this is impossible for partial questions. The latter, with their need to profile what is unknown, rest (as we saw above) on a resource which is too specialised to have already occurred in a previous non-interrogative message.

Let us sum this up. Echo question on the auditory level: distrust of the speaker’s own content. Echo question on the real-world level: distrust of the content of the interlocutor. Normal question, that is, non-echo, and not paraphrasable as an echo question: real-world level, and distrust of the speaker’s own content. In light of this picture, what about our project of deriving normal questioning from holophrastic echo questioning? We have to acknowledge that the route for such derivation does not appear at all clear. Let us explore this.
Non-echo questions have two features. On the one hand, they ask about real-world realities. On the other, they trust in the hearer’s mental content and not one’s own. As we can see, these two characteristics never occur together in the same echo question. In echo questions, the real-world level always goes together with distrust of the hearer’s mental content. And equally, distrust of own content only occurs in auditory level echo questions. Will it then be impossible to derive normal questions from echo questions? For the moment, let us allow the difficulty of the presumed step toward normal questions to take root in us. Normal questions have to be able to express the distancing of the speaker from her own current belief. About the perception both of beliefs of others and of own past beliefs, we have written much in this book. But now it is a question of perceiving and communicating the mental character (that is, the insufficiency) of one’s own current belief. In normal questioning, the speaker has to communicate that his own current belief is insufficient for him. If we are interested in simply having beliefs or having a mind, then own current belief is, clearly, a primary animal mental state. At the beginning, reality and own belief or perception are the same thing for the subject. However, if we are asking about the perception of mental states as mental states, then own current belief is the most difficult to perceive. The uncoupling of own current belief and reality is a genuine feat.

Metamental states or second-order mental states are always more accessible when one moves outside one’s own, and therefore unnoticed, point of view.8 The enormous difficulty entailed by normal questioning has thus come into full view. Would we have to renounce our cherished derivation project, then? Would we have to say that echo questioning would be crucial only for the genesis of predication, and not for the genesis of normal questions? Let us not get ahead of ourselves. We should remember that there is a type of echo questioning where content of the interlocutor is adopted as the standard. This is, as we have already seen, echo questioning which focuses on the auditory level itself. On this level, naturally, the producer of the echo question gives up all authority. This particular type of echo question makes, thus, the same choice (to distance itself from own content and to long for the content of the interlocutor) for which all non-echo questions opt. In this sense, echo questioning on the auditory level seems to us to be similar to non-echo questions. Might there be anything significant in this similarity? This is what we will suggest. But we shall begin by insisting on the difference. Note how different this type of echo question appears compared to normal questions. The echo question is repeating, it goes without saying, a message of the interlocutor. In addition, the reality this echo question addresses is the message of the interlocutor specifically on its auditory and not its real-world level. Neither of these two characteristics occurs in normal questions. In light of these strong differences, we might feel inclined to give up on that similarity and to adopt, in contrast, the idea that, among echo questions, those which address the real-world or referential level would be the most similar – at least a little more similar – to normal questions. But we should remember what occurred with the echo questions that addressed real-world realities.
Those questions ended up becoming the first part of a predication. Regarding the real-world realities, the producer of the echo question felt she had the authority and devoted herself to adding or completing the content of the interlocutor. As a result, in this regard, the echo question on the real-world level is absolutely not the most relatable to normal questions. Do we go back then to rethinking the similarity between echo question on the auditory level and non-echo questions? Would there be any way to derive the latter from the former?

Let us imagine that, at a given moment, questioning intonation (which would have been optimised for the only type of question then existent, that is, for echo questions9) is used without any repetition of messages. This was an unprecedented anomaly. But let us push on. That was not an echo question; however, given that only that particular idea of questioning was available, it would have had to be asked which of the two available kinds of echo question, the auditory level or real-world level, would be the most appropriate interpretation. There was a criterion for this, as we have already said. If the speaker of this interrogatively intoned production does not add a second part, then he is not pointing to himself as wise, and therefore he is addressing the auditory level of the message of the interlocutor. However, if there is no previous message of the interlocutor, how could the articulation of it become the standard? How could something that does not exist be operating as the standard? Indignant protest comes pouring out: this interpretation contradicts the lack of previous message. But does it really? Might this not be as absurd as it seems? Would it be possible to interpret this questioning intonation as addressing the auditory level and not the real-world one, even if there is no previous message? The speaker would then be asking about whether her own message was adapting itself to the message she takes as standard. By interpreting this production in this way, the hearers would be pushed into supplying the standard message to which implicit allusion had been made. Here we may think of “metarepresentations of desirable utterances”, in Sperber & Wilson (1986), and especially in Wilson (2000). We might think, similarly, of children’s protodeclarative holophrase. It should be noted that, when a child pronounces its protodeclarative, it believes the adult is wise on both levels, the level of signifiers and also the level of realities: See Southgate et al. (2007). This makes children’s protodeclarative a privileged precedent also in the sphere occupying us in this paragraph. But let us return to our argument. The suggestion is that this could be the origin of typical (i.e. non-echo) interrogative communication. Through such a detour a method of communicating intrapersonal (not interpersonal) distrust about current real-world beliefs would have ended up occurring. In primitive echo questions, the contrast with the real-world reality was possible only for the content of the interlocutor. Now, at the moment of transition, that is, in our anomalous echo question which has no prior message, the own message is the one which, as the object of distancing and distrust, contrasts with the standard supposed by the expected messages of the interlocutor. What has led me to this suggestion? The key lies in my idea that at that time there would still not be sufficient syntax to express distancing from one’s own current belief about the world (in other words, to express that one is viewing one’s own current belief as a mental state, and not as reality itself). There would still not be any resources available for a speaker to point to a separation between real-world realities and the insufficient form in which she knows them. In our full language we have, as we saw above, a resource as sophisticated as the partial questions. However, could this type of resource have been available at the beginning? I do not think this would be the case.

8. In Bejarano (2003), I devoted some paragraphs to this. Remember what social psychologists call ‘fundamental attribution error’ (“we see our own acts as determined by circumstances, while seeing the acts of others as stemming from traits of character”). Certainly, small children and superior animals have internal states of belief. However, the ‘concept of belief’, i.e., the metabelief, would take a long time to be applied to their own standing beliefs. (See infra, 21.1)

9. “In animals, the degree of perceived dominance is inversely correlated with voice F0. Likewise, this accounts for well-documented common cross-linguistic pattern (some would say “universal”) that questions, uncertainty, and pleading are marked by a high F0 somewhere in the utterance (when these intentions are not indicated by lexical, morphological, or syntactic means).” Ohala (2010).

This suggested route of access into the distancing characteristic of our typical questions, that is, the distancing from one’s own knowledge about the world, may at
first seem monstrously over-elaborate. I am aware of this. Consider, on the one hand, this complex and difficult interrogative focus on the own message (is my present message the same as the one you would produce?), and on the other, the natural way of asking about things simply and directly. With our fully established language, we have an immediate reaction. Of course the means of asking about things will be simpler and more direct. The derivation of the normal questioning function from echo questions seems too abstruse a suggestion for us. However, perhaps we are too imbued with all the sophisticated resources that the development of language has placed at our disposal. Perhaps our language prevents us judging appropriately the difficulties a human mind without language (or without syntactic language) would find. By having insisted on this caveat, these last few paragraphs have finally borne some clearly useful fruit. If they have served no other purpose, they have pushed us toward remembering the threat of epistemic mistake that hangs over us, human beings today, when we come to focus on pre-linguistic points of view.

20.6 And what of syntactic subordination?

Echo questions, we have just seen, have no need of syntax. However, in contrast, other questions, both partial and total, would be based, necessarily, on syntax. The communicative functions intrinsically linked to syntax would thus be predications and part of questions. Having addressed both of these, it is the moment to acknowledge that we have only addressed simple sentences up to now. However, we will not be talking about language, properly speaking, until we consider complex syntax. As a result, although it is only by allusion, we will have to introduce this issue into the picture. Our immediate task will be to address indirect-style reported speech. These reported messages would be, I suggest, the location of the historical genesis of complex or subordination syntax. We will deal with all this in the next chapter.

chapter 21

Toward complex syntax
The crucial role of reported speech

21.1 The beliefs of others are not only perceived but also explicitated: The first fruit of reported speech

Reported messages, a complex type of metaspeech, open up a stage in the path of cognition. Certainly, reported speech (‘John believes that the cat is in its basket’) is not, in my view, the origin of the perception of beliefs of others. This has already been stated above. Contrary to what is hypothesised by several authors, such as, for example, de Villiers & de Villiers (1999), understanding that complex syntax would not be necessary to access John’s belief. In order to perceive false beliefs of others, it is enough to hear ‘The cat is in its basket’ while the basket can be seen to be empty. In this aspect, praise for reported speech has been excessive. Nevertheless, in other different aspects, the (indirect) reported speech would have had an enormous influence on cognitive-linguistic growth. This is the first of those aspects: Indirect-style reported speech, although not responsible for the perception of such beliefs, is indeed responsible for their explicitation. Statements appear, in principle, to be intended to speak about the external reality, or, more exactly, about the external reality such as the speaker conceives it. However, the syntactic complexity and the interpersonal detour entailed by the reported speech create a statement that seeks to reflect a mere cognitive state: ‘John said that p’, or ‘Mary thinks that p’. As we have just seen with the sentence ‘The cat is in its basket’, the primary access to false beliefs of others is in the reception of that sentence. As a result, the knowledge of such beliefs is originally explicitated as something said by someone: ‘John said that p’. Here, with the verb ‘say’, the cognitive state is now perceived as a cognitive state.
And as a result, the complete statement can have truth value even though the subordinate may be false: ‘The ancients used to say the Earth was flat.’ But with the verb ‘say’, the speaker’s attitude to the subordinate content is not completely explicitated. In order to explicitate it, the speaker has two verbs (‘know’ and ‘think’) from which to choose. If she chooses ‘John knows’, she will be showing solidarity with John, and as a result there is no proper explicitation of a cognitive state in such cases. For the speaker, what she thinks is not actually a belief but reality as it is, and so too will be any belief of others which coincides with her own. For the speaker, then, John does not have beliefs or cognitive states, but contact with reality itself. This has been formulated as a principle

of Social Psychology for some time now. The appeal to another person’s mental characteristics appears only when the beliefs (or any other type of attitude) of that other person do not coincide with one’s own. In contrast, when they do coincide, the subject does not call on the mind at all, that is, she does not appear to be aware of it, since in the beliefs which coincide between herself and the other she sees only the reflection of reality itself just as it is. In ‘John knows that p’, there is, therefore, no explicitation of cognitive states, despite the formal similarity between that sentence and ‘John says that p’ or ‘John thinks that p’. As before, the relationship between the truth value of the whole and of the part is an indicative criterion. In ‘John knows that p’, truth value cannot be assigned to the whole if it is not also assigned to p. In contrast, with ‘think’ in the third person, the explicitation of thoughts or cognitive states as such has been achieved: there, and earlier in the indirect reported speech with ‘say’, the oratio obliqua appears, if we wish to employ the classic term. Of course, it will then be possible to nominalise this thought. And thus it will be possible to comment on it as that thought: ‘The thought that p is debatable/or false/or true’. Very similar to this is the resource of, first, describing a situation in detail, and then, only then, adding ‘This is not yet real. Let us work to bring it about/or to avoid it’. (In this latter, shocking mode, the detailed descriptive statements will permit a more complete affective evaluation of such a goal. So, as shocking as this way of describing something may be, there may, however, be good reasons for speakers to use it). 
Another task which depends on the use of thought merely as thought is the preparation of different plans: ‘If ‘the thought that p’ is true (that is, ‘if p’), then we shall act according to one plan’; ‘if ‘the thought that p’ is false (that is, ‘if not p’), then we shall act according to another plan’. The latter has been noted on many occasions, from the ‘symbolic trial and error’ of the behaviourists, to the ‘hypothetical simulation’ of ‘Theory of Mind’ studies. The explicitation of thoughts as such is evidently an enormous cognitive gain. I would suggest that subordination became necessary only with reported speech. The other relationships that are nowadays expressed with subordinate sentences could at first have been understood without the resource of a complex syntax. In initial language, the hearer would merely conjecture the correct relationship between two adjoining sentences, that is, communication would have been achieved without subordination. Using Sperber & Wilson’s (1986) terminology, we might say that optimisation, or at least sufficient satisfaction, of relevance would have been enough. A sequence of two sentences normally succeeds in conveying the speaker’s mental process, that is, the reason she has decided to say these sentences and in this order. A narrative sequence, however much it may appear to obey the facts, can never be explained as simply a mirror of the sequence of the facts. As a result, the very smallest of sequences of sentences would be obeying one of the producer’s mental processes, and would also be helping to reveal this mental process to the audience. By contrast, the distinction between reported false beliefs and real information cannot be left in the hands of the hearer and his optimisation of relevance. Thus,

subordination would arise with reported speech. Later, with progressive grammaticalisation, other types of subordinates would emerge, and the formal differences between the different types became more clearly developed. However, the explicitation of beliefs of others, which we have had to discuss in other chapters (in 16.4.2, especially) in order to contrast it with the perception of beliefs of others, will remain almost at the margins in the following pages. What will occupy us is the breakdown of deixis. This breakdown (the second cognitive influence of the reported speech) deserves to be brought out of the oblivion in which the present work has kept it until now.

21.2 Deictic derivatives

In order to be understood, the deixis in the original message needed the speaker, place and moment that corresponded to that original message. When this original message becomes a reported message wrapped syntactically in a different message, the original deictics are therefore no longer suitable. Their substitutes have been studied for a long time. The heart of the substitutionary resource is known in recent cognitive semantics as ‘allocentrism’ (Levinson [1996] and Levelt [1989], among many others); Benveniste (1956) alluded to the same thing, although without employing any new terminology: “When, rather than referring to the instance of discourse, they refer to ‘real’ objects, to ‘historical’ times and places, the terms change. ‘I’, ‘here’, ‘today’, ‘three days ago’, become ‘he’, ‘there’, ‘that day’, ‘three days earlier’”. Cf. Sutton (2009, p. 233), who revisits the point that mature autobiographical memory requires us to coordinate and align egocentric and objective conceptions of time. Two different systems of co-ordinates come into play here. The centre of the original system of co-ordinates (the ‘I, here, now’ of the original speaker) has to be relocated in the new system of co-ordinates, and become ‘John’, ‘there’, ‘then’. But although it has to be relocated and reformulated in this way, it does not lose its status as the centre for all specific designations of the subordinated message. ‘Near there’, ‘the day before’, ‘three years later’ are the ‘deictic derivatives’ which translate the deictics ‘near’, ‘yesterday’, and ‘in three years’ of the original message. Derivatives allow an objectivity and an explicitation which were totally alien to deixis, which was understood only by those who had access to the circumstances of the speech act. 
In contrast, ‘indirect reported speech’ now requires the creation of a new system for it to touch down in reality, a system in which it is possible to operate independently of the situation of the speech act. The ‘synsemanticity’ that Bühler linked so closely to writing would have appeared in oral language with the reported speech. As you can see, I am insisting that language would not have created deictic derivatives if it had not been for the indirect reported speech. Only when the need arose to relate to an audience the messages heard from an earlier speaker would the formidable communicative and cognitive resource of deictic derivatives have appeared. This is

certainly my view of the origin of deictic derivatives. However, it must be added that, once they had been established, the potential of deictic derivatives would quickly be exploited outside reported messages. This is precisely evidence of the worth and usefulness of deictic derivatives. It should be noted, for example, how easily I can designate a specific day by saying ‘the day before my wedding’, and how difficult, in contrast, it would be to calculate years, months and days to make this designation using a deictic. We shall address this point in more detail a little later. For now, we will be content to have made mention of it. It is more pressing to define and systematise this question a little at this point. There is a deictic derivative for every deictic. To compare one with the other, allow me to bring in the general-knowledge description of both types of resources. Let us begin with deixis. In their field of application, deictics are magnificent communicative resources. They are the great trick, we might say, through which every specific object in the world can be designated univocally and without us needing to have memorised a proper noun for each one of those objects. It is clear that our brain, even though it successfully masters a large number of class terms (tree, poplar, leaf, for example) would not cope with learning a different label for each specific example of poplar, or for each leaf. But, in spite of this inability, what may interest human beings (interest their attention and their desire to communicate) cannot be limited in advance. Deixis has, thus, continually to appear in language. It is only in the (highly specialised and sophisticated) area of metalanguage (whether metalanguage properly speaking, or its encyclopaedic-scientific version) that speech stops dealing with specific objects and moves on to do so around concepts. As a result, everything outside of metalanguage will have to be absolutely impregnated with deixis. 
The zero point of the system of co-ordinates is the ‘I as speaker of this specific speech act’ (that is, the ‘I – here – now’). This ‘I’ does much more than designate the speaker. This centre of the system of co-ordinates is necessary in every message (nonmetalinguistic, that is), however much this message neither refers to the speaker nor has to designate her. As a result of the establishment of the co-ordinate zero point, any specific point in time and space can be designated. This is the great function played by the ‘I’ in language. Of course, this feature of the co-ordinate zero point is shared by linguistic spatial deixis and animal perception. I am not, however, conflating one with the other. As has been repeated ad nauseam throughout this book, I am of the view that pre-linguistic thought would not be a mental syntactic composition but a unit whose parts lack all attentional independence. There is, in my view, no independently-addressed feature in pre-linguistic perception which corresponds to ‘near’, or ‘far’, or ‘up’... In addition, the zero point of linguistic deixis has to be differentiated from the ‘I’ designation. The ‘I’ serves to designate the speaker (in any space and any place, and not necessarily in those of the speech act). Certainly, the function played by the ‘I’ in those messages that take the speaker as part of their content, is, of course, important. Although individuals do have their own name, unlike what occurs for each leaf of each

tree of each forest, the designation provided by ‘I’ goes much further (in a specific direction) than any other means of designating the same individual. As was shown some time ago, the ‘I’ is perfect for this function, and cannot in any way be paraphrased exactly and totally. However, the really important role of the ‘I’ in the statement is not this, but to act as the zero point for co-ordinates. Without the zero point, i.e. without the ‘I-here-now’, the omnipresent feature of language that is deixis could not be established. Deixis will be able to distance itself from this co-ordinate zero point as much as it wishes. The distance from the zero point can be as large as is wanted or needed. ‘4763 years ago’ is clearly a deictic: in order to understand what someone is referring to it is necessary to know when she is speaking. This criterion (the need to know who is speaking, and when or where) is always the touchstone. For this reason, a vocative ‘Dad!’, and similarly, the person and time verbal morphemes (of the absolute past, present and future tenses) would be deictics. Moving on finally to deictic derivatives, let us list the features derived from allocentrism that differentiate deictic derivatives from deictics. The first is the greater syntactic complexity of deictic derivatives. Since the centre in deictic derivatives does not coincide with the current reality, this centre has to be explicitated linguistically, and will appear as a syntactic dependent of the deictic derivative. We might point out here the parallel between the syntactic complexity of indirect reported speech and the syntactic complexity of deictic derivatives. In both cases, one element can govern a same-level element. There is recursivity in ‘(Sentence: John said) that (Sentence: the car was not working)’ and similarly in ‘(Noun: the father) of (Noun: Mary)’. The speech of John himself (‘The car is not working’) or Mary herself (‘Dad!’) was simple in comparison to the respective derivations. 
This mechanism would thus be an advanced stage in the derivation of recursivity. And, just as was suggested in 13.4 regarding the genesis of the most primary recursivity, interpersonality (in this case the introduction of the ‘third person’) would be the key here also. Thus, in the origin of deictic derivatives there would be the quoting of a word uttered by someone plus the remark about the identity of that original speaker. Mary says ‘Near’, and the second speaker (the speaker of the referred speech) says ‘near’ of Mary (i.e., the ‘near’ said by Mary). Or, using another example, ‘Dad!’ (that is, the word said by Mary herself) is repeated and ascribed to Mary. The original vocative, as a result of being repeated by another speaker, would have lost its communicative force and become a general concept with a compulsory syntactic link (the father of _). Let us compare this with what Tallerman et al. (2009, p. 142) suggest: “If we seek an evolutionary origin for recursion in human language, it is possible that we are really asking about the origin of recursive semantics (John’s mother’s brother’s lover’s chauffeur), which may in turn rest upon intrinsically recursive concepts, such as kinship relations” (recall 19.3 above). That attempt at deriving recursivity (and syntactic links) from prelinguistic concepts does not convince me. The reader already knows my hypotheses about recursivity. Firstly, the repetition of a linguistic message uttered by the conversation partner, that is, by the second person, would have given rise to the original

recursivity (a message inside a message) and, consequently, to the original syntactical distinction of thema/rhema. Secondly, the repetition of a message of a third person would have given rise not only to syntactic subordination, but also to the basic scheme of deictic derivatives – N (...N (N)): John’s mother’s brother –, which can be used outside the ‘reported speech’. But let us continue our analysis of deictic derivatives. As a second characteristic, we might point out that meaning in deictic derivatives, although similar to that of the corresponding deictic, is less experiential and immediate in character: we might speak about an emptying of meaning. Let us think of a conversation at the foot of the Giralda (the Giralda is the bell tower of Seville cathedral). “Can anyone see our friends who have gone up?” “I can see them now. They are just under the belfry”. Certainly, the meaning of the deictic derivative ‘under the belfry’ has affinity with the meaning of the deictic ‘down there’. But it is also true, and this is what interests us here, that ‘under the belfry’ points to a place high above the speaker and her hearers. While the meaning of the deixis is linked to the speaker’s bodyscheme, this link and these experiential roots have disappeared in the deictic derivative. On many occasions, as in the example, this uprooting ends up as an inversion of the meaning. Correspondingly, we can say that deictic derivatives are a typical case of grammaticalisation. Let us take ‘on top’ as an example. This would have moved from the deictic phase (i.e. from the adverb ‘on top’) to the deictic derivative phase (the prepositional locution ‘on top of ’).1 The grammaticalisation rule is absolutely still adhered to here: the less the strength of the meaning, the greater the syntactic complexity, or, what is just the same, the less the syntactic complexity, the greater the strength of the meaning. 
The third feature, derived from the other two, is the greater difficulty of comprehension. While children understand deixis with great ease from very early on, deictic derivatives, in contrast, entail a task that is not mastered until approximately the age of seven. It is true that, in everyday speech, children fluently understand instructions such as ‘Put the keys on top of the table’, or ‘Put your shoes away underneath your bed’. But it is certainly acquired habits, knowledge about the way things are, and other skills of this kind that are operating here. This, at least, is what appears to emerge from the well-known inability of 5 or 6-year-olds to carry out instructions such as ‘Write the date underneath your name’, or ‘Write your name above the date’. Any teacher knows

1. In Spanish, this example would also adhere to the cyclical repetition of phases (‘en la cima’ (Eng: at the summit) – ‘encima’ (Eng: on top) – ‘encima de’ (Eng: on top of)), which is the argument normally put forward against the enthusiasts of grammaticalisation (see, e.g., Newmeyer, 2006). How can grammaticalisation, it is said, be a clue to the origins of language if it is so often the case that the supposedly less grammaticalised, or, in other words, fuller, word (as, here, the adverb ‘encima’) derives etymologically from a totally syncategorematic one (namely, the preposition ‘en’ in ‘en la cima’)? To this I would reply that my purpose is to explain genetically the derivation that leads from deixis to deictic derivatives, or from simple to complex syntax. In this respect, the cyclical repetition of phases is not an obstacle.

that, with instructions such as these, children at this age would only understand that they have to write their name and the date, and that one thing has to be on top and the other underneath, but not which goes on top and which underneath. But this is the case not only with children. Imagine we give an adult uninterrupted instructions of the kind ‘Draw a square to the right of a circle’, ‘Draw a triangle under a square’, and so on, again, and again, and again. Within a very short time, the adult will be extremely tired, even though this adult might have been able to listen to a conference paper many minutes longer in duration. There are, of course, factors that increase the difficulty in this type of instructions. These factors are, on the one hand, the non-human and non-living nature of the chosen centre of reference, and, on the other, the fact that this centre has not yet been drawn or written down, and, therefore, is still excluded, not only from the field of perception, but even from absent realities. But in children, I repeat, it is possible to verify the difficulty of every deictic derivative compared to the easiness in understanding the corresponding deixis. If we review the different kinds of deictic derivative, we will be able to observe the three characteristic features mentioned. The derivation from ‘Dad!’ is ‘the father of N’ (remember how in descriptions by anthropologists the family structure focuses on an ego, which can be occupied later by any proper noun). Regarding the difference in difficulty, we have all experienced what Piaget (1962) pointed out, namely, how difficult small children find it to understand that their father is the son of their grandfather, for example. For their part, the absolute verb tenses, which are focused on the ‘now’ of the speech act, become relative (pluperfect, future perfect, in Spanish) when they come to depend on a linguistically explicitated centre which does not coincide with the moment of speaking. 
As long as the moment of the speech act is the current reality and therefore does not need to be explicitated, any other different moment will have to be explicitated. Between the ease of the egocentrism of deixis and the extreme difficulty of the deictic derivatives just mentioned, there are two intermediate points. An examination of each intermediate point will help us in analysing the tasks of this entire cognitive arc. The first corresponds to the understanding of deictics ‘which cannot be repeated as an echo’. Certainly, this is a question we addressed in 6.4, when we tried to explain the great achievement of parity of meaning between speaker and hearer. However, we shall now review it from another perspective. In fact, this is precisely what we shall do throughout this chapter – situate extremely well-known facts within our general outlook.

21.3 Second-person allocentrism: An intermediate milestone which occurs in both consequences of the reported message

We have stated that an indirect reported message would give rise to two new developments. The first is that the subordinated statement does not refer immediately to things it is about, but only to the speech or thought of the subject of the verb ‘say’ or ‘think’.

The second is the deictic derivative that comes to replace the deixis of the original speaker’s message. These two new developments are, I repeat, linked by their common origin. But we can now highlight a new analogy between them. To do this, we shall focus on what preceded the establishment of reported speech. Remember what we said about the origin of the perception of beliefs of others. Although we can certainly perceive these when we hear a reported speech that informs us about them in the third person, they would originally have been perceived much more easily and immediately. A recipient needs only to hear a predication with which she does not agree in order to perceive, without needing anything else, the false belief of the speaker. This false belief is not yet explicitated there as a belief of others. The explicitation does not occur until the end of the journey, that is, until it becomes a reported message, subordinate to ‘John said that’. But access to the false belief of others would already have occurred previously. This intermediate milestone (the perception of the ‘second-person’ false belief) must be assessed and highlighted, just as we have done in earlier chapters, as the reader will remember. What interests us at this point, however, is that we find exactly the same thing in the journey from deixis to deictic derivative. The intermediate milestone in this new journey is completely analogous to the one we have just described, and equally deserving of being valued and underlined. Up to now we have depicted deixis as being egocentric. The ‘I-here-now’ is the co-ordinate zero point. But, at this point, let us address the reception, and not the production, of deixis. The reception of time deictics entails no problem: the speaker’s ‘now’ is just the speaker’s ‘now’ (Benveniste, 1965). What happens with ‘here’ will depend on whether it is a ‘my here’ versus a ‘your there’, or if, on the other hand, it is a wide here in which both interlocutors fit. 
But what is certain is that the reception of the ‘I’ will lead us to pose ourselves a problem. We might call those problematic deictics, as was suggested in an earlier chapter, ‘deictics which cannot be repeated as an echo’ – which cannot be repeated, it is clear, if we wish their reference to remain the same. How does the recipient understand the ‘I’ that she hears? Perhaps it will help to set out once more what we saw in Chapter 6. We might think that the reception of the term ‘I’ would only require us to know the meaning the code assigns to the ‘I’. We might formulate this meaning as ‘Look at the person who is speaking’, an instruction which would only be a particular form of the instruction ‘Look at the circumstances (who, where, when) of the message’.2 According to this hypothesis, the semantic rule is applied and that is it. If we accept this, then egocentrism would not be involved at all when the ‘I’ is received, and there would thus be no problem with the recipient’s own egocentrism. However, I think this neutral

2. In this instruction, the dictionary (that is, the code) transfers its instructing function to the (circumstances of the) message. Remember the C(ode)/M(essage) space, corresponding to deixis, in Jakobson’s (1957) study on the four linguistic spaces where there is a complex relationship between code and message.

and disembodied semantic rule is not the one at work here. Let us look at ‘behind me’, ‘to my right’. These expressions deserve comment in several respects. Note first how, exceptionally, the I is explicitated as the co-ordinate zero point. Never, or only on very rare occasions, do we say ‘that street is quite far away from me’ instead of the shorter ‘that street is quite far away’. Why do we not add the ‘from me’? Obviously because the interlocutors are normally in the same place, and there is thus no possibility for the ‘far away’ to be misinterpreted. In contrast, what is behind or to the right of the speaker will not coincide with what is behind or to the right of the hearer if, as very often occurs, the interlocutors are facing each other. As a result, in order to prevent the hearer from interpreting these expressions according to her own body schema, expressions occur which have the form of a deictic derivative but which nevertheless are egocentric in the pure deictic sense. To conclude, by guarding against the dangers which the recipient’s egocentrism might entail, ‘behind me’ and similar expressions testify that egocentrism is present during reception. The second, and of course related, comment will oppose the neutral semantic rule which we were considering. Is it possible to correctly interpret the ‘behind’ or ‘to the right’ of the speaker whom I am facing, using just one rule? In order to avoid involving the body schema, the rule would have to be highly detailed. Think, too, that instead of facing one another (180°) the speakers may be at 73°, for example. But there is, in addition, the gesticulation that normally accompanies such expressions. If there is no choice but to include body schema in order to interpret this gesticulation, why would we make the semantics so diabolically complex? But if the recipient involves the body, then it is necessarily the body of the speaker that the recipient is using. 
There is thus allocentrism in the reception of some deictics; however, it is a second-person allocentrism, and not the third-person allocentrism we have seen in deictic derivatives or in the reported message. This allocentrism is much easier than the other. The fact that children come to master the reception of deictics much earlier, several years earlier, in fact, than they can produce or understand deictic derivatives cannot surprise us in the slightest. The speaker is present, alive and addressing the recipient (that is, the subject we are considering here). In contrast, it is possible that the centre of the deictic derivative has not yet been constructed, or that it is absent, and, even in the best of cases, it may be inert or will not be acting in the speech act. As has already been stated, this subsection has returned to a question we had already addressed in the chapter on Saussurean parity. Undoubtedly, we have repeated ourselves. Nevertheless, the reception of ‘deictics which cannot be repeated as an echo’ has been situated here as an intermediate landmark on the route toward deictic derivatives. In addition, we have also related this route with the route that would occur for beliefs of others. Both routes would travel from second person to the third.

21.4 Difficulties, advantages and consequences of the deictic derivative

21.4.1 Postposition of the centre versus linguistic platform

One of the factors responsible for the difficulty in the allocentric process involved in all deictic derivatives is one we have not yet mentioned. The point is that the term designating this ‘allocentre’ (the centre of reference both for speaker and for hearer) in many cases comes after the instruction (‘behind’, ‘before’, ‘under’, etc.) that appears as the first part of the deictic derivative. If we remember that, in many cases (as we saw in the example of the Giralda), the deictic derivative will require the immediate experiential meaning of the instruction to be inverted, the extent to which this postposition can complicate things will be obvious. In fact, until the centre is finally designated, the instruction will have to be held in memory, but held (this is the point) without having yet been understood as an instruction. Speakers have always intuitively understood that this factor is responsible for an enormous increase in difficulty for recipients. Remember the teacher of the six-year-old children. If he is a teacher with any experience at all, he will not give them instructions such as ‘Write your name under the date’. What will he do? He will turn to a simpler form and split the single instruction into two successive instructions. “Write the date. Have you done that now? You have? Then, now write your name underneath (/under it)”.3 The element of interest to us is, of course, this ‘underneath’ or ‘under it’. This element is neither a deictic nor a typical deictic derivative. It is not a deictic because the co-ordinate zero point, that is, the centre regarding which we have to understand the ‘under’, is in no way the teacher’s body (nor is it the body of the child recipient). But neither is it the classic ‘deictic derivative’. Why not? Certainly, the centre of reference (the allocentre, we might say) has been explicitated. 
However, this centre (‘the date’, in the example) is not subordinated to the deictic lexeme. Instead of being subordinated and subsequent to it, it has been explicitated in a different sentence spoken before the one containing the ‘underneath’ or the ‘under it’ (‘it’ is an anaphora standing for ‘the date’). What is the recipient’s task now? The recipient has to pay attention to, or evoke, the reality designated by the first sentence. When her attention is settled on this, she will be in a position to easily understand the particular instruction meant by the ‘underneath’. This instruction will be easily understood, of course, but only by a recipient whose attention is placed on ‘the date’. By thus paying attention to the reality designated by the first sentence, the recipient will begin to understand the ‘underneath’, not in relation to her real body, but in relation to what we might call her attentional location. The platform the first sentence provides to the recipient is, thus, the key to the facilitation achieved. As a result, we might call these intermediately-difficult elements

3. Is there any similarity between this and what Hopper (2008) says about ‘emergent serial verbs’? Certainly these are two very different issues. However, I find a slight analogy.

between the deictic and classic deictic derivative ‘deictic derivatives with linguistic platform’.4 Placing oneself imaginatively in a specific scene would be an ability the child had previously mastered. As a result, recourse to this ability may entail a great reduction in difficulty compared to the classic deictic derivative. We will find examples of this intermediate form at every step, if we wish them. “We were at the Garcías’ house the other day. It’s a lovely house. The bedrooms are upstairs”, “Last summer we were in Athens for 5 days. Afterwards, we went to Crete”, “To get to the station, you have to take the second turn on the right, and go straight on until you come to a square. There, turn left.” It should be noted that in indications such as the previous one, about turns and squares, practically no one opts for the classic deictic derivative (that is, almost no one would say ‘you have to turn to the left of the fountain which is in a square at the end of the second turn on the right’). Speakers are guided in this case by an intuition similar to that of the experienced teacher. The hearer’s young age, in that case, and the complexity of the instruction, in this, advise (and almost coerce) the speakers to avoid the elements we have called classic deictic derivatives. This phenomenon, which we might call impregnation by the scene evoked in the previous sentence, occurs very widely in language. See, for example, ‘She was wearing a lovely coat. The buttons were highly original.’ What relationship is there between all

4. A slightly different use of ‘linguistic platform’ is the one that would appear in the message in ‘direct-style reported speech’. “Napoleon said, ‘I will enter Moscow before winter’.” This ‘I’ spoken by the narrator does not refer to him, but to Napoleon; the future tense does not have the narrator’s speech act, but the original speech act, as its centre. 
“Napoleon said” is a previous linguistic platform which involves the explicitation of everything which is present, but implicit, in the normal deictic. The linguistic platform was also, we have seen, the key to building an easy alternative to the typical deictic derivative. “Write the date. Underneath, the name.” “We have already seen the García’s new house. The bedrooms are upstairs.” In this case, just as in directstyle reported speech, the linguistic platform consists of an independent sentence prior to the one in which the deictic element is included. The similarity between the two platforms is clear. But we should also pay attention to the difference. The difference can be mentioned in an instant. In one case, there is quotation, or as Frege might put it, ‘signs of signs’; in the other, there is not. But let us comment further on this. The platform that facilitates the deictic derivative places a scene before the recipient’s imagination. The ‘Napoleon said’ platform does exactly the same thing. But the person in the first case who comes to fulfil the deictic instruction in the evoked scene is the recipient herself. Once the recipient is imaginatively in the García’s house, it is she who in this fictitious location will explore the ‘upstairs’. On the other hand, in the evoked scene of Napoleon’s speech act, it will not be the recipient but Napoleon who takes control. The recipient can succeed only in integrating herself imaginatively in the evoked scene, to imagine herself listening to Napoleon. Naturally, as soon as she achieves this, the recipient will apply her ever easy understanding of ‘deictics which cannot be repeated as an echo’. Thus, in the case of the message in ‘direct-style reported speech’, there would be a double moment in the task of the recipient. First, she will have to imagine herself in circumstances in time and space that are not her real current ones. 
Then, once she has imagined this scene, she will have to perform the easy second-person decentration which all recipients of any ‘deictic which cannot be repeated as an echo’ must perform.

 Becoming Human

these cases (the ‘Upstairs’, the ‘on the left’, or also ‘the buttons’ without the possessive) and anaphora? This comparison may turn out to be useful. Anaphora – let us begin by describing these – normally take the form of demonstrative deictics. When ‘this one’ or ‘that one’ operate as true deictics, they are referring to something more or less near the speaker’s ‘here’. On the other hand, when they become anaphora, they point to some element of the text (without it mattering if this is a written or oral text, and, if it were oral, without it mattering either if the element in question was pronounced by the same speaker or her interlocutor5). The demonstrative in this case would not point, therefore, to real space, but to the space that the text constitutes. However, although the type of space thus changes, one of the requirements that were indispensable for the demonstrative deictic is preserved in the anaphora. The space where the referents are to be situated will always have to be shared by speaker and hearer. Using an anaphora in front of a hearer who had not heard the previous statement would be absurd and analogous to using truly spatial deictics in a telephone conversation. Anaphora, therefore, depends on the previous linguistic platform. Is there then any difference between anaphora and our ‘deictic derivative with a linguistic platform’? We should remember the ‘upstairs’ or the ‘on the left’ which had to be understood against the background of the scene whose evocation by the recipient had been previously provoked by the speaker. Clearly, these meanings do not fit the definition of anaphora. However, the relationship between these meanings and anaphora is undeniable. Anaphora substitutes a previously designated element.6 Those meanings are, on 5. “In children, anaphoric pronouns most often refer to a discourse object previously mentioned by the child’s interlocutor. 
This suggests that the anaphoric value of pronouns is first acquired through dialogue before it is extended to monological uses”: Salazar-Orvig et al. (2009). Remember again the Vygotskyan ‘General Principle’. 6. This substitution should not be understood as the substitution of the term. The ‘donkeysentences’ proved clearly that anaphora cannot be understood as Benveniste (1956) still understands it, namely as merely a matter of articulatory saving. ‘If Smith has a donkey, he will beat it’. Here it is obvious that we cannot replace the second anaphora, that is, the ‘it’, with ‘a donkey’. Although in the protasis or conditional, the ‘a donkey’ remains completely indefinite, in the apodosis such indetermination of the donkey is now restricted by the need for it to belong to Smith. As a result, anaphora, more than having the simple function of saving syllables, also designates the sum of the previous term along with its specific syntactic role. My suspicion is that explanations that call on simple articulatory savings are never very accurate. In another regard, I also wish to add that incorporating that specific syntactic role within the meaning of the anaphora would not, in my view, be an exceptional fact. As has already been seen in an earlier chapter, links (those from speech episodes heard in far-away childhood just as much as, in the case of anaphora, those from the immediate textual past) are not an addition to the meaning but are part of the meaning itself, although, of course, the strength corresponding to each link varies from one use to another. Links from the immediate textual past are the ones that intervene in donkey sentences (as in the poetic refrain which gains continually in meaning with each verse, or – remember Chapter 20, note 3, p. 320 – in problem-solving.)

Chapter 21.â•‡ Toward complex syntax 

the other hand, an instruction to operate on the background provided by that element. They would thus be perhaps slightly more complex than anaphora. However, given the ease with which deictic instructions are carried out in real space, this additional complexity does not pose any substantial increase in difficulty.

21.4.2 Dispensable perhaps, but certainly highly useful: Deictic derivatives outside ‘indirect reported speech’

Each deictic has a corresponding deictic derivative. This has to be the case, because each of a message’s deictics will have to be included in the corresponding ‘indirect reported speech’. The change in the co-ordinate base always ensures that no point of the original message is lost. But the deictic derivative, although it probably originated for this function, that is, to make ‘indirect reported speech’ possible, can also be used outside reported speech. The example about the Giralda in the previous paragraph showed this use. The deictic derivative would probably never have originated for non-reported speech. However, once the resource of the deictic derivative had been created, it would have been possible for this resource to be employed in some functions which, independently, would certainly not have had sufficient strength to create it. Once the cannon has been invented, it can be employed, not only to knock down walls, but also to kill a mosquito: this is the omnipresent rule in the evolution of language. Let us attempt a description of when, on what type of occasions, the deictic derivative may be, if not strictly necessary, then useful for non-reported speech. Let us imagine that I, having married years ago, wish now to signal deictically the day before my wedding. I would have to do a complex calculation – 36 years, 2 months, 28 days ago. And, worse still, in that deictic designation, the relevance that led me to choose to speak about that day would be lost. For this reason, ‘the day before my wedding’ is an optimal solution in such non-reported speech. Naturally, it would translate very well the ‘yesterday’ that would have been said at the time of the wedding. But I do not need to be referring in reported speech to any speech act which occurred on the day of my wedding. I am simply designating a specific day that interests me.
As we have been stressing, the noun (or more generally, the element) ruled syntactically by deictic derivatives (whether these be ‘near Acapulco’, ‘the day before my wedding’, ‘Juanita’s father’, ‘they had already arrived when I called’) designates the point – spatial, temporal, or personal – from which it would be possible to employ the deictic corresponding to this deictic derivative: ‘near’ would be said in Acapulco; ‘yesterday’, at the moment of my marriage; ‘dad’, by Juanita; ‘they’ve arrived’, at the moment I called. Let us move on from ‘the day before my wedding’ to ‘three years after my wedding’. As you can see, this type of deictic derivative is the basic resource of the calendar. “An axial moment acts as the starting point for the computation. There are two directions (before.../after...) in relation to the axis of reference. The numbering system and some (astronomical) units are added to the above”: these are the three features Benveniste (1965) used to characterise calendars. The most fundamental difference lies in the

wide social acceptance of the centre chosen for the calendar. The birth of Christ, the Hegira of Mohammed, etc. are events known by more people than my wedding. Yet the system is the same, although, of course, with the calendar, or also with the system of longitude and latitude, we have a system that can generate an indefinite number of proper nouns. Or, more specifically, which can generate individual labels which, while they achieve even more synsemanticity (or independence from the situation) than some proper names with a capital letter, do not, however, need the intolerable expenditure of memory which would be necessary for the assignment of a proper noun with a capital letter to each day, or each point in the Pacific Ocean, which we might be interested in naming. We had already seen this for deictic derivatives in general. In these, the advantage of proper nouns (synsemanticity) converges with the advantage of deictics, namely, the ability to name any specific referent without needing to have learned an individual element to designate it. Deictic derivatives are, we might say, a true cognitive-linguistic gem. As a result the mechanism formed by deictic derivatives may be applied, not only outside the reported message, but also beyond deixis, i.e. although there is a deictic derivative for each deictic, the inverse statement is not true. Which are the cases for which the term ‘deictic derivative’ would be correct only if we understand it as describing the genesis of the general resource, and not of the specific element? My reply would be adjectival comparatives.

21.4.3 From vague adjectives to the pure dimension

Let us think of those adjectives which logicians call vague: “tall”, “short”, “young”, “old”... These meanings are enormously dependent on circumstances. (Cf. Vallée, 2010.) It is typical to refer to pairs such as “A young cardinal”/“A centre-forward who is not at all young”, “A tall girl”/“A short giraffe”. But these examples have been chosen precisely because we all understand very well the expectations operating here. Things can be much worse outside of pedagogic examples. In the application of all of this type of adjective, the criterion lies in the expectations, the canon, the standards accepted by the speaker for the case in question. If these expectations or standards are also those of the hearer, communication will be achieved without difficulty. But only in this case. The vagueness which takes over the meaning in these adjectives bears no comparison with the openness of the term ‘I’. ‘I’ can designate anyone, but it does so in accordance with a stipulated rule – ‘I’ is the speaker. On the other hand, for this type of adjective there is no stipulated linguistic rule – anything can be called short, anything can be called tall. The criterion according to which height, or heat or size, is evaluated is inside the mind of speakers. It is not only that it is not explicitated. It is not explicitated in deictics either, but the centre (the speaker in her time and place) does not for that reason cease to be objective. In contrast, the centre here is not only implicit, but is totally subjective. The term ‘sympractical’ is very useful here – Luria’s (1979) term for the type of language which needs a very high level of ‘affinity between interlocutors’ if it is to be understood. Vague adjectives are communicatively perfect when there is sufficient affinity between the speaker’s and the hearer’s subjectivities. This coincidence between the mental states which both apply may at times occur beforehand and gratuitously, but also, on other occasions, it may be the product of the speaker’s previous linguistic effort (or even talent) in order to achieve the correct contextualisation. Only in the first case would we be speaking about sympractical language. Certainly, vague adjectives could often, I repeat, achieve communicative success. However, these adjectives could undoubtedly improve in objectivity and make misunderstandings less likely. This was a challenge which language faced and overcame. The resource constituted by the deictic derivative, which, as we have seen, appeared in order to make possible the ‘indirect reported speech’, would have come, once it was constituted, to take responsibility for this new function. The comparative form of adjectives has the structure of the deictic derivatives and carries out the function of remedying the vagueness of the adjectives.7 Consider the structure, “taller than Juanita”. Here the features we have observed in the deictic derivatives are evident. Firstly, and fundamentally, the explicitation of the centre regarding which the derivative is operating. (We might even bring in here the description we used earlier – ‘Juanita’s father’: ‘Dad’ would be said by Juanita. Likewise, ‘Stronger than Juanita’: ‘Strong’ would be said by Juanita. Kinship names and comparatives would thus be on the same side of the table.8) Secondly, syntactic complexity: compare “taller” with the simple “tall”. Thirdly, semantic emptying, the loss of the immediate and experiential meaning of ‘tall’: if the centre, that is, Juanita, turns out to be very short, the description “taller than Juanita” does not evaluate the height of the target in any positive way.
If we turn now to its remedying function, we will also find it to be obvious. A simple qualifier such as ‘big’ can correspond to anything, to an ant or a star. In contrast, when the comparative appears, there is at last an explicit and objective statement, namely, that the object described is greater in size than another.

7. Adjectives’ degrees of comparison have a structure similar to deictic derivatives, but they do not derive from deictics. This contrast between structure and origin can be seen when the vagueness which the comparative attempts to remedy affects a deictic adjective or adverb. It is well known that deictics such as ‘close’ and ‘far’, or also ‘before’ and ‘after’, can signal very different distances depending on the context. In short, ‘near’, ‘nearby’, and similar terms are both vague and deictic. Being vague, they can be subject to the comparative’s remedying action: “This town (about which you are now telling me) is quite a bit closer than that one”. But these comparatives continue to be deictic. The centre of reference to measure each closeness is in both cases the city from which one is speaking. This example is thus highly illustrative of how there is no reason for a structure similar to deictic derivatives, that is, the adjectival comparative structure, to turn those adjectives into deictic derivatives. I should note that I am making this clarification as the result of comments by Manuel Rosa.

8. Are there languages where these two kinds of meanings share the same suffixes or some other morphological marker?

But the cognitive advance can go further. Two opposed comparatives can be projected onto a single object. The centres of each of the comparatives only have to be on both sides of the object described. B is taller than A but shorter than C.9 What is happening here? At this point, the height has now lost completely its connection with the positive feature. Of course, in the simple comparative, there was no longer necessarily any positive evaluation (“taller than Juanita” could designate a reasonably short stature) but, in any case, ‘tall’ in this comparative is still understood positively. Now, in contrast, in the series of three terms, B is at one and the same time taller and shorter, and, consequently, ‘tall’ is now designating a mere dimension, height, made up of the continuum of all possible degrees of comparison, from least tall to most tall. Having thus reached the dimension, the need to quantify has now been sketched out. “Don’t tell me if he is tall or short; tell me what height he is.” Naturally, the true end of this story, that is, the stunning result which is quantitative science, had to wait for favourable social conditions, but it is clear that the key to such deployment is present here, in this cognitive-linguistic resource. It will also be helpful to point out how, in order to make full use of the advantages of the double and opposite comparison ‘B is taller than A, but shorter than C’, the preferable resource is the classic deictic derivative itself and not its easy homologues. We should remember that, as a result of the prior explicitation of the centre, the deictic derivative became much easier to understand. Nevertheless, however much this linguistic platform has its undoubted advantages, it is completely inadequate for double comparison. In order for double comparison to be able to focus its two opposing aspects on B, it is indispensable that the comparison centres come after the comparative. 
What in one aspect is an added complication has created ease in another. I wish to repeat once more that the type of resource formed by deictic derivatives is originally linked to the ‘indirect reported speech’ – to complex syntax. This genetic link is the key to this subsection. Although the function of the points of reference has indeed been discussed by many authors, some do not consider this possible genesis at all. If we take Langacker (2006) as an example, several paragraphs of that article address the comparison between the reference point and the target object as well as the double comparison of a middle term with terms opposing each other. In this sense, there is no difference from what we have done here. But in another sense there is. My hypothesis insists above all that these tasks would constitute derivations from the breakdown of deixis, i.e. derivations of the complex syntax of the ‘indirect reported speech’. In short, what I am seeking is to show the high level of complexity in these tasks and the long cognitive-linguistic trajectory that would have preceded them. An animal could certainly choose the larger of two inviting elements. To me, it even seems likely that, with the appropriate training (of aversion to the largest as well as the smallest), any animal would become able to choose the middle-sized one of three elements. However, these choices would be completely different from the processes involved in deictic derivatives or in access to the concept of dimension. Animal choice does not in any way imply the task of formulating an element (A) via the synthesis of ‘a different element (B) plus the appropriate transforming addition’. In contrast, this is just the task we perform when we say that ‘A is taller than B’. At first glance, the comparative may appear very different from other deictic derivatives. Perhaps the comparative ‘taller than (Juanito)’ may be seen as more complex than ‘(Juanito’s) father/mother/brother’. However, I do not think we should allow ourselves to be deceived by the apparent simplicity of this last (more frequent!) designation. There is a great cognitive advance between the deictic formed by the vocative ‘Dad!’ spoken by Juanito and the objectivisation of the relationship between Juanito and his father.10 But let us leave this to one side, and focus on an important point which has appeared with the question of comparatives. What is the relationship between this type of deictic derivative and what in earlier chapters we saw about Buytendijk’s task? In order to address this question, we must start by analysing the entire journey of a term such as ‘More’.

9. On the difficulties in the three-term series problem, see Johnson-Laird (1972).

10. In kinship-terminology studies, deictic vocatives are frequently forgotten. For example, in the study by Jones (2010) and the comments by different authors regarding that article, I have only found one reference to those deictics – in Bloch (2010). Certainly for the strict objective of those studies, that point is not important. However, if we want to study the origin, both ontogenetic and historical, of kinship terms, it is indispensable, in my view, to take into consideration the deictic vocatives.

21.4.4 ‘More’, ‘another’, ‘next’: Deictics or deictic derivatives, as the case may be

‘More’ and ‘another’ can be deictics: think of the situation where the speaker, who is receiving objects, asks her hearer for ‘another’, or ‘the next one’, or, as in the example of the mother’s confused request, ‘more blocks’. These meanings relate to a situation, to a specific moment. If it is not necessary to specify which moment, this is because in these cases the reference is to the moment when they are being uttered. As you can see, the essential characteristic of deictics is not absent in this use of ‘more’ or ‘another’. The centre is not explicitated, but is given with the circumstances (who, where, when) of the speech. However, the terms ‘more’, ‘another’, ‘next’ can on other occasions be used as deictic derivatives. Consider, for example, “When John looked out he saw many more soldiers than (those he had seen) the day before”. The narrator is describing what was happening when John looked out, that is, what was happening at the moment now past, the moment t. At that moment t, there were more soldiers. Clearly, this ‘more’ is not deictic. Its centre is not the amount there may be at the moment when he is speaking. The centre regarding which the ‘more’ should be understood is the amount there was at a moment before even the moment t now past. We can see the same thing in the case of ‘other’, which is clearly deictic in the situation of the speaker receiving objects, but ceases to be deictic and becomes a deictic derivative when it is used in ‘I hope someone other than Manolo comes’. (Other than Manolo: ‘He’ spoken by Manolo. Remember 21.2 and 21.4.2.) In 18.8, it was suggested that once outdated content had been maintained, it could be used in conjunction with other abilities and begin thus to have useful functions. One of these functions may be, we said there, to reformulate a non-subitisable set in terms of a past set which will, indeed, be within the subitisable (or, in other words, within what is ‘calculable exactly with merely perceptive means’). We took our inspiration from Buytendijk’s experiment, and from the persistent failure of animals when faced with this task. This is now the moment to look again at this whole question. The only way of discovering the trick in Buytendijk’s task lies in re-describing the location of the reward in terms of the location where it had been seen at the immediately previous moment. The demanding requirement of Buytendijk’s task, the requirement the animals would not succeed in meeting, is the maintenance of the outdated perception. Similarly, the only way of interpreting a footprint is to recover the previous scene. We said all this several chapters ago. We have now begun to talk of deictic derivatives: ‘beyond container n’, ‘after moment m’. This appears to contradict the above. Would the requirement of those tasks be assimilable to the processes which lead to the simple syntax of ‘The reward has moved’, ‘The snake has gone’, or would they involve, on the contrary, the complex syntax we have linked to deictic derivatives? What do we think? I think we must differentiate the levels. Deictic derivatives are undoubtedly necessary for the explicitation of the rule. This rule covers any (past or future) ‘footprint interpretation’, or any formulation of an element depending on the previous element.
But to perceive the trick it would not be necessary to have the rule explicitated. Certainly, if the theorist wishes to describe the trick, she will have to explicitate it, and turn to the sophisticated resource of the deictic derivatives. But in the subject himself, the trick can be perceived in a much simpler way. This is not the first occasion on which I have opted to separate comprehension and explicitation. The reader will remember other occasions. A false belief of another person is perceived much earlier, not only before it can be explicitated as a belief of somebody else in a reported message, but even before the predication corresponding to this belief appears in the mouth of the person believing it. For its part, the ability to conceive a visual perception radically and intrinsically different from one’s own would have occurred much earlier than the ability to express linguistically ‘my companion is watching me’. Analogous to these contrasts is the one we have just hypothesised between perceiving the trick and explicitating the rule. Of course, the explicitation of the rule will bring enormous advantages for subjects. Without a rule the type of process could not be generalised, nor could there be any way to communicate the trick in question out of context. All this is obvious. But it does not prevent us from holding that merely understanding the trick does not require explicitated rules and that such understanding is, in spite of this, still important. By merely understanding a similar trick, we could reach an exclusively human causal understanding (let us remember the comprehension of tracks). The explicitation provided by causal conjunctions would not be at all necessary. ‘The footprint is there because the snake was there’ formulates very well the discovery that, after several episodes of holding the outdated perception, would arrive. This is true. However, in the beginning the only thing that would have been needed is a conditioned association in which, as a result of the revolutionary maintaining of the outdated perception, the bell would ring (we might say) after the meat. Shall we insist once more on the other side of the coin? We could not have reached any discussion about whether one phenomenon is the cause of another, nor could we have come to the concept of proof, nor any of the cognitive refinements of this kind, without the explicitating resource of ‘because’, and beyond this, the ‘cause’ and ‘effect’ nominalisations. Let us attempt a more encompassing formulation. What deictic derivatives or conjunctions offer above all is a range of crystallised and systematised ways to change point of view. This ability is always linked to full language, and is certainly an immense cognitive resource. ‘A after B’/‘B before A’ are two ways to refer precisely to the same sequence. But in some cases the first way will be preferable, and in others, the second. The same thing happens with ‘A gives to B’/‘B receives from A’; ‘A in front of B’/‘B behind A’; ‘A hurts B’/‘B is hurt by A’; ‘A is the cause of B’/‘B is the consequence of A’; ‘eight is one more than seven’/‘seven is one less than eight’ (see above, 17.5). Without these prefabricated cognates which language provides it would be much more difficult to perform these changes in point of view.
Such changes are, it is clear, highly important in order to achieve coherent communication that the recipient can understand with ease. But perhaps it is not only a question of this. Perhaps each alternative formulation has a set of different associations, and it is among such associations that the expectation outlined as the goal in a problem will select the means of being satisfied. Let us focus once more on deictic derivatives, or, even more specifically, on the route which would run from vague adjectives to the pure dimension. In this case, the resources crystallised in language allowed enormous flexibility to change the centre. ‘A taller than B’/‘B shorter than A’. We even looked a step further, ‘M shorter than A, but taller than B’. It is undeniable that the greater the flexibility to change point of view, the closer we come to objectivity. Human perception, like all animal perception, has its centre in the perceiver’s own body. From this unconscious subjectivity about themselves, human beings advance towards objectivity. But they would not reach objectivity by breaking from all centres (I do not agree with Nagel [1979]), but, on the contrary, by switching flexibly and agilely from one centre to another. (This might be somewhat in line with the ‘generalized other’ posited by Mead [1934]. See also Carpendale & Racine [2010].) The basic human ability, which is manifested in the communicative gestures of pointing, would have begun through the duality of mental centres. And this duality, transformed and improved by the systematisation and crystallisation through time of more and more resources, would continue to be the nucleus of the subsequent human abilities. The second centre may be the second person or, later, the third person. And even later, the flexible constant change takes root.11 But at all times the link with a point of view would continue to be inescapable even where, thanks to our second mental centre, we move out of our own point of view.

11. During narrative comprehension, pronouns (that is, the characters mentioned more than once in a story) guide action simulation. See Ditman et al. (2010).

21.5 From the indirect reported message to writing

The indirect reported message, which, by invalidating deixis, had been an enormous challenge for language, brought equally enormous cognitive gains. Syntactic subordination and the enormous device of deictic derivatives would have arisen, we have said, in order to make reported speech possible.12 These resources will be exploited by writing. Or, to be more precise, by that phase when writing seeks to replace oral communication. As is well known, the initial function of writing was not this, but the much simpler function of keeping lists – lists of articles, kings, tributes... In this first phase, writing, although it could have had wide social consequences (see Goody [1986]), did not involve any restructuring of language. In contrast, in the second phase, that is, when it operates as a substitute for complete oral communication, writing is a challenge that forces language to grow. Returning to our thread, this challenge and opportunity for growth only served to intensify what the indirect reported speech had already initiated within oral language itself. In this regard, Olson (2002) is right to link writing and the indirect reported speech. Both need more synsemantic resources than direct oral speech.

12. Their emergence suggests why I have left the ‘direct reported speech’ to one side. But it may help to insist a little more on this question. The ‘direct reported speech’ is extraordinarily dangerous for communication. One cannot be sure that the hearer will overcome the temptation to take the deictic literally. It is completely logical for language to flee from direct reported speech on most occasions. It is only when it has theatrical characteristics (only with diction or gesture that are somehow different from normal, or more specifically, which resemble those of symbolic play) that this type of speech is admissible. But we can also say that fictional narrations not only make the direct reported speech admissible, but actually require it. In fact, if we want hearers to remain imaginatively immersed in the fiction or narration, it will help if the narrator’s ‘I, here, and now’ fade as much as possible; it will help, in short, if the deictics are those of the characters in the fiction. Of course, the narrator must first situate the fiction in relation to his own speech act (this is never missing, even though these normal deictic elements may be reduced merely to a vague past time or remote location, as in the formula “once upon a time in a land far, far away” in fairy stories). However, once the narration is in motion and hearers’ attention has been turned to the narrated world, anomalous or synsemantic deixis (that is, direct reported speech) will be frequent.


However, in spite of this similarity, there are many consequences of writing which are not shared by indirect reported speech. I would point to cleft sentences, especially. In oral language, where intonation is always present (even in the subordinate clause of indirect reported speech), the need to signal the theme/rheme structuring with morphological or syntactic resources barely exists. In orality, in short, the abstraction formed by a syntactic combination stripped of its theme/rheme structuring never occurs (see 16.1). In contrast, writing, which strips the message not only of its context but also of its intonation, had to design a new syntactic form: the cleft sentence. More precisely, given that the validity of a sentence’s inferences depends on its theme/rheme structuring, language would have had to design cleft sentences for contractual or legislative compositions, that is, for texts where what matters is ‘the truth, the whole truth, and nothing but the truth’. This resource would thus come to involve a synsemanticisation that would make up for the loss of intonation and context.

Before finishing this point about the consequences of writing, it will be useful to address the syntactic innovation which has arisen in literature in the past two centuries and which has been called quasi-indirect reported speech. In Lucy (1993), this style appears within the general framework of reported speech. The quasi-indirect style would be especially appropriate for referring to the thoughts of a character rather than to her speech acts. Giving the thoughts of a character as direct reported speech may create excessive verbalisation, which will almost certainly sound false. Such verbal clarification is not admissible in inner speech, in one’s own thoughts. But, at the other extreme, if the narrator relates them in indirect reported speech, the immediacy of the character’s point of view will be lost. ‘That day’, in short, does not resound with the urgency of ‘today’, we can say along with Kaplan (1989). These are the two dangers encountered by the novelists who attempted to depict their characters’ stream of consciousness, which is why the middle way of the quasi-indirect style arose: the character’s point of view is respected, but without falling into the undesirable verbosity of direct reported speech.

However, if both cleft sentences and, above all, the almost anecdotal quasi-indirect style are principally literary resources, is it worth our bothering with them? The following is an initial response. Those resources are, it is clear, literary in nature, but language is used there too. If we were interested in indirect reported speech, it was because of the richness of both the linguistic and cognitive resources to which it gave rise. Therefore, and for the very same reason, all linguistic advances connected to writing should be of interest to us.13

13. Writing also has an influence on deductive reasoning. This influence, which (we clarify in honour of Scribner & Cole [1981]) would occur on both the historical and (to some degree, however small) the individual level, was the last issue Vygotsky worked on (it was published much later, in Luria [1976]). In Bejarano (1999b) I attempt to re-elaborate Vygotsky’s intuition. Both monotonic thought and the formal logical method would be derived from writing.


But there is also another, more specific reason why I have wanted to highlight these advances. Are we accepting that writing, a late and by no means universal event, has succeeded in influencing some aspects of syntax? In that case, the historical condition would become very prominent. All this leads us, in the end, to a question that fits nicely in these final lines, and allows us to insist on the approach we have taken to the relationship between biology and history. Human exclusivity clearly has a biological foundation (it is precisely the decisive core of this foundation that we have attempted to find), but there would be no need to include within the biological foundation itself any feature which could be historically derived from it. The task I recommend, as the reader well knows, is to explore the evolutionary genesis of the biological base, and the historical derivation of the other exclusive characteristics.

Preliminary conclusion and the main thesis recapitulated

The main thesis of this work is not new at all. The idea of a second mental centre within one’s own mind can be found nowadays within the studies of Theory of Mind – more specifically in its simulationist trend – or also in Bräten, who uses the term ‘altercentrism’ to refer to that idea. Going back in time, we would also have to point to the Piagetian concept of decentration, and even to the philosophers who treated sym-pathy or to the Verstehen tradition. Thus, it is not the main thesis that I can present as a contribution. All I have aimed to do is to propose some changes to the list of abilities enabled by the second mental centre. On the one hand, I have extended the list of those abilities by adding some new ones; on the other, I have rejected some of those which had been added, especially as a result of the discovery of mirror neurons. With that double movement, the field enabled by the second mental centre has become more similar to that of the exclusively human abilities.

In the general evaluation, and if we dispense with mirror neurons, we can say that the traditional picture has grown at both extremes. In one of the new inclusions proposed in the book – that of the pointing gesture – that picture is extended at its simplest extreme. The origin of predication and syntax, however, is added at the opposite extreme. That origin (which, according to my hypothesis, coincides with the grasping of a knowledge which is inferior to one’s own) would certainly be linked to communication and interpersonality. Nevertheless, syntax, once built, would support some aspects of intrapersonal intelligence and cognition (logical reasoning, at the very least). Therefore, we can say that at this extreme, that is, at the most complex extreme, the field of derivations of the second mental centre could be substantially extended with regard to its traditional limits.

Certainly, this remains far from proposing that the grasping of a radically not-own self was crucial for the original emergence of all the exclusively human intellectual abilities. That stronger proposal is not a part of the hypothesis presented in the book. However, I think that the development proposed here, that is, my treatment of the second mental centre, has succeeded in taking this concept a little beyond the field (social or moral or interpersonal) where its derivations were usually confined. According to my hypothesis, there would be a wide range of exclusively human abilities which derive from the grasping of a radically not-own self.


To recapitulate here the substance of the book, I am now going to outline in a table the sequence of causes and effects which have been suggested throughout this book.

1. A new and more cooperative life style > The emergence of the second mental centre: Pointing gesture and four-hand tasks (Genuine simulation of a self which is staring at me, or which approaches me, i.e., of a radically not-own self)

2. Plan of four-hand actions (the result of at least one foreign movement must be imagined) > Latent imitation of motor sequences (the postural results of each step must be imagined: The big extension of the simulatory centre or, in other words, the new function that came to be performed by this centre) and imitative learning of complex motor patterns

2.1 Imitative motor learning > an ever-increasing number of techniques > the requests during the implementation of tasks must be more and more differentiated > Communication had to incorporate the ability to imitate sequential motor patterns

2.2 Latent imitation of motor sequences (in latent imitation, the postural results of each step must be imagined) > evocation of the model scene (this evocation, which arises during muscularly displayed imitation, would complete the big extension, or new function, of the second centre)

2.3 Imitation of self-perceptible motor patterns (that is, manual or vocal motor patterns) > the evoked content can no longer be centred on the external aspect of the movements > more types of contents can be evoked

2.4 Voice joins the communicative force of old emotional intonation (direct and immediate impact on the receiver) and articulatory-phonetic patterns (linked to the second mental centre. During learning, latent motor imitation; after learning, reception in production-format) > primacy of the vocal communicative modality

3. Learned communicative signs > need for the protodeclarative

4. Protodeclarative > a new meaning that is linked to an object and, with it, is freed from the old ambiguity between the ordering of an action (an interpretation that ceases to be available) and the asking for an object > a well-informed recipient can understand a false belief of the speaker > the primitive, pregrammatical syntax (‘the interlocutor’s message, plus this message’s correction’)

5. Repeated sequences such as this (expressive speech – reactions of incomprehension by hearers – association of these reactions to the term used in the expressive speech) > grammatical, syntactic links of words

6. Grammatical, syntactic links > partial interrogations (the unknown element can only be designated through the description of the specific syntactic role it occupies) > need for more varied resources specifying the syntactic role

7. Reported speech > full syntax (syntactic subordination, deictic derivatives)


In relation to the whole of this table, two pairs of concepts must be underlined.

i. Biological innovation/Historical acquisition
ii. Interpersonal processes/Intrapersonal processes

Regarding the general issue dealt with in the book, there would be other interesting questions. For example, if it is considered credible that syntax and also inner speech have an influence on other exclusively human abilities, we would have to ask ourselves: What relationship could there be between syntax and creative intelligence? Or also: Is inner speech involved at all in the ability to self-regulate one’s own attention, thought and behaviour? The same goes for another, different question: Which of the particularities of the brain go with the second mental centre? Certainly my curiosity regarding these unanswered questions is huge: if curiosity were a virus that lurked in the pages of a book, you readers would already be infected. However, these questions are practically absent from the book. Yet it is not this absence that needs to be corrected, but the whole of what has in fact been done. I certainly have offered a great deal of data in favour of the hypothesis; in addition, there is coherence between its different parts. However, I am aware that this is not enough at all. We would have to keep on correcting and specifying and working much more: that much you can all see. But my turn ends here.

References

Abbot-Smith, K., Lieven, E. & Tomasello, M. 2004. Training 2;6-year-olds to produce the transitive construction: the role of frequency, semantic similarity and shared syntactic distribution. Developmental Science, 7, 48–55. Aitchison, J. 1998. The Seeds of Speech. Cambridge: Cambridge University Press. Alexander, R. D. 1962. Evolutionary change in cricket acoustical communication. Evolution, 16, 443–467. Albrecht, K., Volz, K. G., Sutter, M., Laibson, D. I. & von Cramon, D. Y. (in press). What is for me is not for you: brain correlates of intertemporal choice for self and other. Social Cognitive & Affective Neuroscience. Allport, F. H. 1924. Social Psychology. Cambridge, MA: The Riverside Press. Alston, W. P. 1964. Philosophy of Language. Englewood-Cliffs, N. J.: Prentice-Hall. Ambrose, S. H. 2001. Palaeolithic technology and human evolution. Science, 291, 1748–53. Amsterlaw, J. & Wellman, H. M. 2006. Theories of mind in transition: A microgenetic study of the development of false belief understanding. Journal of Cognition and Development 7, 139–172. Anderson, M. L. and Oates, T. 2003. Prelinguistic agents will form only egocentric predicates. Behavioral and Brain Sciences, 26, 284–5. Anderson, M. L. 2008. Circuit sharing and the implementation of intelligent systems. Connection Science, 20, 239–251. Anderson, M. L. 2010. Neural reuse: A fundamental organizational principle of the brain. Behavioral and Brain Sciences, 33, 245–266. Andrews, M., Vigliocco, G. & Vinson, D. 2009. Integrating experiential and distributional data to learn semantic representations. Psychological Review, 116, 463–498. Andrews, M. & Vigliocco, G. 2010. The Hidden Markov Topic Model: A Probabilistic Model of Semantic Representation. Topics in Cognitive Science, 2, 101–113. Anisfeld, M. 1991. Neonatal imitation: A review. Developmental Review, 11, 60–97. Anisfeld, M. 1996. Only tongue protrusion modeling is matched by neonates. Developmental Review, 16, 149–161. Apperly, I. A. 
& Robinson, E. J. 2003. When can children handle referential opacity? Journal of Experimental Child Psychology, 85, 297–311. Arbib, M. 2005. From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics. Behavioral and Brain Sciences, 28, 105–124. Armstrong, D. F. 2003. Creative solution to an old problem. Behavioral and Brain Sciences, 26, 211–212. Asendorpf, J. B., Warkentin, V. & Baudonnière P. M. 1996. Self-awareness and other-awareness II: mirror self-recognition, social contingency awareness, and synchronic imitation. Developmental Psychology, 32, 313–321. Bar, M. 2007. The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences, 11, 280–289.

Barclay, J. R., Bransford, J. D., Franks, J. J., McCarrell, N. S. and Nitsch, K. 1974. Comprehension and Semantic Flexibility. Journal of Verbal Learning and Verbal Behavior, 13, 471–481. Bard, K. A. 2009. Social cognition: Evolutionary history of emotional engagements with infants. Current Biology, 19, R941–R943. Baron-Cohen, S. 1999. The Evolution of a Theory of Mind. In The Descent of Mind, M. C. Corballis & S. E. G. Lea (eds.), 261–277. Oxford: Oxford University Press. Barresi, J. & Moore, C. 1996. Intentional relations and social understanding. Behavioral and Brain Sciences 19, 107–122. Barresi, J. (in press). On seeing our selves and others as persons. New Ideas in Psychology. Barrett, A. M., Foundas, A. L. & Heilman, K. M. 2005. Speech and gesture are mediated by independent systems. Behavioral and Brain Sciences, 28, 125–126. Barsalou, L. W. 2009. Situating concepts. In The Cambridge Handbook of Situated Cognition, P. Robbins, M. Aydede (eds.), 236–263. Cambridge/New York: Cambridge University Press. Bates, E., Camaioni, L. & Volterra, V. 1975. The acquisition of performatives prior to speech. Merrill-Palmer Quarterly, 21, 205–224. Batki, A., Baron-Cohen, S., Wheelwright, S., Connellan, J., & Ahluwalia, J. 2000. Is there an innate gaze module? Evidence from human neonates. Infant Behavior and Development, 23, 223–229. Bays, P. M., Flanagan, J. R. & Wolpert, D. M. 2006. Attenuation of self-generated tactile sensations is predictive, not postdictive. PLoS Biol 4(2): e28. Bechtel, W. 2009. Explanation: Mechanism, Modularity and Situated Cognition. In The Cambridge Handbook of Situated Cognition, P. Robbins, M. Aydede (eds.), 155–170. Cambridge/New York: Cambridge University Press. Bejarano, T. 1985. Comunicación Descentrada y Creatividad. Tesis doctoral no publicada. http://fondosdigitales.us.es/tesis/tesis/673/comunicacion-descentrada-y-creatividad/ Bejarano, T. 1989. Sobre la génesis de la conciencia de sí mismo. Thémata, 6, 23–44. 
Bejarano, T. 1991a. Sobre la negación. En busca de un nuevo argumento contra el origen intrapersonal de ese tipo de pensamiento. Pensamiento, 188, 469–479. Bejarano, T. 1991b. La metáfora como resolución de un problema comunicativo-lingüístico. Diálogos, 58, 129–162. Bejarano, T. 1992. Las dos triangulaciones: hacia el objeto externo y hacia el contraste objetivo/ subjetivo. Fragmentos de Filosofía, 2, 23–52. Bejarano, T. 1993. Recensión de A. García Calvo, “Hablando de lo que habla”. Er, 15, 251–261. Bejarano, T. 1994. Consideraciones cognitivas al hilo de una asimetría lingüística. Actas del X Congreso de Lenguajes Naturales y Formales, C. Martín Vide (ed.), 365–367. Sevilla: Promociones y Publicaciones Universitarias. Bejarano, T. 1995. Las emociones ante la ficción. Thémata, 13, 73–95. Bejarano, T. 1997. La explicación de la conciencia: ¿Qué se puede hacer hoy? Revista de Filosofía, 17, 83–104. Bejarano, T. 1999 a. Recensión de A. Gopnik y A. N. Meltzoff, “Words, Thoughts and Theories”. Thémata, 21, 309–317. Bejarano, T. 1999 b. Progreso evolutivo e histórico y deducción: Comprensión prelingüística de una conducta, lenguaje y escritura. Thémata, 21, 45–67. Bejarano, T. 1999 c. Prelinguistic metaphors? Pragmatics & Cognition, 7, 361–373. Bejarano, T. 2000. El ‘sentido’ de Frege, estado mental de segundo orden: Replanteamiento pragmático-cognitivo de algunas cuestiones fregeanas. Revista de Filosofía, 23, 213–233. Bejarano, T. 2003 a. El gesto de apuntar: Una curiosa exclusividad humana. Thémata, 30, 71–82.

References  Bejarano, T. 2003 b. Metarepresentation and human capacities. Pragmatics & Cognition, 11, 93–140. Bejarano, T. 2004. Acerca de la sintaxis originaria: Una crítica de la reciente propuesta de Hurford. El Catoblepas, 28, http://www.nodulo.org/ec/2004/n028p01.htm Bejarano, T. 2008. Pragmatics and theory of mind: a problem exportable to the origins of language. In The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7), A. D. M. Smith, K. Smith & R. Ferrer i Cancho (eds.), 18–25. Singapore: World Scientific Press. Bejarano, T. 2010a. Review: The Origins of Meaning, by Hurford, 2007. Teorema, 29, 157–163. Bejarano, T. 2010b. Autorregulación y libertad. Thémata, 43, 65–86. Bem, D. J. & Allen, A. 1974. On predicting some of the people some the time: The search for cross-situational consistencies in behavior. Psychological Review, 81, 506–520. Benson, J., Fries, P., Greaves, W., Iwamoto, K., Savage-Rumbaugh, S. & Taglialatela, J. 2002. Confrontation and support in bonobo-human discourse. Functions of Language, 9, 1–38. Benveniste, E. 1965. Le langage et l´expérience humaine. Diogène, 51, 3–13. Benveniste, E. 1966. Communication animale et langage humain (first publication, 1952). In ProbleÌ•mes de linguistique geÌ†neÌ†rale, 56–62. Paris: Gallimard. Benveniste, E. 1966. De la subjectivité dans le langage (first publication, 1958). In ProbleÌ•mes de linguistique geÌ†neÌ†rale, 258–266. Paris: Gallimard. Benveniste, E. 1966. La nature des pronoms (first publication, 1956). In ProbleÌ•mes de linguistique geÌ†neÌ†rale, 251–258. Paris: Gallimard. Bermúdez, J. L. 2003. Thinking Without Words. New York: Oxford University Press. Bickerton, D. 2008. Bastard Tongues: A Trailblazing Linguist Finds Clues to Our Common Humanity in the World’s Lowliest Languages. New York: Hill and Wang. Bickerton, D. 2009. Adam’s Tongue: How Humans Made Language, How Language Made Humans. New York: Hill and Wang. Bickerton, D. 2009. 
Syntax for Non-Syntacticians: A Brief Primer. In Biological Foundations and Origin of Syntax, D. Bickerton and E. Szathmáry (eds.), 3–14. Cambridge, Massachusetts: MIT Press. Bigelow, A. E., MacLean, K. & Proctor, J. 2004. The role of joint attention in the development of infants’ play with objects. Developmental Science, 7, 518–526. Bjorklund, D. F. & Ellis, B. J. 2005. Evolutionary Psychology and Child Development: An emergent synthesis. In Origins of the Social Mind: Evolutionary Psychology and Child Development, B. J. Ellis & D. F. Bjorklund (eds.), 3–18. New York: Guilford Press. Blagrove, M., Blakemore, S. & Thayer, B. R. J. 2006. The ability to self-tickle following Rapid Eye Movement sleep dreaming. Consciousness and Cognition, 15, 285–294. Blakemore, S. J., Frith, C. D. & Wolpert, D. M. 1999. Spatiotemporal prediction modulates the perception of self-produced stimuli. Journal of Cognitive Neuroscience, 11, 551–559. Blakemore, S. J., Wolpert, D. M. & Frith, C. D. 1998. Central cancellation of self-produced tickle sensation. Nature Neuroscience, 1, 635–640. Bloch, M. 2010. Kinship terms are not kinship. Behavioral and Brain Sciences, 33, 384–384. Bloom, L. 1973. One Word at a Time. The Hague: Mouton. Blumberg, M. S. 2009. Freaks of Nature. Oxford/New York: Oxford University Press. Bock, K. 1986. Syntactic persistence in language production. Cognitive Psychology, 18, 355–387. Boesch, C. 2005. Joint cooperative hunting among wild chimpanzees: Taking natural observations seriously. Behavioral and Brain Sciences, 28, 692–693.

Bompas, A. & O’ Regan, J. K. 2006a. Evidence for a role of action in colour perception. Perception, 35, 65–78. Bompas, A. & O’ Regan, J. K. 2006b. More evidence for sensorimotor adaptation in color perception. Journal of Vision, 6, 145–153. Bouissac, P. 2010. Expressive smiles or leucosignals? Behavioral and Brain Sciences, 33, 436–437. Bowden, E. M., Jung-Beeman, M., Fleck, J. & Kounios, J. 2005. New approaches to demystifying insight. Trends in Cognitive Sciences, 9, 322–328. Boyer, P. 2008. Evolutionary economics of mental time travel? Trends in Cognitive Sciences, 12, 219–224. Brand, R. J., Baldwin, D. A. & Ashburn, L. A. 2002. Evidence for ‘motionese’: modifications in mothers’ infant-directed action. Developmental Science, 5, 72–83. Brand, M. & Shallcross, W. L. 2008. Infants prefer motionese to adult-directed action. Developmental Science, 11, 853–861. Brass, M. & Heyes, C. 2005. Imitation: is cognitive neuroscience solving the correspondence problem? Trends in Cognitive Sciences, 9, 489–495. Brass, M., Schmitt, R. M., Spengler, S. & Gergely, G. 2007. Investigating action understanding: Inferential processes versus action simulation. Current Biology, 17, 2117–2121. Bräten, S. 1998. Infant Learning by Altercentric Participation. In Intersubjective Communication and Emotion in Early Ontogeny, S. Bräten (ed.), 104–124. Cambridge: Cambridge University Press. Breheny, R. 2006. Communication and Folk Psychology. Mind & Language, 21, 74–107. Bruner, J. 1983. Child’s Talk: Learning to Use Language. New York: Norton. Bruner, J. S. 1977. Early social interaction and language development. In Studies in Mother-Child Interaction, H. R. Schaffer (ed.), 271–289. London: Academic Press. Bugnyar, T., Stöwe, M. & Heinrich, B. 2004. Ravens, Corvus Corax, Follow Gaze Direction of Humans around Obstacles. Proceedings of the Royal Society. B (Biological sciences), 271, 1331–1336. Bugnyar, T. & Heinrich, B. 2005. 
Ravens, Corvus corax, differentiate between knowledgeable and ignorant competitors. Proceedings of the Royal Society. B (Biological sciences), 272, 1641–1646. Bühler K. 1990 (German, 1934). Theory of Language: The Representational Function of Language. Amsterdam: John Benjamins. Bulhof, J. & Gimbel, S. 2001. Deep tautologies. Pragmatics & Cognition, 9, 279–292. Burkart, J. M., Fehr, E., Efferson, C. & van Schaik, C. P. 2007. Other-regarding preferences in a non-human primate. Proceedings of the National Academy of Sciences USA, 104 (50), 19762–19766. Butterworth, G. & Jarrett, N. 1991. What minds share in common is space. British Journal of Developmental Psychology, 9, 55–72. Byrne, R. W. & Russon, A. E. 1998. Learning by imitation: A hierarchical approach. Behavioral and Brain Sciences, 21, 667–721. Byrne, R. W. 2002. Imitation of novel complex actions: What does the evidence from animals mean? Advances in the Study of Behavior, 31, 77–105. Byrne, R. W. 2009. Animal imitation. Current Biology, 19, R111–R114. Caggiano, V., Fogassi, L., Rizzolatti, G., Thier, P. & Casile, A. 2009. Mirror Neurons Differentially Encode the Peripersonal and Extrapersonal Space of Monkeys. Science, 324, 403–406. Call, J. 2004. Inferences about the location of food in the great apes. Journal of Comparative Psychology, 118, 232–241. Call, J. (in press). Do apes know that they could be wrong? Animal Cognition.

References  Calvin, W. H. & Bickerton, D. 2000. Lingua ex Machina. Cambridge, Massachussets: MIT Press. Capirci, O. & Volterra, V. 2008 Gesture and speech. The emergence and development of a strong and changing partnership. Gesture, 8, 22–44. Carpendale, J. I. M. & Carpendale, A. B. 2010.The Development of Pointing: From Personal Directedness to Interpersonal Direction. Human Development, 53, 110–126. Carpendale, J. I. M. & Racine, T. P. (in press). Intersubjectivity and egocentrism: Insights from the relational perspectives of Piaget, Mead, and Wittgenstein. New Ideas in Psychology. Carruthers, G. (in press). The case for the comparator model as an explanation of the sense of agency and its breakdowns. Consciousness and Cognition. Carruthers, P. 2002. The roots of scientific reasoning: Infancy, modularity and the art of the tracking. In The Cognitive Basis of Science, P. Carruthers, P. S. Stich & M. Siegal (eds.), 73–95. Cambridge/New York: Cambridge University Press. Carstairs-McCarthy, A. 1999. The Origins of Complex Language: An Inquiry into the Evolutionary Beginnings of Sentences, Syllables and Truth. Oxford: Oxford University Press. Carver, C. S. 2005. Action and Affect. In Handbook of Self-Regulation, R. F. Baumeister & K. D. Vohs (eds.), 11–35. New York: Guilford Press. Casielles, E. & Progovac, L. 2010. On protolinguistic “fossils”: Subject-Verb vs. Verb-Subject Structures. In Proceedings of the 8th International Conference (Evolang 8), A. Smith, M. Schouwstra, B. De Boer & K. Smith (eds.), 66–73. Singapore: World Scientific. Catchpole, C. K. & Slater, P. J. B. 2008. Bird Song: Themes and Variations. Cambridge: Cambridge University Press. Catmur, C. Walsh, V. & Heyes, C. 2007. Sensorimotor learning configures the human mirror system. Current Biology, 17, 1527–1531. Chang, B. & Vermeulen, N. 2010. Re-thinking the causes, processes, and consequences of simulation. Behavioral and Brain Sciences, 33, 441–442. Cheney, D. L. & Seyfarth, R. M. 1980. 
Vocal recognition in free-ranging vervet monkeys. Animal Behaviour, 28, 362–367. Cheney, D. & Seyfarth, R. 1992. Précis of ‘How monkeys see the world’. Behavioral and Brain Sciences, 15, 135–182. Christiansen, M. H. and Chater, N. 2008. Language as shaped by the brain. Behavioral and Brain Sciences, 31, 489–509. Christie, J. & Barresi, J. 2002. Using illusory line motion to differentiate misrepresentation (Stalinesque) and misremembering (Orwellian) accounts of consciousness. Consciousness & Cognition, 11, 347–365. Chun, M. M., Golomb, J. D. & Turk-Browne, N. B. (in press). A Taxonomy of External and Internal Attention. Annual Review of Psychology, 62, 73–101. Churchland, P. M. 1981. Eliminative Materialism and the Propositional Attitudes. Journal of Philosophy, 78, 67–90. Churchland, P. M. 1988. Matter and Consciousness: A Contemporary Introduction to the Philosophy of Mind. Cambridge, Massachusetts: MIT Press. Clancey, W. J. 2009. Scientific antecedents of situated cognition. In Situated Cognition. P. Robbins & M. Aydede (eds.), 11–34. Cambridge/New York: Cambridge University Press. Clark, A. 1997. Being There: Putting Brain, Body, And Word Together Again. Cambridge, Massachusetts: MIT Press. Clark, A. & Chalmers, D. J. 1998. The extended mind. Analysis 58, 7–19. Clark, A. & Thornton, C. 1997. Trading spaces: Computation, representation, and the limits of uninformed learning. Behavioral and Brain Sciences, 20, 57–66.

Clark, A. 2001. Reasons, Robots and The Extended Mind. Mind and Language, 16, 121–145. Clements, W. A. & Perner, J. 1994. Implicit understanding of belief. Cognitive Development, 9, 377–395. Cochet, H. & Vauclair, J. 2010. Features of Gestural Communication in Infants and Children Favor the Gestural Hypothesis of Language Origin. In The Evolution of Language (EVOLANG 8), Smith, A. D. M., Schouwstra, M., de Boer, B. and Smith, K. (eds.), 383–385. Singapore: World Scientific. Cohen, F. S. 1929. What is a Question? The Monist, 39, 350–64. Conty, L., Gimmig, D., Belletier, C., George, N. & Huguet, P. (in press). The cost of being watched: Stroop interference increases under concomitant eye contact. Cognition. Conway, Ch. M., Bauernschmidt, A., Huang, S. S. & Pisoni, D. B. (in press). Implicit statistical learning in language processing: Word predictability is the key. Cognition. Coolidge, F. L. & Wynn, T. 2005. Working Memory, its Executive Functions, and the Emergence of Modern Thinking. Cambridge Archaeological Journal, 15, 5–26. Corballis, M. C. 2003. From hand to mouth: Gesture, speech, and the evolution of right-handedness. Behavioral and Brain Sciences, 26, 199–208. Corballis, M. C. 2009. Mirror neurons and the evolution of language. Brain and Language, 112, 25–35. Corina, D. P., Poizner, H., Bellugi, U., Feinberg, T., Dowd, D. & O’Grady-Batch, L. 1992. Dissociation between linguistic and nonlinguistic gestural systems: a case for compositionality. Brain and Language, 43 (3), 414–447. Cosentino, E. (in press). Self in time and language. Consciousness and Cognition. Costantini, M. & Haggard, P. 2007. The rubber hand illusion: Sensitivity and reference frame for body ownership. Consciousness and Cognition, 16, 229–240. Csibra, G. 2007. Action mirroring and action interpretation: An alternative account. In Sensorimotor Foundations of Higher Cognition. Attention and Performance XXII, P. Haggard, Y. Rosetti, & M. Kawato (eds.), 435–459. 
Oxford: Oxford University Press. Csibra, G. 2010. Recognizing Communicative Intentions in Infancy. Mind & Language, 25, 141–168. Csibra, G. & Gergely, G. 2007. Social learning and social cognition: The case for pedagogy. In Processes of Change in Brain and Cognitive Development, M. H. Johnson & Y. Munakata (eds.), 249–274. Oxford: Oxford University Press. Custance, D. M., Whiten, A. & Bard, K. A. 1995. Can young chimpanzees imitate arbitrary actions? Hayes and Hayes (1952) revisited. Behaviour, 132, 839–858. Cutler A. 2008. The abstract representations in speech processing. Quarterly Journal of Experimental Psychology, 61, 1601–19. Dąbrowska, E. & Lieven, E. 2005. Towards a lexically specific grammar of children’s question constructions. Cognitive Linguistics, 16, 437–474. Damasio, A. R. & Tranel, D. 1993. Nouns and verbs are retrieved with differently distributed neural systems. Proceedings of the National Academy of Sciences USA, 90, 4957–4960. Damasio, A. 2000. The Feeling of What Happens: Body, Emotion and the Making of Consciousness. London: Vintage. Daprati, E., Franck, N., et al., 1997. Looking for the agent: An investigation into consciousness of action and self-consciousness in schizophrenic patients. Cognition, 65, 71–86. Darwin, C. R. 1965 (orig. 1872). The Expression of the Emotions in Man and Animals. Chicago: University of Illinois Press.

References  Dascal, M. 1983. Pragmatics and the Philosophy of Mind: Thought in Language. Amsterdam: Benjamins. Dascal, M. 1995. Epistemology, controversies, and pragmatics. Isegoría, 12, 8–43. Davidson, D. 1968. On Saying That. Synthese, 19, 130–46. Davidson, D. 1982. Rational animals. Dialectica, 36, 317–327. de Boer, B. & Zuidema, W. 2010. Multi-Agent Simulations of the Evolution of Combinatorial Phonology. Adaptive Behavior, 18, 141–154. de Villiers, J. & de Villiers, P. 1999. Linguistic determinism and the understanding of false beliefs. In Chíldren’s Reasoníng and the Mínd, P. Mitchell and K. Riggs (eds.), 191–228. New York: Psychology Press. de Waal, F. 1989. Peacemaking among primates. Cambridge, Massachussets: Harvard University Press. de Waal, F. B. M. & Ferrari, P. F. 2010. Towards a bottom-up perspective on animal and human cognition. Trends in Cognitive Sciences, 14, 201–207. Deacon, T. 1997. The Symbolic Species: The Co-evolution of Language and the Human Brain. New York: W. W. Norton. Decety, J. & Chaminade, T. 2003. When the self represents the other: A new cognitive neuroscience view on psychological identification. Consciousness and Cognition, 12, 577–596. Decety, J. & Grèzes, J. 1998. A neurobiological approach to imitation. Behavioral and Brain Sciences, 21, 688–689. Delgado, B., Gómez, J. C. & Sarriá, E. 2009. Private pointing and private speech: development of executive function. In Private Speech, Executive Functioning, and the Development of Verbal Self-Regulation, A. Winsler, Ch. Fernyhough & I. Montero (eds.), 153–162. Cambridge: Cambridge University Press Dennett, D. 1991. Consciousness Explained. Middlesex: Penguin Books. Dessalles, J.-L. 2007. Why We Talk: The Evolutionary Origins of Language. Oxford: Oxford University Press. Deutscher, G. 2005. The Unfolding of Language: The Evolution of Mankind’s Greatest Invention. London: Arrow. Devlin, J. T. & Aydelott, J. 2009. Speech Perception: Motoric Contributions versus the Motor Theory. 
Current Biology, 19, R198–R200. Diamond, Adele 2006. Bootstrapping conceptual deduction using physical connection: rethinking frontal cortex. Trends in Cognitive Sciences, 10, 212–218. Diamond, Arthur S. 1959. History and Origin of Language. London: Methuen. Dickinson, A. & Balleine, B. W. 2000. Causal cognition and goal-directed action. In The evolution of cognition, C. Heyes & L. Huber (eds.), 185–204. Cambridge, Massachusetts: MIT Press. Diekelmann, S., Wilhelm, I., Wagner, U. & Born, J. (in press). Elevated Cortisol at Retrieval Suppresses False Memories in Parallel with Correct Memories. Journal of Cognitive Neuroscience. Dimitriou, M. & Edin, B. B. 2010. Human Muscle Spindles Act as Forward Sensory Models. Current Biology. Ditman, T., Brunyé, T. T., Mahoney, C. R. & Taylor, H. A. 2010. Simulating an enactment effect: Pronouns guide action simulation during narrative comprehension. Cognition. Donald, M. 1991. Origins of Human Mind. Three Stages in the Evolution of Culture and Cognition. Cambridge, Massachusetts: Harvard University Press. Donnellan, K. S. 1966. Reference and definite descriptions. Philosophical Review, 75, 281–304.

Dretske, F. 1995. Meaningful perception. In An Invitation to Cognitive Science: Visual Cognition (Volume 2), S. M. Kosslyn & D. N. Osherson (eds.), 331–352. Cambridge, Massachusetts: MIT Press. Dunbar, R. 1998. Grooming, Gossip, and the Evolution of Language. Cambridge, Massachusetts: Harvard University Press. Eco, U. 1993. Trattato di semiotica generale. Milano: Bompiani. Ellis, B. J. & Bjorklund, D. F. 2005. Origins of the Social Mind: Evolutionary Psychology and Child Development. New York: Guilford Press. Ellis, R. & Tucker, M. 2000. Micro-affordance: The potentiation of components of action by seen objects. British Journal of Psychology, 91, 451–471. Emery, N. J. & Clayton, N. S. 2001. Effects of experience and social context on prospective caching strategies by scrub jays. Nature, 414, 443–446. Evans, J. St. B. T. 2003. In two minds: Dual process accounts of reasoning. Trends in Cognitive Sciences, 7, 454–459. Evans, J. St. B. T. 2009. How many dual-process theories do we need? One, two or many? In In two minds, J. Evans & K. Frankish (eds.), 33–54. Oxford: Oxford University Press. Everett, D. L. 2005. Cultural Constraints on Grammar and Cognition in Pirahã. Current Anthropology, 46, 621–634. Falk, D. 2004. Prelinguistic evolution in early hominins: Whence motherese. Behavioral and Brain Sciences, 27, 491–541. Farroni, T., Csibra, G., Simion, F. & Johnson, M. H. 2002. Eye contact detection in humans from birth. Proceedings of the National Academy of Sciences USA, 99, 9602–9605. Fehér, O., Wang, H., Saar, S., Mitra, P. P. & Tchernichovski, O. 2009. De novo establishment of wild-type song culture in the zebra finch. Nature, 459, 564–568. Feinberg, T. E. (in press). The nested neural hierarchy and the self. Consciousness and Cognition. Fernald, A. & Hurtado, N. 2007. Names in frames: infants interpret words in sentence frames faster than words in isolation. Developmental Science, 3, F33–F40. Fernald, A. 1989.
Intonation and communicative intent: Is melody the message? Child Development, 60, 1497–1510. Ferrari, P. F., Gallese, V., Rizzolatti, G., Fogassi, L. 2003. Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. European Journal of Neuroscience, 17, 1703–1714. Ferrari, P. F., Rozzi, S., Fogassi, L. 2005. Mirror neurons responding to the observation of actions made with tools in the monkey ventral premotor cortex. Journal of Cognitive Neuroscience, 17, 212–226. Ferrari, P. F., Visalberghi, E., Paukner, A., Fogassi, L., Ruggiero, A. & Suomi, S. J. 2006. Neonatal imitation in rhesus macaques. PLoS Biology, 4, e302. Ferrari, P. F., Paukner, A., Ionica, C., Suomi, S. J. 2009. Reciprocal face-to-face communication between rhesus macaque mothers and their infants. Current Biology, 19, 1768–1772. Fitch, W. T. 2004. Imitation, Quoting and Theory of Mind. In http://www.interdisciplines.org/coevolution/papers/4/ Fitch, W. T., Huber, L. & Bugnyar, T. 2010. Social Cognition and the Evolution of Language: Constructing Cognitive Phylogenies. Neuron, 65, 795–814. Fitzsimons, G. M. & Bargh, J. A. 2005. Automatic Regulation. In Handbook of Self-Regulation, R. F. Baumeister & K. D. Vohs (eds.), 151–170. New York: Guilford Press.

Flaherty, M. & Goldin-Meadow, S. 2010. Does Input matter? Gesture and Homesign in Nicaragua, China, Turkey and the USA. Evolution of Language (Evolang 8), 403–404. Singapore: World Scientific. Flanagan, O. 1992. Consciousness Reconsidered. Cambridge, Massachusetts: MIT Press. Flavell, J. H., Everett, B. A., Croft, K. & Flavell, E. R. 1981. Young children’s knowledge about visual perception: Further evidence for the Level 1-Level 2 distinction. Developmental Psychology, 17, 99–103. Flavell, J., Flavell, E. & Green, F. 1983. Development of the appearance-reality distinction. Cognitive Development, 15, 95–120. Fodor, J. A. 1978. The Language of Thought. London: Harvester Press. Forrester, M. A. & Cherington, S. M. 2009. The development of other-related conversational skills: A case study of conversational repair during the early years. First Language, 29, 166–191. Fowler, C. A., Brown, J. M., Sabadini, L. & Weihing, J. 2003. Rapid access to speech gestures in perception: evidence from choice and simple response time tasks. Journal of Memory and Language, 49, 396–413. Foxton, J. M., Riviere, L. & Barone, P. (in press). Cross-modal facilitation in speech prosody. Cognition. Fraser, H. 2004. Constraining abstractness: Phonological representation in the light of color terms. Cognitive Linguistics, 3, 239–288. Frege, G. 1980 (German, 1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100: 25–50. Translated as ‘On Sense and Reference’ by M. Black in Translations from the Philosophical Writings of Gottlob Frege, P. Geach and M. Black (eds. and trans.). Oxford: Blackwell. Frey, S. H., Funnell, M. G., Gerry, V. E., Gazzaniga, M. S. 2005. A Dissociation between the Representation of Tool-use Skills and Hand Dominance: Insights from Left- and Right-handed Callosotomy Patients. Journal of Cognitive Neuroscience, 17, 262–272. Frith, C. D., Blakemore, S. J. et al. 2000. Abnormalities in the awareness and control of action.
Philosophical Transactions of the Royal Society of London Series B – Biological Sciences, 355, 1771–1788. Frith, U. & de Vignemont, F. 2005. Egocentrism, allocentrism, and Asperger syndrome. Consciousness and Cognition, 14, 719–738. Fuster, J. 2003. Cortex and Mind: Unifying Cognition. Oxford/New York: Oxford University Press. Fuster J. 2009. Cortex and Memory: Emergence of a New Paradigm. Journal of Cognitive Neuroscience, 21, 2047–2072. Galantucci, B. 2005. An experimental study of the emergence of human communication systems. Cognitive Science, 29, 737–767. Galantucci, B., Fowler, C. A. & Turvey, M. T. 2006. The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13, 361–377. Gallese, V. 2003. The manifold nature of interpersonal relations: The quest for a common mechanism. Philosophical Transactions of the Royal Society, 358, 517–528. Gallese, V., Fadiga, L., Fogassi, L. and Rizzolatti, G. 1996. Action recognition in the premotor cortex. Brain, 119, 593–609. Gallese, V., Keysers, C. & Rizzolatti, G. 2004. A unifying view of the basis of social cognition. Trends in Cognitive Sciences, 8, 396–403. García Calvo, A. 1979. Del lenguaje. Madrid: Lucina. García Calvo, A. 1983. De la construcción: Del lenguaje II. Madrid: Lucina.

Gattis, M., Bekkering, H. & Wohlschläger, A. 1998. When actions are carved at the joints. Behavioral and Brain Sciences, 21, 691–692. Georgieff, N. & Jeannerod, M. 1998. Beyond Consciousness of External Reality: A “Who” System for Consciousness of Action and Self-Consciousness. Consciousness and Cognition, 7, 465–477. Gentilucci, M. & Corballis, M. C. 2006. From manual gesture to speech: a gradual transition. Biobehavioral Review, 30, 949–960. Gergely, G., Bekkering, H. & Király, I. 2002. Rational imitation in preverbal infants. Nature, 415, 755. Gervain, J. & Mehler, J. 2010. Speech perception and language acquisition in the first year of life. Annual Review of Psychology, 61, 191–218. Ginsburg, S. & Jablonka, E. (in press). Experiencing: A Jamesian Approach. Journal of Consciousness Studies. Givón, T. & Malle, B. F. 2002. The Evolution of Language out of Pre-language. Amsterdam: Benjamins. Gleitman, L. & Gillette, J. 1995. The role of syntax in verb learning. In The Handbook of Child Language, P. Fletcher & B. MacWhinney (eds.), 413–427. London: Blackwell. Gleitman, L. R. 1990. The structural source of verb meaning. Language Acquisition: A Journal of Developmental Linguistics, 1, 3–55. Glenberg, A. M. 1997. What memory is for. Behavioral and Brain Sciences, 20, 1–56. Gliga, T. & Csibra, G. 2007. Seeing the face through the eyes: A developmental perspective on face expertise. Progress in Brain Research, 164, 323–339. Gobes, S. M., Zandbergen, M. A., Bolhuis, J. J. 2010. Memory in the making: localized brain activation related to song learning in young songbirds. Proceedings Biological Sciences, 277, 3343–3351. Goldberg, A. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press. Goldie, P. 2002. Understanding Emotions: Mind and Morals. Aldershot, Hants, England/Burlington, VT: Ashgate Pub. Goldin-Meadow, S. 1999. The role of gesture in communication and thinking.
Trends in Cognitive Sciences, 3, 419–429. Goldman, H. I. 2001. Parental reports of “MAMA” sounds in infants: An exploratory study. Journal of Child Language, 28, 497–506. Goldinger, S. D. 1998. Echoes of echoes?: an episodic theory of lexical access. Psychological Review, 105, 251–279. Gómez, J. C. 1998. Do concepts of intersubjectivity apply to non-human primates? In Intersubjective Communication and Emotion in Early Ontogeny, S. Bräten (ed.), 245–259. Cambridge: Cambridge University Press. Gómez, J. C. 2004. Apes, Monkeys, Children and the Growth of Mind. Cambridge, Massachusetts/London: Harvard University Press. Gómez, J. C., Whiten, A., Custance, D., Teixidor, P. & Bard, K. A. 1996. Imitative learning of artificial fruit processing in children (Homo sapiens) and chimpanzees (Pan troglodytes). Journal of Comparative Psychology, 110, 3–14. Goodglass, H., Klein, B., Carey, P. and Jones, K. 1966. Specific semantic word category in aphasia. Cortex, 2, 74–89. Goody, J. 1986. La logique de l’écriture: aux origines des sociétés humaines. Paris: Armand Colin.

Gopnik, A. & Meltzoff, A. N. 1997. Words, Thoughts and Theories. Cambridge, Massachusetts: MIT Press. Grassmann, S. and Tomasello, M. 2009. Young children follow pointing over words in interpreting acts of reference. Developmental Science, 13, 252–263. Graziano, M., Taylor, C., Moore, T. & Cooke, D. 2002. The Cortical Control of Movement Revisited. Neuron, 36, 349–362. Greenfield, P. M. & Smith, J. H. 1976. The Structure of Communication in Early Language Development. New York: Academic Press. Grice, H. P. 1957. Meaning. The Philosophical Review, 66, 377–88. Grush, R. 2004. The emulation theory of representation: Motor control, imagery, and perception. Behavioral and Brain Sciences, 27, 377–442. Guillaume, G. 1982 (French, 1948–49). Leçons de linguistique de Gustave Guillaume 1948–1949. Grammaire particulière du français et grammaire générale (IV). Québec: Les Presses de l’Université Laval. 2ème édition. Haggard, P. & Wolpert, D. M. 2005. Disorders of Body Scheme. In Higher-Order Motor Disorders, H. J. Freund, M. Jeannerod, M. Hallett & R. C. Leiguarda (eds.), 261–271. Oxford: Oxford University Press. Hamilton, W. D. 1964. The genetical evolution of social behaviour. Journal of Theoretical Biology, 7, 1–52. Hamilton, A. F. & Grafton, S. T. 2006. Goal representation in human anterior intraparietal sulcus. The Journal of Neuroscience, 26, 1133–7. Hare, B. 2001. Can competitive paradigms increase the validity of experiments on primate social cognition. Animal Cognition, 4, 269–280. Hare, B., Brown, M., Williamson, C. & Tomasello, M. 2002. The domestication of social cognition in dogs. Science, 298, 1636–1639. Hare, B. & Tomasello, M. 2004. Chimpanzees are more skilful in competitive than in cooperative cognitive tasks. Animal Behaviour, 68, 571–581. Hare, B., Plyusnina, I., Ignacio, N., Schepina, O., Stepika, A., Wrangham, R. & Trut, L. 2005. Social cognitive evolution in captive foxes is a correlated by-product of experimental domestication.
Current Biology, 15, 226–230. Harris, J. R. 2009. Attachment theory underestimates the child. Behavioral and Brain Sciences, 32, 30–30. Haugeland, J. 1998. Having Thought: Essays in the Metaphysics of Mind. Cambridge, Massachusetts: Harvard University Press. Hauser, M., Chomsky, N. & Fitch, W. T. 2002. The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science, 298, 1569–1579. Heilman, K. M., Barrett, A. M. & Adair, J. C. 1998. Possible mechanisms of anosognosia: a defect in self-awareness. Philosophical Transactions of the Royal Society of London (Biological Science series), 353, 1903–9. Heine, B. 2003. Grammaticalization. In The Handbook of Historical Linguistics, B. D. Joseph & R. D. Janda (eds.), 575–601. Oxford: Blackwell. Heine, B. & Kuteva, T. 2002. On the evolution of grammatical forms. In The transition to language, A. Wray (ed.), 376–397. Oxford: Oxford University Press. Heine, B. & Kuteva, T. 2007. The Genesis of Grammar: A Reconstruction. New York: Oxford University Press. Henshilwood, C., d’Errico, F., et al. 2002. Emergence of Modern Human Behavior: Middle Stone Age Engravings from South Africa. Science, 295, 1278–1280.

Herr-Israel, E. & McCune, L. (in press). Successive single-word utterances and use of conversational input: a pre-syntactic route to multiword utterances. Journal of Child Language. Heyes, C. 2001. Causes and consequences of imitation. Trends in Cognitive Sciences, 5, 253–261. Heyes, C. 2008. Imitation as a conjunction. Behavioral and Brain Sciences, 31, 28–29. Heyes, C. M. & Ray, E. 2000. What is the significance of imitation in animals? Advances in the Study of Behavior, 29, 215–245. Heyes, C. M., Bird, G., Johnson, H. & Haggard, P. 2005. Experience modulates automatic imitation. Cognitive Brain Research, 22, 233–240. Hickok, G. 2009. Eight Problems for the Mirror Neuron Theory of Action Understanding in Monkeys and Humans. Journal of Cognitive Neuroscience, 21, 1229–1243. Hierro Sánchez-Pescador, José. 1990. Significado y verdad: ensayos de semántica filosófica. Madrid: Alianza. Hietanen, J. K. & Perrett, D. I. 1993. Motion sensitive cells in the macaque superior temporal polysensory area1: Lack of response to the sight of the animal’s own limb movement. Experimental Brain Research, 93, 117–128. Hockett, C. F. 1960. The Origin of Speech. Scientific American, 203, 89–97. Hockett, C. F. 1963. The problem of universals in language. In Universals of Language, J. Greenberg (ed.), 1–22. Cambridge, Massachusetts: MIT Press. Hogendoorn, H., Kammers, M. P. M., Carlson, T. A. & Verstraten, F. A. J. 2009. Being in the dark about your hand: Resolution of visuo-proprioceptive conflict by disowning visible limbs. Neuropsychologia, 47, 2698–2703. Holekamp, K. E. 2007. Questioning the social intelligence hypothesis. Trends in Cognitive Sciences, 11, 65–69. Hollis, K. L. 1982. Pavlovian conditioning of signal-centered action patterns and autonomic behavior: a biological analysis of function. Advances in the Study of Behavior, 12, 1–64. Holloway, R. L. 2003. Was a manual gesturing stage really necessary? Behavioral and Brain Sciences, 26, 223–224. Holmes, N. P.
& Spence, Ch. 2007. Dissociating body image and body schema with rubber hands. Behavioral and Brain Sciences, 30, 211–212. Hopkins, W. D., Taglialatela, J. P. & Leavens, D. A. 2007. Chimpanzees differentially produce novel vocalizations to capture the attention of a human. Animal Behaviour, 73, 281–286. Hopper, P. K. 2008. Emergent serialization in English. In Linguistic Universals and Language Change, J. Good (ed.), 253–284. New York: Oxford University Press. Horn, L. R. 1989. A Natural History of Negation. Chicago: University of Chicago Press. Houston-Price, C., Plunkett, K. & Harris, P. 2005. ‘Word-learning wizardry’ at 1;6. Journal of Child Language, 32, 175–189. Huang, Y. T. & Snedeker, J. (in press). Cascading activation across levels of representation in children’s lexical processing. Journal of Child Language. Huber, L., Range, F., Voelkl, B., Szucsich, A., Viranyi, Z., and Miklosi, A. 2009. The evolution of imitation: what do the capacities of nonhuman animals tell us about the mechanisms of imitation? Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 364, 2299–2309. Humphrey, N. 1976. The social functions of intellect. In Growing Points in Ethology, P. P. Bateson & R. A. Hinde (eds.), 303–317. Cambridge/New York: Cambridge University Press. Hurford, J. 1989. Biological Evolution of the Saussurean Sign as a Component of the Language Acquisition Device. Lingua, 77, 187–222.

Hurford, J. 2002. The Roles of Expression and Representation in Language Evolution. In The Transition to Language, A. Wray (ed.), 311–334. Oxford: Oxford University Press. Hurford, J. 2003. The Neural Basis of Predicate-Argument Structure. Behavioral and Brain Sciences, 26, 261–283. Hurford, J. 2004. Language beyond our grasp: what mirror neurons can, and cannot, do for language evolution. In Evolution of Communication Systems: A Comparative Approach, D. Kimbrough Oller and Ulrike Griebel (eds.), 297–313. Cambridge, Massachusetts: MIT Press. Hurford, J. 2007a. A performed practice explains a linguistic universal: Counting gives the Packing Strategy. Lingua, 117, 773–783. Hurford, J. 2007b. The origin of noun phrases: reference, truth and communication. Lingua, 117, 527–542. Hurford, J. R. 2007c. The Origins of Meaning. Oxford: Oxford University Press. Hurley, S. L. 2005. The shared circuits model. How control, mirroring, and simulation can enable imitation and mind reading. http://www.interdisciplines.org/mirror/papers/5 Hutto, D. (in press). Exposing the Background: Deep and Local. In Radman, Z. (ed), Knowing without Thinking: The Background in Philosophy of Mind. Basingstoke: Palgrave. Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. 2006. Infant speech perception activates Broca’s area: a developmental magnetoencephalography study. Neuroreport, 17, 957–962. Israel, M., Johnson, C. & Brooks, P. 2001. From states to events: The acquisition of English passive participles. Cognitive Linguistics, 11, 103–129. Iverson, J. M. 2010. Developing language in a developing body: the relationship between motor development and language development. Journal of Child Language, 37, 229–261. Jackendoff, R. 1996. How Language Helps Us Think. Pragmatics & Cognition, 4, 1–24. Jackendoff, R. 1997. The Architecture of the Language Faculty. Cambridge, Massachusetts: MIT Press. Jackendoff, R. 2003. Foundations of Language.
Oxford: Oxford University Press. Jackendoff, R. 2007. Language, Consciousness, Culture: Essays on Mental Structure. Cambridge, Massachusetts: MIT Press. Jakobson, R. 1957. Shifters, verbal categories, and the Russian verb. In Selected Writings, vol. II, Word and Language, 130–147. The Hague: Mouton. Jakobson, R. 1960 (orig. 1941). Child Language, Aphasia and Phonological Universals. In Selected Writings I, 328–401. The Hague: Mouton. Jakobson, R. 1960. Linguistics and Poetics. In Style in Language, T. Sebeok (ed.), 350–377. Cambridge, Massachusetts: MIT Press. Janda, L. A. & Solovyev, V. D. 2009. What constructional profiles reveal about synonymy: A case study of Russian words for sadness and happiness. Cognitive Linguistics, 20, 367–394. Janet, P. 1936. L’intelligence avant le langage. Paris: Flammarion. Jarrold, Ch., Boucher, J. & Smith, P. 1993. Symbolic play in autism: A review. Journal of Autism and Developmental Disorders, 23, 281–307. Jarvik, M. E. 1953. Discrimination of colored food and food signs by primates. Journal of Comparative and Physiological Psychology, 46, 390–392. Jarvik, M. E. 1956. Simple colour discrimination in chimpanzees: effect of varying contiguity between cue and incentive. Journal of Comparative and Physiological Psychology, 49, 492–495. Jeannerod, M. 2006. From Volition to agency: The mechanism of action recognition and its failures. In Disorders of Volition, N. Sebanz & W. Prinz (eds.), 175–192. Cambridge, Massachusetts: MIT Press.

Jellema, T. & Perrett, D. I. 2003. Cells in monkey STS responsive to articulated body motions and consequent static posture: a case of implied motion? Neuropsychologia, 41, 1728–1737. Jellema, T. & Perrett, D. I. 2005. Neural basis for the perception of goal-directed actions. In The Cognitive Neuroscience of Social Behaviour, A. Easton and N. Emery (eds.), 81–112. Hove: Psychology Press. Jenkins, J. R., Stein, M. L. & Wysocki, K. 1984. Learning vocabulary through reading. American Educational Research Journal, 21, 767–787. Jerison, H. 1988. Evolutionary neurology and the origin of language as a cognitive adaptation. In The genesis of language, M. E. Landsberg (ed.), 3–9. Berlin: Mouton de Gruyter. Jespersen, O. 1975 (orig. 1924). La filosofía de la gramática. Barcelona: Anagrama. Johnson, M. H., Dziurawiec, S., Ellis, H., & Morton, J. 1991. Newborns’ preferential tracking of face-like stimuli and its subsequent decline. Cognition, 40, 1–19. Johnson-Laird, P. N. 1972. The three-term series problem. Cognition, 1, 57–82. Jolly, A. 1966. Lemur social behavior and primate intelligence. Science, 153, 501–506. Jones, S. S. 1996. Imitation or Exploration?: Young Infants’ Matching of Adults’ Oral Gestures. Child Development, 67, 1970–1987. Jones, D. 2010. Human kinship, from conceptual structure to grammar. Behavioral and Brain Sciences, 33, 367–381. Jordan, M. I. & Rumelhart, D. 1992. Forward Models: Supervised Learning with a Distal Teacher. Cognitive Science, 16, 307–354. Kaminski, J., Riedel, J., Call, J., & Tomasello, M. 2005. Domestic goats, Capra hircus, follow gaze direction and use social cues in an object choice task. Animal Behaviour, 69, 11–18. Kaminski, J., Call, J. and Tomasello, M. 2004. Body orientation and head orientation. Two factors controlling apes’ begging behavior from humans. Animal Cognition, 7, 216–223. Kano, F. & Tomonaga, M. 2010. Face scanning in chimpanzees and humans: continuity and discontinuity.
Animal Behaviour, 79, 227–235. Kaplan, D. 1989. Demonstratives. In Almog, J., Perry, J. and Wettstein, H. (eds), Themes from Kaplan, 481–563. New York: Oxford University Press. Kappes, J., Baumgaertner, A., Peschke, C. & Ziegler, W. 2009. Unintended imitation in nonword repetition. Brain and Language, 111, 140–151. Karmiloff-Smith, A. & Inhelder, B. 1974. If you want to get ahead, get a theory. Cognition, 23, 95–147. Keysers, C. & Perrett, D. I. 2004. Demystifying social cognition: a Hebbian perspective. Trends in Cognitive Sciences, 8, 501–507. Kharitonova, M., Chien, S., Colunga, E. & Munakata, Y. 2009. More than a matter of getting ‘unstuck’: flexible thinkers use more abstract representations than perseverators. Developmental Science, 12, 662–669. King, J. E., & Figueredo, A. J. 1997. The five-factor model plus dominance in chimpanzee personality. Journal of Research in Personality, 31, 271–271. Kintsch, W. 1998. Comprehension. A Paradigm for Cognition. New York: Cambridge University Press. Klein, J. T., Shepherd, S. V. & Platt, M. L. 2009. Social Attention and the Brain. Current Biology, 19, R958–R962. Kloo, D. & Perner, J. 2005. Disentangling dimensions in the dimensional change card-sorting task. Developmental Science, 8, 44–56. Kluender, K. R., Diehl, R. L., & Killeen, P. R. 1987. Japanese Quail can form phonetic categories. Science, 237, 1195–1197.

Knight, C. 2003. The secret of lateralisation is trust. Behavioral and Brain Sciences, 26, 231–232. Kobayashi, H. & Koshima, S. 2001. Unique morphology of the human eye and its adaptive meaning: comparative studies on external morphology of the primate eye. Journal of Human Evolution, 40, 419–435. Kobayashi, H. & Hashiya, K. (in press). The gaze that grooms: contribution of social factors to the evolution of primate eye morphology. Evolution and Human Behavior. Koenig, M. A. & Harris, P. L. 2005. The role of social cognition in early trust. Trends in Cognitive Sciences, 9, 457–459. Kohler, E., Keysers, C., Umiltà, M. A., Fogassi, L., Gallese, V., and Rizzolatti, G. 2002. Hearing sounds, understanding actions: Action representation in mirror neurons. Science, 297, 846–848. Koschmann, T. (in press). On the universality of recursion. Lingua. Kouider, S., de Gardelle, V., Sackur, J. & Dupoux, E. (in press). How rich is consciousness? The partial awareness hypothesis. Trends in Cognitive Sciences. Kraskov, A., Dancause, N., Quallo, M. M., Shepherd, S. & Lemon, R. N. 2009. Corticospinal Neurons in Macaque Ventral Premotor Cortex with Mirror Properties: A Potential Mechanism for Action Suppression? Neuron, 64, 922–930. Krifka, M. 2008. Functional similarities between bimanual coordination and topic/comment structure. In Variation, Selection, Development: Probing the Evolutionary Model of Language Change, R. Eckardt, G. Jäger & T. Veenstra (eds.), 307–336. Berlin/New York: Mouton de Gruyter. Krifka, M. (in press). A Note on an Asymmetry in the Hedonic Implicatures of Olfactory and Gustatory Terms. http://amor.cms.hu-berlin.de/~h2816i3x/Publications/Krifka_SmellTaste.pdf Kuhl, P. K. & Miller, J. D. 1975. Speech perception by the chinchilla: voiced-voiceless distinction in alveolar plosive consonants. Science, 190, 69–72. Kuhl, P. K. 2010. Brain Mechanisms in Early Language Acquisition. Neuron, 67, 713–727. Kühn, S. & Brass, M. 2010.
The cognitive representation of intending not to act: Evidence for specific non-action-effect binding. Cognition, 117, 9–16. La Frenière, P. 1988. The ontogeny of tactical deception in humans. In R. W. Byrne & A. Whiten (eds), Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes and Humans, 238–252. Oxford/New York: Oxford University Press. Lakoff, G. 1994. What is a conceptual system? In The Nature and Ontogenesis of Meaning, W. F. Overton & D. S. Palermo (eds.), 41–90. Hillsdale, NJ: Lawrence Erlbaum. Landau, B. 2000. Concepts, the lexicon and acquisition: Fodor’s new challenge. Mind & Language, 15, 319–26. Landau, B. & Gleitman, L. R. 1985. Language and Experience. Cambridge, Massachusetts: Harvard University Press. Landauer, T. K. & Dumais, S. T. 1997. A solution to Plato’s problem: the Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211–240. Landauer, T. K., Foltz, P. W. & Laham, D. 1998. An introduction to Latent Semantic Analysis. Discourse Processes, 25, 259–284. Lane, S. M. & Schooler, J. W. 2004. Skimming the Surface. Verbal Overshadowing of Analogical Retrieval. Psychological Science, 11, 715–719. Langacker, R. W. 1991. Foundations of Cognitive Grammar. Stanford: Stanford University Press. Langacker, R. W. 2006. On the continuous debate about discreteness. Cognitive Linguistics, 17, 107–151.

Langacker, R. W. 2009. A dynamic view of usage and language acquisition. Cognitive Linguistics, 20, 627–640. Langer, S. K. 1957 (orig., 1941). Philosophy in a New Key. Cambridge, Massachusetts: Harvard University Press. Latash, M. L. & Feldman, A. G. 2004. Computational ideas versus living systems. Behavioral and Brain Sciences, 27, 409. Leary, M. R. 2004. The Sociometer. In Handbook of Self-regulation: Research, Theory, and Applications, R. F. Baumeister & K. D. Vohs (eds.), 373–391. New York: Guilford Press. Leavens, D. A., Hopkins, W. D. and Bard, K. A. 2005. Understanding the Point of Chimpanzee Pointing. Epigenesis and Ecological Validity. Current Directions in Psychological Science, 14, 185–189. Legerstee, M. & Barillas, Y. 2003. Sharing attention and pointing to objects at 12 months: is the intentional stance implied? Cognitive Development, 18, 91–110. Leighton, J., Bird, G. & Heyes, C. 2010. ‘Goals’ are not an integral component of imitation. Cognition. Leslie, A. M. 1988. Some implications of pretense for mechanisms underlying the child’s theory of mind. In Developing Theories of Mind, J. Astington, P. Harris, & D. Olson (eds.), 19–46. New York: Cambridge University Press. Leslie, A. M. 2005. Developmental parallels in understanding minds and bodies. Trends in Cognitive Sciences, 9, 459–462. Levelt, W. J. M. 1989. Speaking: From Intention to Articulation. Cambridge, Massachusetts: MIT Press. Levinson, S. C. 1996. Relativity in spatial conception and description. In Rethinking Linguistic Relativity, J. J. Gumperz & S. C. Levinson (eds.), 177–202. Cambridge: Cambridge University Press. Lewis, M. 2000. The emergence of human emotions. In Handbook of emotions (second edition), M. Lewis & J. M. Haviland-Jones (eds.), 265–280. New York: Guilford Press. Liberman, A., Cooper, F., Shankweiler, D. & Studdert-Kennedy, M. 1967. Perception of the speech code. Psychological Review, 74, 431–61. Liberman, A. M. & Mattingly, I. G. 1985.
The motor theory of speech perception revised. Cognition, 21, 1–36. Liberman, A. & Whalen, D. H. 2000. On the relation of speech to language. Trends in Cognitive Sciences, 4, 187–196. Liebenberg, L. 1990. The Art of Tracking. The Origin of Science. Cape Town: David Philips Publishers. Lieberman, P. 1991. Uniquely Human: The Evolution of Speech, Thought, and Selfless Behaviour. Cambridge, Massachusetts: Harvard University Press. Lieven, E., Behrens, H., Speares, J., & Tomasello, M. 2003. Early syntactic creativity: a usage-based approach. Journal of Child Language, 30, 333–370. Lieven, E. 2008. Learning the English auxiliary: a usage-based approach. In Corpora in Language Acquisition Research: Finding Structure in Data, H. Behrens (ed.), 60–98. Amsterdam: Benjamins. Lieven, E., Salomo, D. & Tomasello, M. 2009. Two-year-old children’s production of multiword utterances: A usage-based analysis. Cognitive Linguistics, 20, 481–508. Lieven, E. 2010. Input and first language acquisition: Evaluating the role of frequency. Lingua, 120, 2546–2556.

Liszkowski, U., Carpenter, M. & Tomasello, M. 2008. Twelve-month-olds communicate helpfully and appropriately for knowledgeable and ignorant partners. Cognition, 108, 732–739. Liszkowski, U., Schafer, M., Carpenter, M. & Tomasello, M. 2009. Prelinguistic infants, but not chimpanzees, communicate about absent entities. Psychological Science, 20, 654–660. Locke, J. L. and Bogin, B. 2006. Language and life history: A new perspective on the development and evolution of human language. Behavioral and Brain Sciences, 29, 259–280. Lorenz, K. Z. 1966. Evolution and Modification of Behaviour. London: Methuen. Lorenzo, G. & Longa, V. 2003. Homo loquens. Biología y evolución del lenguaje. Lugo: TrisTram. Lotman, Y. 1978 (orig. 1970). La estructura del texto artístico. Madrid: Istmo. Lotto, A. J., Hickok, G. S. & Holt, L. L. 2009. Reflections on mirror neurons and speech perception. Trends in Cognitive Sciences, 13, 110–14. Lozano, S. C., Hard, B. M. & Tversky, B. 2007. Putting action in perspective. Cognition, 103, 480–490. Lucy, J. A. (ed.). 1993. Reflexive Language: Reported Speech and Metapragmatics. Cambridge: Cambridge University Press. Luka, B. & Barsalou, L. W. 2005. Structural facilitation: Mere exposure effects of grammatical acceptability as evidence for syntactic priming in comprehension. Journal of Memory and Language, 52, 436–459. Lukowski, A. F., Wiebe, S. A., Haight, J. C., DeBoer, T., Nelson, C. A. & Bauer, P. J. 2005. Forming a stable memory representation in the first year of life: why imitation is more than child’s play. Developmental Science, 8, 279–298. Luria, A. R. 1976a. Cognitive Development. Cambridge, Massachusetts/London: Harvard University Press. Luria, A. R. 1976b. The Neuropsychology of Memory. New York: Winston. Luria, A. R. 1980 (orig. 1979). Conciencia y lenguaje. Madrid: Pablo del Río Editor. MacNeilage, P. F. & Davis, B. L. 2004. Baby talk and the emergence of first words. Behavioral and Brain Sciences, 27, 517–518. MacNeilage, P.
F. 2008. The Origin of Speech. Oxford: Oxford University Press. Markowitsch, H. J. & Staniloiu, A. (in press). Memory, autonoetic consciousness, and the self. Consciousness and Cognition. Marler, P. 1991. The instinct to learn. In S. Carey and R. Gelman (eds) The Epigenesis of Mind: Essays on Biology and Cognition, 37–66. Hillsdale: Erlbaum. Marshall, J., Atkinson, J., Smulovitch, E., Thacker, A. & Woll, B. 2004. Aphasia in a user of British Sign Language: Dissociation between sign and gesture. Cognitive Neuropsychology, 21, 537–554. Masangkay, Z. S., McCluskey, K. A., McIntyre, C. W., Sims-Knight, J., Vaughn, B. E., and Flavell, J. H. 1974. The early development of inferences about the visual percepts of others. Child Development, 45, 357–366. Massaro, D. W. & Chen, T. H. 2008. The motor theory of speech perception revisited. Psychonomic Bulletin & Review, 15, 453–457. Mattar, A. A., & Gribble, P. L. 2005. Motor learning by observing. Neuron, 46, 153−160. Maurer, D. & Landis, T. 1990. Role of bone conduction in the self-perception of speech. Folia Phoniatrica, 42, 226–229. McCune, L. 1995. A normative study of representational play at the transition to language. Developmental Psychology, 31, 198–206.

McCune, L. 2006. Dynamic event words: From common cognition to varied linguistic expression. First Language, 26, 233–255.
McCune, L. 2008. How Children Learn to Learn Language. Oxford: Oxford University Press.
Mead, G. H. 1934. Mind, Self, and Society. From the Standpoint of a Social Behaviorist. Chicago: University of Chicago Press.
Meguerditchian, A. & Vauclair, J. 2010. Investigation of gestural vs. vocal origins of language in nonhuman primates: Distinguishing comprehension and production of signals. In The Evolution of Language (EVOLANG 8), A. D. M. Smith, M. Schouwstra, B. de Boer & K. Smith (eds.), 453–454. Singapore: World Scientific.
Meguerditchian, A., Vauclair, J. & Hopkins, W. D. 2010. Captive chimpanzees use their right hand to communicate with each other: Implications for the origin of the cerebral substrate for language. Cortex, 46, 40–48.
Mehler, J. & Dupoux, E. 1994. What Infants Know: The New Cognitive Science of Early Development. Cambridge, Massachusetts: Blackwell.
Meltzoff, A. N. & Moore, M. K. 1983. Newborn infants imitate adult facial gestures. Child Development, 54, 702–709.
Meltzoff, A. N. 1988. Infant imitation after a 1-week delay: Long-term memory for novel acts and multiple stimuli. Developmental Psychology, 24, 470–476.
Metcalfe, J. & Mischel, W. 1999. A Hot/Cool-System Analysis of Delay of Gratification: Dynamics of Willpower. Psychological Review, 106, 3–19.
Metzing, C. & Brennan, S. E. 2003. When conceptual pacts are broken: Partner-specific effects on the comprehension of referring expressions. Journal of Memory and Language, 49, 201–213.
Meyer, M. 1985. Pour une rhétorique de la raison. Revue Internationale de Philosophie, 155, 289–301.
Meyer, M. 1988. The revival of questioning in the twentieth century. Synthese, 74, 5–18.
Miall, R. C. & Wolpert, D. M. 1996. Forward Models for Physiological Motor Control. Neural Networks, 9, 1265–1279.
Miklósi, Á., Polgárdi, R., Topál, J. & Csányi, V. 1998. Use of experimenter-given cues in dogs. Animal Cognition, 1, 113–121.
Miklósi, Á. & Topál, J. 2005. Is there a simple recipe for how to make friends? Trends in Cognitive Sciences, 9, 463–464.
Mintz, T. H. 2003. Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90, 91–117.
Mitchell, P. & Lacohee, H. 1991. Children’s early understanding of false belief. Cognition, 39, 107–127.
Mitchell, R. W. 1993. Mental models of mirror-self-recognition: Two theories. New Ideas in Psychology, 11, 295–325.
Mitchell, R. W. (in press). Mirrors and matchings: Imitation from the perspective of mirror-self-recognition, and why the parietal region is involved in both. In Models and Mechanisms of Imitation and Social Learning: Behavioural, Social and Communicative Dimensions, K. Dautenhahn & C. L. Nehaniv (eds.). Cambridge, UK: Cambridge University Press.
Moeschler, J. 2006. Expressing Causality in Natural Language. A Pragmatic Perspective. http://www.interdisciplines.org/causality/papers/15
Monaghan, P. & Christiansen, M. H. 2008. Integration of multiple probabilistic cues in syntax acquisition. In Corpora in Language Acquisition Research: History, Methods, Perspectives, H. Behrens (ed.), 139–164. Amsterdam: Benjamins.

Moore, C. 1999. Intentional relations and triadic interactions. In Developing Theories of Intention, P. D. Zelazo, J. W. Astington & D. R. Olson (eds.), 43–62. Hillsdale: Lawrence Erlbaum.
Moore, R. 2010. Review of Tallis, 2010. Journal of Consciousness Studies, 17, 238–243.
Myin, E. & O’Regan, K. 2009. Situated perception and sensation in vision and other modalities. In Situated Cognition, Ph. Robbins & M. Aydede (eds.), 185–199. New York/Cambridge: Cambridge University Press.
Myowa-Yamakoshi, M., Yamaguchi, M. K., Tomonaga, M., Tanaka, M. & Matsuzawa, T. 2005. Development of face recognition in infant chimpanzees (Pan troglodytes). Cognitive Development, 20, 49–63.
Nadel, J. 2002. Imitation and imitation recognition: functional use in preverbal infants and nonverbal children with autism. In The Imitative Mind: Development, Evolution, and Brain Bases, A. Meltzoff & W. Prinz (eds.), 63–73. Cambridge: Cambridge University Press.
Nagel, T. 1971. Brain Bisection and the Unity of Consciousness. Synthese, 22, 396–413.
Nagel, T. 1979. Subjective and objective. In Mortal Questions, 196–214. Cambridge/New York: Cambridge University Press.
Nagy, E. & Molnar, P. 2004. Homo imitans or homo provocans? The phenomenon of neonatal imitation. Infant Behaviour and Development, 27, 57–63.
Nagy, W. E., Herman, P. A. & Anderson, R. C. 1985. Learning words from context. Reading Research Quarterly, 20, 233–253.
Namy, L. L. 2008. Recognition of iconicity doesn’t come for free. Developmental Science, 11, 841–846.
Navon, D. 2002. It Takes Two for an Inverse Relationship. http://www.cogsci.ecs.soton.ac.uk/cgi/psyc/newpsy?12.017
Nelson, K. 1985. Making Sense: The Acquisition of Shared Meaning. New York: Academic Press.
Newmeyer, F. J. 2006. What can grammaticalization tell us about the origins of language? http://www.tech.plym.ac.uk/socce/evolang6/
Nielsen, M., Collier-Baker, E., Davis, J. M. & Suddendorf, T. 2005. Imitation recognition in a captive chimpanzee (Pan troglodytes). Animal Cognition, 8, 31–36.
Ninio, A. 1993. On the fringes of the system: Children’s acquisition of syntactically isolated forms at the onset of speech. First Language, 13, 291–313.
Ninio, A. 1994. Words with holes: The acquisition of the predicateness of predicates. Paper presented at the Emory Conference on Cognitive and Functional Approaches to Grammatical Development, Emory University, Atlanta, Georgia. micro5.mscc.huji.ac.il/~msninio/WORDS-wth-holes.doc
Ninio, A. 1999. Pathbreaking verbs in syntactic development and the question of prototypical transitivity. Journal of Child Language, 26, 619–653.
Ninio, A. 2006. Language and the Learning Curve: The Acquisition of Syntax. Oxford: Oxford University Press.
Ninio, A. 2008. The past was just a moment ago: Past morphology in the speech of young children and their mothers. Poster presented at the XVIth Biennial International Conference on Infant Studies, Vancouver, Canada, March 27–29.
Nissen, H. W. 1946. Primate Psychology. In Encyclopedia of Psychology, Ph. Harriman (ed.), 546–570. New York: Philosophical Library.
Noh, E. J. 2000. Metarepresentation. A Relevance-theory Approach. Amsterdam: Benjamins.
Nunes, T. & Bryant, P. 1996. Children Doing Mathematics. London: Blackwell.
Núñez, M. & Rivière, A. 1994. Engaño, intenciones y creencias en el desarrollo y evolución de una psicología natural. Estudios de Psicología, 15, 83–128.

Ohala, J. J. 2010. What’s behind the smile? Behavioral and Brain Sciences, 33, 456–457.
Ohms, V. R., Gill, A., Van Heijningen, C. A. A., Beckers, G. J. L. & ten Cate, C. 2010. Zebra finches exhibit speaker-independent phonetic perception of human speech. Proceedings of the Royal Society B, 277, 1003–1009.
Olson, D. R. 1970. Language and Thought. Psychological Review, 77, 257–273.
Olson, D. R. 1997. The written representation of negation. Pragmatics & Cognition, 5, 235–252.
Olson, D. R. 2002. What writing is. Pragmatics & Cognition, 9, 239–258.
Onishi, K. H. & Baillargeon, R. 2005. Do 15-Month-Old Infants Understand False Beliefs? Science, 308, 255–258.
Oppenheim, G. M. & Dell, G. S. 2008. Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106, 528–537.
Oppenheim, G. M., Dell, G. S. & Schwartz, M. F. 2010. The dark side of incremental learning: A model of cumulative semantic interference during lexical access in speech production. Cognition, 114, 227–252.
Origgi, G. 2004. On Quoting and Theory of Mind. http://www.interdisciplines.org/coevolution/papers/4
Osvath, M. 2009. Spontaneous planning for future stone throwing by a male chimpanzee. Current Biology, 19, R190–R191.
Owings, D. H. & Morton, E. S. 1998. Animal Vocal Communication. A New Approach. Cambridge: Cambridge University Press.
Oztop, E. & Arbib, M. A. 2002. Schema design and implementation of the grasp-related mirror-neuron system. Biological Cybernetics, 87, 116–140.
Panfilov, V. Z. 1972 (Russian, 1963). Gramática y lógica: articulación gramatical y lógico-gramatical de la oración simple. Buenos Aires: Paidós.
Papineau, D. 2001. The Evolution of Means-End Reasoning. In Naturalism, Evolution and Mind. Supplement to Philosophy, D. N. Walsh (ed.), 145–178.
Parkinson, J. A., Roberts, A. C., Everitt, B. J. & Di Ciano, P. 2005. Acquisition of instrumental conditioned reinforcement is resistant to the devaluation of the unconditioned stimulus. The Quarterly Journal of Experimental Psychology B: Comparative and Physiological Psychology, 58, 19–30.
Partington, A. S. 2009. A linguistic account of wordplay: The lexical grammar of punning. Journal of Pragmatics, 41, 1794–1809.
Paukner, A., Suomi, S. J., Visalberghi, E. & Ferrari, P. F. 2009. Capuchin Monkeys Display Affiliation Toward Humans Who Imitate Them. Science, 325, 880–883.
Penn, D. C., Holyoak, K. J. & Povinelli, D. J. 2008. Darwin’s mistake: Explaining the discontinuity between human and nonhuman minds. Behavioral and Brain Sciences, 31, 109–130.
Pepperberg, I. M. 2005. An Avian Perspective on Language Evolution: Implications of simultaneous development of vocal and physical object combinations by a Grey parrot (Psittacus erithacus). In Language Origins: Perspectives on Evolution, M. Tallerman (ed.), 239–261. Oxford: Oxford University Press.
Peretz, I. & Hyde, K. L. 2003. What is specific to music processing? Insights from congenital amusia. Trends in Cognitive Sciences, 7, 362–367.
Perner, J. & Ruffman, T. 2005. Infants’ Insight into the Mind: How Deep? Science, 308, 214–216.
Perner, J. 1995. The many faces of belief. Cognition, 57, 241–269.
Perruchet, P. & Pacton, S. 2006. Implicit learning and statistical learning: one phenomenon, two approaches. Trends in Cognitive Sciences, 10, 233–238.

Peskin, J. 1992. Ruse and representations: On children’s ability to conceal information. Developmental Psychology, 28, 84–89.
Pettenati, P., Stefanini, S. & Volterra, V. (in press). Motoric characteristics of representational gestures produced by young children in a naming task. Journal of Child Language.
Piaget, J. & Inhelder, B. 1956. The Child’s Concept of Space. London: Routledge.
Piaget, J. 1945/1959. La formation du symbole chez l’enfant. Neuchâtel: Delachaux & Niestlé.
Piaget, J. 1954/1937. The Construction of Reality in the Child. New York: Ballantine.
Piaget, J. 1962. Comments on Vygotsky’s critical remarks concerning The Language and Thought of the Child, and Judgment and Reasoning in the Child, by Jean Piaget. In Vygotsky, L. S., Thought and Language, E. Hanfmann & G. Vakar (eds. and trans.). Cambridge, Massachusetts: MIT Press.
Pierce, K. A. & Gholson, B. 1994. Surface Similarity and Relational Similarity in the Development of Analogical Problem Solving: Isomorphic and Nonisomorphic Transfer. Developmental Psychology, 30, 724–737.
Pierrehumbert, J. B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Frequency and the Emergence of Linguistic Structure, J. Bybee & P. Hopper (eds.), 137–157. Amsterdam: Benjamins.
Pinker, S. & Bloom, P. 1990. Natural language and natural selection. Behavioral and Brain Sciences, 13, 707–784.
Ploog, D. 2002. Is the neural basis of vocalisation different in non-human primates and Homo sapiens? In The Speciation of Modern Homo Sapiens, T. J. Crow (ed.), 121–135. Oxford: Oxford University Press.
Plooij, F. 1978. Some basic traits of language in wild chimpanzees? In Action, Gesture and Symbol, A. Lock (ed.), 111–132. New York: Academic Press.
Port, R. 2007. How are words stored in memory? Beyond phones and phonemes. New Ideas in Psychology, 25, 143–170.
Povinelli, D. J. & Vonk, J. 2004. We Don’t Need a Microscope to Explore the Chimpanzee’s Mind. Mind and Language, 19, 1–28.
Povinelli, D. J. 2004. Behind the ape’s appearance: Escaping anthropocentrism in the study of other minds. Daedalus, 133, 29–41.
Prather, J. F., Peters, S., Nowicki, S. & Mooney, R. 2008. Precise auditory-vocal mirroring in neurons for learned vocal communication. Nature, 451, 249–250.
Premack, D. 1971. Some general characteristics of a method for teaching language to organisms that do not ordinarily acquire it. In Cognitive Processes of Nonhuman Primates, L. E. Jarrad (ed.), 47–82. New York: Academic Press.
Premack, D. & Premack, A. 2003. Original Intelligence. New York: McGraw-Hill.
Prior, A. N. 1968. Papers on Time and Tense. Oxford: Clarendon Press.
Progovac, L. 2010. Syntax: Its Evolution and Its Representation in the Brain. Biolinguistics, 4, 234–254.
Pustejovsky, J. 1995. The Generative Lexicon. Cambridge, Massachusetts: MIT Press.
Quine, W. V. O. 1960. Word and Object. Cambridge, Massachusetts: MIT Press.
Ray, E. & Heyes, C. 2011. Imitation in infancy: the wealth of the stimulus. Developmental Science, 14, 92–105.
Reboul, A. (in press). Pragmatics, point of view and theory of mind. Intellectica. (Available on the author’s website since 2005.)
Recanati, F. 2000. Oratio Obliqua, Oratio Recta. An Essay on Metarepresentation. Cambridge, Massachusetts: MIT Press.

Reddy, V. 2005. Before the ‘Third Element’: Understanding attention to Self. In Joint Attention: Communication and Other Minds, N. Eilan, Ch. Hoerl, T. McCormack & J. Roessler (eds.), 85–109. Oxford: Oxford University Press.
Reddy, V. 2008. How Infants Know Minds. Cambridge, Massachusetts: Harvard University Press.
Reynolds, P. C. 1983. Ape Construction and Linguistic Structure. In Glossogenetics. The Origin and Evolution of Language, E. Grolier (ed.), 185–200. New York: Harwood Academic Publishers.
Reynolds, P. C. 1993. The complementation theory of language and tool use. In Tools, Language and Cognition in Human Evolution, K. R. Gibson & T. Ingold (eds.), 407–428. Cambridge: Cambridge University Press.
Richtsmeier, P. T., Gerken, L., Goffman, L. & Hogan, T. 2009. Statistical frequency in perception affects children’s lexical production. Cognition, 111, 372–377.
Riggs, K. J. & Simpson, A. 2005. Young children have difficulty ascribing true beliefs. Developmental Science, 8, F27–F30.
Risjord, M. 1996. Meaning, belief and language acquisition. Philosophical Psychology, 9, 465–475.
Rizzi, L. 2009. Some elements of syntactic computations. In Biological Foundations and Origin of Syntax, D. Bickerton & E. Szathmáry (eds.), 63–88. Cambridge, Massachusetts: MIT Press.
Rizzolatti, G., Fadiga, L., Gallese, V. & Fogassi, L. 1996. Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141.
Rizzolatti, G. & Sinigaglia, C. 2010. The functional role of the parieto-frontal mirror circuit: Interpretations and misinterpretations. Nature Reviews Neuroscience, 11, 264–274.
Rochat, P. 2009. Others in Mind: Social Origins of Self-Consciousness. New York: Cambridge University Press.
Rochat, P. & Zahavi, D. (in press). The uncanny mirror: A re-framing of mirror self-experience. Consciousness and Cognition.
Roediger III, H. L. & Butler, A. C. (in press). The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences.
Rohlfing, K. J., Fritsch, J., Wrede, B. & Jungmann, T. 2006. How can multimodal cues from child-directed interaction reduce learning complexity in robots? Advanced Robotics, 20, 1183–1199.
Rönnqvist, L. 2003. Developmentally, the arm preference precedes handedness. Behavioral and Brain Sciences, 26, 238–239.
Rosa-Salva, O., Regolin, L. & Vallortigara, G. (in press). Faces are special for newly hatched chicks: evidence for inborn domain-specific mechanisms underlying spontaneous preferences for face-like stimuli. Developmental Science.
Rowland, C. F. 2007. Explaining errors in children’s questions. Cognition, 104, 106–134.
Rowlands, M. 2009. Situated representation. In Situated Cognition, P. Robbins & M. Aydede (eds.), 117–133. Cambridge/New York: Cambridge University Press.
Russell, J. 2005. Justifying all the fuss about false belief. Trends in Cognitive Sciences, 9, 307–308.
Ryle, G. 2000 (first ed., 1949). The Concept of Mind. New York: Penguin Group.
Sabbagh, M. A. & Baldwin, D. A. 2005. Understanding the role of communicative intentions in word learning. In Joint Attention: Communication and Other Minds, N. Eilan, Ch. Hoerl, T. McCormack & J. Roessler (eds.), 165–184. Oxford: Oxford University Press.
Sacks, H. 1992. Lectures on Conversation. Oxford: Blackwell.
Saffran, J. R., Aslin, R. N. & Newport, E. L. 1996. Statistical learning by 8-month-old infants. Science, 274, 1926–1928.

Saffran, J. R., Johnson, E. K., Aslin, R. N. & Newport, E. L. 1999. Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27–52.
Salazar-Orvig, A., Marcos, H., Morgenstern, A., Hassand, R., Leber-Marin, J. & Parès, J. (in press). Dialogical beginnings of anaphora: The use of third person pronouns before the age of 3. Journal of Pragmatics.
Samuel, A. G. (in press). Speech Perception. Annual Review of Psychology, 62, 49–72.
Sánchez de Zavala, V. 1973. Indagaciones praxiológicas sobre la actividad lingüística. Madrid: Siglo veintiuno.
Sánchez de Zavala, V. 1978. Comunicar y conocer en la actividad lingüística. Barcelona: Fundación Juan March.
Sánchez de Zavala, V. 1997. Hacia la pragmática psicológica. Madrid: Visor.
Sapir, E. 1956. Language (orig., 1933). In Sapir, E., Culture, Language and Personality: Selected Essays, D. G. Mandelbaum (ed.), 1–44. Berkeley: University of California Press.
Sartori, L., Becchio, C., Bara, B. G. & Castiello, U. 2009. Does the intention to communicate affect action kinematics? Consciousness and Cognition, 18, 766–772.
Savage-Rumbaugh, S., Shanker, S. G. & Taylor, T. J. 2001. Apes, Language, and the Human Mind. Oxford: Oxford University Press.
Saxe, G. B. 1991. Culture and Cognitive Development: Studies in Mathematical Understanding. Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Saylor, M. M. 2004. Twelve- and 16-month-old infants recognize properties of mentioned absent things. Developmental Science, 7, 599–611.
Scaife, M. & Bruner, J. S. 1975. The capacity for joint attention in the infant. Nature, 253, 265–266.
Schilbach, L., Wilms, M., Eickhoff, S. B., Romanzetti, S., Tepest, R., Bente, G., Jon Shah, N., Fink, G. R. & Vogeley, K. (in press). Minds Made for Sharing: Initiating Joint Attention Recruits Reward-related Neurocircuitry. Journal of Cognitive Neuroscience.
Schloegl, C., Kotrschal, K. & Bugnyar, T. 2007. Gaze following in common ravens, Corvus corax: ontogeny and habituation. Animal Behaviour, 74, 769–778.
Schögler, B. & Trevarthen, C. 2007. To sing and dance together. From infants to jazz. In On Being Moved: From Mirror Neurons to Empathy, S. Bråten (ed.), 281–302. Amsterdam: Benjamins.
Scholz, B. C. 2002. Reconciling ‘instinct’ with biological reality may require a recasting of evolutionary metaphors. Nature, 415, 739.
Scott, L. S., Pascalis, O. & Nelson, C. A. 2007. A domain-general theory of the development of perceptual discrimination. Current Directions in Psychological Science, 16, 197–201.
Scott-Phillips, T. C. 2010. The evolution of communication: Humans may be exceptional. In Experimental Semiotics: A New Approach for Studying the Emergence and the Evolution of Human Communication, B. Galantucci & S. Garrod (eds.), 78–99. Amsterdam: Benjamins.
Scribner, S. & Cole, M. 1981. The Psychology of Literacy. Cambridge, Massachusetts: Harvard University Press.
Searle, J. 1958. Proper names and descriptions. In The Encyclopedia of Philosophy, P. Edwards (ed.). New York: MacMillan.
Searle, J. 1979. Expression and Meaning: Studies in the Theory of Speech Acts. New York: Cambridge University Press.
Searle, J. 1983. Intentionality: An Essay in the Philosophy of Mind. New York: Cambridge University Press.
Sebanz, N., Bekkering, H. & Knoblich, G. 2006. Joint action: bodies and minds moving together. Trends in Cognitive Sciences, 10, 70–76.

Senghas, A. & Coppola, M. 2001. Children Creating Language: How Nicaraguan Sign Language Acquired a Spatial Grammar. Psychological Science, 12, 323–328.
Senghas, A., Kita, S. & Özyürek, A. 2004. Children creating core properties of language: evidence from an emerging sign language in Nicaragua. Science, 305, 1779–1782.
Seyfarth, R. M. & Cheney, D. L. 2003. Signalers and receivers in animal communication. Annual Review of Psychology, 54, 145–173.
Shankweiler, D. & Studdert-Kennedy, M. 1967. Identification of consonants and vowels presented to left and right ears. Quarterly Journal of Experimental Psychology, 19, 59–63.
Sheets-Johnstone, M. 1999. The Primacy of Movement. Amsterdam: Benjamins.
Siegal, M. & Beattie, K. 1991. Where to look first for children’s knowledge of false beliefs. Cognition, 38, 1–12.
Simon, H. 1957. Models of Man. New York: John Wiley.
Singh, L. 2008. Influences of high and low variability on infant word recognition. Cognition, 106, 833–870.
Skinner, B. F. 1965 (first ed., 1953). Science and Human Behavior. New York: Free Press.
Smith, J. D. 2005. Studies of uncertainty monitoring and metacognition in animals and humans. In The Missing Link in Cognition: Origins of Self-reflective Consciousness, H. Terrace & J. Metcalfe (eds.), 242–271. Oxford: Oxford University Press.
Smith, K. & Wonnacott, E. (in press). Eliminating unpredictable variation through iterated learning. Cognition.
Southgate, V., van Maanen, C. & Csibra, G. 2007. Infant Pointing: Communication to Cooperate or Communication to Learn? Child Development, 78, 735–740.
Southgate, V., Chevallier, C. & Csibra, G. 2010. Seventeen-month-olds appeal to false beliefs to interpret others’ referential communication. Developmental Science, 13, 907–912.
Spaulding, S. 2010. Embodied Cognition and Mindreading. Mind & Language, 25, 119–140.
Sperber, D. & Wilson, D. 1986. Relevance: Communication and Cognition. Oxford: Basil Blackwell.
Sperry, R. W. 1950. Neural basis of spontaneous optokinetic responses produced by visual inversion. Journal of Comparative and Physiological Psychology, 43, 482–489.
Stamatopoulou, D. (in press). Symbol formation and the embodied self: A microgenetic case-study examination of the transition to symbolic communication in scribbling activities from 14 to 31 months of age. New Ideas in Psychology.
Stamenov, M. I. & Gallese, V. (eds.). 2002. Mirror Neurons and the Evolution of Brain and Language. Amsterdam: Benjamins.
Steels, L. 2003. Language re-entrance and the inner voice. Journal of Consciousness Studies, 10, 173–185.
Sternberg, R. J. 1987. Most vocabulary is learned from context. In The Nature of Vocabulary Acquisition, M. G. McKeown & M. E. Curtis (eds.), 89–106. Hillsdale, NJ: Lawrence Erlbaum Associates.
Steyvers, M. & Tenenbaum, J. B. 2005. The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth. Cognitive Science, 29, 41–78.
Strawson, P. F. 1950. On referring. Mind. (Also in Essays in Conceptual Analysis, A. Flew (ed.). London: Macmillan, 1956).
Strother, L., House, K. A. & Obhi, S. S. 2010. Subjective agency and awareness of shared actions. Consciousness and Cognition, 19, 12–20.
Suchow, J. W. & Alvarez, G. A. 2011. Motion Silences Awareness of Visual Change. Current Biology.

Suddendorf, T. & Busby, J. 2003. Mental time travel in animals? Trends in Cognitive Sciences, 7, 391–396.
Suddendorf, T. & Corballis, M. C. 2008. New evidence of animal foresight. Animal Behaviour, 75, e1–e3.
Suddendorf, T. & Whiten, A. 2001. Mental evolution and development: evidence for secondary representation in children, great apes and other animals. Psychological Bulletin, 127, 629–650.
Summerfield, C. & Egner, T. 2009. Expectation (and attention) in visual cognition. Trends in Cognitive Sciences, 13, 403–409.
Sutton, J. 2009. Remembering. In The Cambridge Handbook of Situated Cognition, P. Robbins & M. Aydede (eds.), 217–235. Cambridge/New York: Cambridge University Press.
Sweetser, E. 1991. From Etymology to Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Tallerman, M. 1998. Understanding Syntax. London: Arnold.
Tallerman, M. et al. 2009. What kinds of syntactic phenomena must biologists, neurobiologists, and computer scientists try to explain and replicate? In Biological Foundations and Origin of Syntax, D. Bickerton & E. Szathmáry (eds.), 135–157. Cambridge, Massachusetts: MIT Press.
Tallis, R. 2010. Michelangelo’s Finger: An Exploration of Everyday Transcendence. London: Atlantic Books.
Teufel, C., Gutmann, A., Pirow, R. & Fischer, J. 2010. Facial expressions modulate the ontogenetic trajectory of gaze-following among monkeys. Developmental Science.
Theakston, A. L., Lieven, E. V. M., Pine, J. M. & Rowland, C. F. 2002. Going, going, gone: the acquisition of the verb ‘go’. Journal of Child Language, 29, 783–811.
Thelen, E. & Smith, L. B. 1993. A Dynamic Systems Approach to Development. Cambridge, Massachusetts: MIT Press.
Thomas, N. J. T. 1999. Are theories of imagery theories of imagination? An active perception approach to conscious mental content. Cognitive Science, 23, 207–245.
Thompson, R. K. R., Oden, D. L. & Boysen, S. T. 1997. Language-Naive Chimpanzees (Pan troglodytes) Judge Relations Between Relations in a Conceptual Matching-to-Sample Task. Journal of Experimental Psychology: Animal Behavior Processes, 23, 31–43.
Tomasello, M. 1992. First Verbs: A Case Study of Early Grammatical Development. New York: Cambridge University Press.
Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, Massachusetts: Harvard University Press.
Tomasello, M. 2003. Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, Massachusetts: Harvard University Press.
Tomasello, M. 2008. Origins of Human Communication. Cambridge, Massachusetts: MIT Press.
Tomasello, M., Call, J. & Hare, B. 2003. Chimpanzees understand psychological states – the question is which ones and to what extent. Trends in Cognitive Sciences, 7, 153–156.
Tomasello, M. & Call, J. 2004. The role of humans in the cognitive development of apes revisited. Animal Cognition, 7, 213–215.
Tomasello, M., Carpenter, M., Call, J., Behne, T. & Moll, H. 2006. Understanding and sharing intentions. The origins of cultural cognition. Behavioral and Brain Sciences, 29, 675–691.
Tomasello, M., Hare, B., Lehmann, H. & Call, J. 2007. Reliance on head versus eyes in the gaze following of great apes and human infants: the cooperative eye hypothesis. Journal of Human Evolution, 52, 314–320.

Tomonaga, M. & Imura, T. 2010. Visual Search for Human Gaze Direction by a Chimpanzee (Pan troglodytes). PLoS ONE 5(2): e9131.
Traugott, E. C. 2008. Suggestions from the development of Degree Modifiers in English. In Variation, Selection, Development: Probing the Evolutionary Model of Language Change, R. Eckardt, G. Jäger & T. Veenstra (eds.), 219–252. Berlin: Mouton de Gruyter.
Trevarthen, C. 1998. The concept and foundations of intersubjectivity. In Intersubjective Communication and Emotion in Early Ontogeny, S. Bråten (ed.), 15–46. Cambridge: Cambridge University Press.
Tsakiris, M., Prabhu, G. & Haggard, P. 2006. Having a body vs. moving your body: How agency structures body-ownership. Consciousness and Cognition, 15, 423–432.
Tuller, L., Delage, H., Monjauze, C., Piller, A. G. & Barthez, M. A. (in press). Clitic pronoun production as a measure of atypical language development in French. Lingua.
Udell, M. A. R., Dorey, N. R. & Wynne, C. D. L. 2008. Wolves outperform dogs in following human social cues. Animal Behaviour, 76, 1767–1773.
Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, L., Fadiga, L., Keysers, C. & Rizzolatti, G. 2001. “I know what you are doing”: A neurophysiological study. Neuron, 32, 91–101.
Umiltà, M. A., Escola, L., Intskirveli, I., Grammont, F., Rochat, M., Caruana, F., Jezzini, A., Gallese, V. & Rizzolatti, G. 2008. How pliers become fingers in the monkey motor system. Proceedings of the National Academy of Sciences USA, 105, 2209–2213.
Uriagereka, J. 2001. Review of A. Carstairs-McCarthy, The origins of complex language, 1999. Language, 77, 368–373.
Uriagereka, J. & Piattelli-Palmarini, M. 2004. The Immune Syntax: The Evolution of the Language Virus. In Variation and Universals in Biolinguistics, L. Jenkins (ed.), 341–377. Amsterdam: Elsevier.
Valentine, T., Brennen, T. & Brédart, S. 1996. The Cognitive Psychology of Proper Names. London: Routledge.
van Eijck, J. & de Vries, F. J. 1995. Reasoning about Update Logic. Journal of Philosophical Logic, 24, 19–45.
van Rooijen, J. 2010. Do dogs and bees possess a ‘theory of mind’? Animal Behaviour.
Vea, J. & Sabater-Pi, J. 1998. Spontaneous pointing behaviour in the wild pygmy chimpanzee (Pan paniscus). Folia Primatologica, 69, 289–290.
Veneziano, E. 1999. Early lexical, morphological and syntactic development in French: Some complex relations. The International Journal of Bilingualism, 3, 183–217.
Vespignani, F., Canal, P., Molinaro, N., Fonda, S. & Cacciari, C. 2010. Predictive Mechanisms in Idiom Comprehension. Journal of Cognitive Neuroscience, 22, 1682–1700.
Vihman, M. M. 2002. The role of mirror neurons in the ontogeny of speech. In Mirror Neurons and the Evolution of Brain and Language, M. Stamenov & V. Gallese (eds.), 305–314. Amsterdam: John Benjamins.
Vogt, P. & Lieven, E. 2010. Verifying Theories of Language Acquisition Using Computer Models of Language Evolution. Adaptive Behavior, 18, 21–35.
von Grünau, M. & Anston, C. 1995. The detection of gaze: A stare-in-the-crowd effect. Perception, 24, 1297–1313.
Von Holst, E. 1954. Relations between the central nervous system and the peripheral organs. The British Journal of Animal Behavior, 2, 89–94.
Vosgerau, G. (in press). Memory and content. Consciousness and Cognition.
Voss, J., Ingram, J. N., Haggard, P. & Wolpert, D. M. 2006. Sensorimotor attenuation by central motor command signals in the absence of movement. Nature Neuroscience, 9, 26–27.

Vygotsky, L. S. 1973 (Russian, 1934). Pensamiento y lenguaje (Con comentarios críticos de Jean Piaget). Buenos Aires: La Pléyade.
Warneken, F., Hare, B., Melis, A. P., Hanus, D. & Tomasello, M. 2007. Spontaneous altruism by chimpanzees and young children. PLoS Biology, 5 (7).
Warneken, F. & Tomasello, M. 2009. Varieties of altruism in children and chimpanzees. Trends in Cognitive Sciences, 13, 397–402.
Webb, B. 2004. Small brains and minimalist emulation. Behavioral and Brain Sciences, 27, 421.
Wellman, H. M., Lane, J. D., LaBounty, J. & Olson, S. L. (in press). Observant, nonaggressive temperament predicts theory-of-mind development. Developmental Science.
West, S. A., El Mouden, C. & Gardner, A. (in press). Sixteen common misconceptions about the evolution of cooperation in humans. Evolution and Human Behavior.
Wettstein, H. 1986. Has Semantics Rested on a Mistake? The Journal of Philosophy, 83, 185–209.
Whalen, P. et al. 2004. Human Amygdala Responsivity to Masked Fearful Eye Whites. Science, 306, 2061.
Whiten, A. 2000. Primate culture and social learning. Cognitive Science, 24, 477–508.
Whiten, A. 2005. The second inheritance system of chimpanzees and humans. Nature, 437, 52–55.
Whiten, A., Horner, V., Litchfield, C. A. & Marshall-Pescini, S. 2004. How do apes ape? Learning and Behavior, 32, 36–52.
Wierzbicka, A. 2002. Semantic Primes and Linguistic Typology. In Meaning and Universal Grammar. Theory and empirical findings. Volume 2, C. Goddard & A. Wierzbicka (eds.), 257–300. Amsterdam: Benjamins.
Williams, R. F. 2008. Gesture as a conceptual mapping tool. In Metaphor and Gesture, A. Cienki & C. Müller (eds.), 55–92. Amsterdam: Benjamins.
Wilson, D. 2000. Metarepresentation in linguistic communication. In Metarepresentations: A Multidisciplinary Perspective, D. Sperber (ed.), 411–448. Oxford: Oxford University Press.
Wittgenstein, L. 1963 (orig. 1953). Philosophical Investigations. New York: The Macmillan Company.
Woodward, A. L. 1998. Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.
Wynn, T. 1993. Layers of thinking in tool behavior. In Tools, Language and Cognition in Human Evolution, K. R. Gibson & T. Ingold (eds.), 389–406. Cambridge: Cambridge University Press.
Yip, M. J. 2006. The search for phonology in other species. Trends in Cognitive Sciences, 10, 442–446.
Yule, G. 2006. The Study of Language. Cambridge: Cambridge University Press.
Zedelius, C. M., Veling, H. & Aarts, H. (in press). Boosting or choking – How conscious and unconscious reward processing modulate the active maintenance of goal-relevant information. Consciousness and Cognition.
Zeedyk, M. S. (in press). Essay review. Self-consciousness: Attained by seeing ourselves through the eyes of the Other or by turning to look into those eyes? Cognitive Development.
Zentall, T. R. 2003. Imitation by animals: how do they do it? Current Directions in Psychological Science, 12, 91–95.
Zwickel, J. & Müller, H. J. (in press). Observing fearful faces leads to visuo-spatial perspective taking. Cognition.

Glossary

allocentrism. Unlike egocentrism, which entails reference from ego’s viewpoint, allocentrism entails a more objective reference, shared for example by the speaker and the listener.

direction of fit. According to Searle, a significant dimension of difference among illocutionary acts. Statements have the word-to-world direction of fit, and orders have the world-to-word direction of fit. If the statement is not true, it is the statement which is at fault, not the world; if the order is disobeyed, it is not the order which is at fault, but the world in the person of the disobeyer.

exaptation. The utilization of a structure or feature for a function other than that for which it was developed through natural selection. For example, feathers might have originally arisen in the context of selection for insulation, and only later were they co-opted for flight. In this case, the general form of feathers is an adaptation for insulation and an exaptation for flight.

Gavagai. In order to illustrate his view of the indeterminacy of translation, Quine (1960) used the example of ‘Gavagai’ uttered by a native upon seeing a rabbit. The linguist forms the reasonable hypothesis that ‘Gavagai’ means ‘rabbit’, and, putting aside some difficulties about ‘yes’ and ‘no’ and interrogative intonation, he can ask the native in a series of different situations the simple question ‘Gavagai?’ and see what responses he gets. However, Quine continues (and this is his point): Does ‘Gavagai’ refer to rabbits, or rabbit stages (temporal slices of rabbits), or undetached rabbit parts? Nothing in the native’s behaviour can as yet settle this question. A rabbit stage will cause just the same neural stimulations and hence just the same verbal behaviour as a rabbit.

holophrase. A one-word message. The time during which babies speak in single words is called the holophrastic stage.

homology vs. analogy. Analogy: Similarity of structure between two species in spite of the fact that they are not closely related; attributable to convergent evolution. Homology: Similarity in characteristics resulting from shared ancestry.

innatism. Human language is supported by innate, species-specific abilities: this is accepted without controversy. However, Chomskyan innatism, that is, the postulation of an innate, specifically syntactic knowledge (from ‘deep syntactic structures’ in the early 1960s to recursive embedding in 2002), is a very different issue. There are alternative views on this postulation. Concretely, it has been suggested that human abilities such as ‘theory of mind’ or working memory, on the one hand, and historical grammaticalisation, on the other hand, could support full syntax.

judgements of identity. In them, the copulative verb is flanked by terms that have the same referent or extension. ‘The Moon is Earth’s sole natural satellite’ is a judgement of identity. ‘The Earth is a planet’ is not a judgement of identity. Are judgements of identity symmetrical? This has been a matter of controversy. Nowadays, however, the majority of logicians agree that judgements of identity are as asymmetrical as the other types of predications or judgements.

phonetic vs. phonological differences. Phonetic differences: Any perceptible distinction between one speech sound and another, irrespective of whether in a particular language these sounds are different phonemes or irrelevant variations of one phoneme. Phonological differences: Only those contrasts in sound which make differences of meaning within a particular language.

rubber-hand experiment. In this experiment, the subject sits with eyes fixed on an artificial hand while the experimenter uses two small paintbrushes to stroke the rubber hand and the subject’s hidden hand, synchronising the timing of the brushing as closely as possible. In this way, the subject takes the viewed rubber hand as their own hand. A consequence of this illusion is that the proprioceptively perceived location of the subject’s own hand shifts toward the viewed location of the rubber hand.

simulationism. There are two different theoretical positions as to how people predict and understand other people’s mental states and actions (i.e., two theoretical positions concerning ‘Theory of mind’). One, called theory-theory, proposes that people apply a theory, i.e. they use a body of knowledge about what other subjects feel, think, want, etc. in given circumstances. The other position, called simulationism, is that we imagine ourselves being in the other’s situation. This imaginative identification triggers a series of ‘off-line’ or ‘pretend’ mental states and action tendencies which we then attribute to the simulated person.

theme and rheme. Theme (or topic): What the sentence is about, the point of departure. Rheme (or comment): What is predicated of the theme. Theme was called ‘psychological subject’ by some authors (such as Vygotsky) and distinguished from ‘syntactic subject’. In the Prague School, founded in 1926, this dichotomy was studied mainly in relation to intonation and word-order.

theory of mind. The social-cognitive ability to imagine or simulate others’ minds and emotions (the simulationist version) or to draw inferences from one’s theory of others’ mental states that are different from one’s own (the theory-theory version). Since its beginnings some 30 years ago, this area has grown to be one of the largest and liveliest fields that study exclusively human capacities. However, the issue was introduced by the question ‘Does the chimpanzee have a theory of mind?’ in the 1978 issue of Behavioral and Brain Sciences. Some commentaries suggested an experimental method to find out whether an animal possessed the concept of belief. In the early 1980s, this method was already being used to test young children’s understanding of false belief. This was the birth of the field of study called ‘Theory of mind’. Its expansion was very rapid, and nowadays it is an indisputably ‘progressive research programme’. In addition, the ground covered by it has been extended beyond the understanding of false beliefs. More and more researchers are focussing on the so-called 11-month revolution and also on cooperative behaviours.

Author index
A
Abbot-Smith, K. 258
Adair, J. C. 28
Ahluwalia, J. 50
Ahonen, A. 100
Aitchison, J. 303
Albrecht, K. 75
Allen, A. 306
Allport, F. H. 79
Alston, W. P. 186, 263
Alvarez, G. A. 281
Ambrose, S. H. 79
Amsterlaw, J. 294
Anderson, M. L. 129–130, 233
Anderson, R. C. 257
Andrews, M. 260, 263
Anisfeld, M. 19
Anston, C. 52
Apperly, I. A. 216
Arbib, M. 15, 91, 130, 138, 156
Armstrong, D. F. 141
Asendorpf, J. B. 87
Ashburn, L. A. 124
Aslin, R. N. 98
Atkinson, J. 138
Aydelott, J. 100
B
Baillargeon, R. 179
Baldwin, D. A. 77, 124
Balleine, B. W. 109
Bar, M. 22
Bara, B. G. 125
Barclay, J. R. 260
Bard, K. A. 17, 38, 66–67
Bargh, J. A. 108
Barillas, Y. 62
Baron-Cohen, S. 49–50
Barone, P. 172
Barresi, J. 78, 281
Barrett, A. M. 28, 138
Barsalou, L. W. 232, 268
Barthez, M. A. 153
Bates, E. 49, 62
Batki, A. 50

Baudonniere, P. M. 87
Bauer, P. J. 130
Bauernschmidt, A. 267
Baumgaertner, A. 152
Bays, P. M. 22
Beattie, K. 176
Becchio, C. 125
Bechtel, W. 21
Beckers, G. J. L. 97
Behne, T. 44, 73, 126
Behrens, H. 258
Bejarano, T. 53, 76, 114, 170, 181, 223, 225, 227, 250, 264, 275–276, 289, 328, 333, 357
Bekkering, H. 79, 122, 141
Belletier, C. 53
Bellugi, U. 138
Bem, D. J. 306
Benson, J. 71
Bente, G. 72
Benveniste, E. 211, 250, 339, 344, 348–349
Bermudez, J. L. 276
Bickerton, D. 51, 64, 73, 105, 165, 194, 212, 213, 245–246
Bigelow, A. E. 62
Bird, G. 18, 39
Bjorklund, D. F. 84
Blagrove, M. 145
Blakemore, S. J. 22–23, 145
Bloch, M. 353
Bloom, L. 290
Bloom, P. 213
Blumberg, M. S. 21
Bock, K. 268
Boesch, C. 63, 81
Bogin, B. 122–123
Bolhuis, J. J. 99
Bompas, A. 22
Born, J. 277
Boucher, J. 155
Bouissac, P. 59
Bowden, E. M. 267
Boyer, P. 75
Boysen, S. T. 305

Brand, M. 124
Brand, R. J. 124
Bransford, J. D. 260
Brass, M. 15, 26, 33, 234
Braten, S. 56, 121, 359
Bredart, S. 265
Breheny, R. 206
Brennan, S. E. 260
Brennen, T. 265
Brooks, P. 258
Brown, M. 70
Bruner, J. 49, 170, 289
Brunye, T. T. 356
Bryant, P. 293
Bugnyar, T. 34, 40, 43, 51
Buhler, K. 241–242, 317, 328, 339
Bulhof, J. 264
Burkart, J. M. 63
Busby, J. 288
Butler, A. C. 130
Butterworth, G. 49
Byrne, R. W. 34, 38, 141–142
C
Cacciari, C. 266
Caggiano, V. 30
Call, J. 39–41, 43–44, 51–52, 57–58, 60, 64, 71, 73, 77, 82, 126, 292, 323
Calvin, W. H. 245–246
Camaioni, L. 49, 62
Canal, P. 266
Capirci, O. 156
Carey, P. 311
Carlson, T. A. 29
Carpendale, A. B. 60
Carpendale, J. I. M. 60, 355
Carpenter, M. 44, 73, 111, 126
Carruthers, G. 23
Carruthers, P. 293
Carstairs-McCarthy, A. 223
Caruana, F. 19, 32
Carver, C. S. 22
Casielles, E. 244, 300

Casile, A. 30
Castiello, U. 125
Catchpole, C. K. 193
Catmur, C. 16
Chalmers, D. J. 232
Chaminade, T. 26
Chang, B. 94
Chater, N. 309
Chen, T. H. 100
Cheney, D. L. 67, 187, 291
Cheour, M. 100
Cherington, S. M. 300
Chevallier, C. 184
Chien, S. 321
Chomsky, N. 212
Christiansen, M. H. 260, 309
Christie, J. 281
Chun, M. M. 153
Churchland, P. M. 79, 99
Clancey, W. J. 107
Clark, A. 132, 232, 269, 305
Clayton, N. S. 43
Clements, W. A. 178–179
Cochet, H. 143
Cohen, F. S. 317
Cole, M. 357
Collier-Baker, E. 88
Colunga, E. 321
Connellan, J. 50
Conty, L. 53
Conway, Ch. M. 267
Cooke, D. 23
Coolidge, F. L. 152
Cooper, F. 97
Coppola, M. 165
Corballis, M. C. 138, 288
Corina, D. P. 138
Cosentino, E. 288
Costantini, M. 30
Croft, K. 45
Csanyi, V. 40
Csibra, G. 27, 33, 50, 59, 62, 166, 171, 184, 334
Custance, D. M. 38
Cutler, A. 98
D
d’Errico, F. 294
Dąbrowska, E. 258
Damasio, A. R. 284, 311
Dancause, N. 29
Daprati, E. 30
Darwin, C. R. 60, 70, 328
Dascal, M. 163, 234

Davidson, D. 251, 274–276, 278, 313
Davis, B. L. 308
Davis, J. M. 88
de Boer, B. 194–195
de Gardelle, V. 161
de Vignemont, F. 184
de Villiers, J. 337
de Villiers, P. 337
de Vries, F. J. 227
de Waal, F. B. M. 34, 63
Deacon, T. 111
DeBoer, T. 130
Decety, J. 26, 100, 117
Delage, H. 153
Delgado, B. 61, 70
Dell, G. S. 153, 259
Dennett, D. 281
Dessalles, J.-L. 155
Deutscher, G. 308
Devlin, J. T. 100
Di Ciano, P. 109
Diamond, Adele 280
Diamond, Arthur S. 308–309
Dickinson, A. 109
Diehl, R. L. 97
Diekelmann, S. 277
Dimitriou, M. 23, 127
Ditman, T. 356
Donald, M. 130, 305
Donnellan, K. S. 320
Dorey, N. R. 71
Dowd, D. 138
Dretske, F. 45
Dumais, S. T. 260
Dunbar, R. 206
Dupoux, E. 98, 161
Dziurawiec, S. 50
E
Eco, U. 264
Edin, B. B. 23, 127
Efferson, C. 63
Egner, T. 22
Eickhoff, S. B. 72
El Mouden, C. 155
Ellis, B. J. 84
Ellis, H. 50
Ellis, R. 23
Emery, N. J. 43
Escola, L. 19, 32
Evans, J. St. B. T. 152
Everett, B. A. 45
Everett, D. L. 211

Everitt, B. J. 109
F
Fadiga, L. 13, 23, 32
Falk, D. 308
Farroni, T. 50
Feher, O. 197
Fehr, E. 63
Feinberg, T. E. 138, 276
Feldman, A. G. 23
Fernald, A. 85, 326
Ferrari, P. F. 17, 19, 21, 32, 34, 50–51
Figueredo, A. J. 306
Fink, G. R. 72
Fischer, J. 40, 68
Fitch, W. T. 34, 43, 51, 208–209, 211–212
Fitzsimons, G. M. 108
Flaherty, M. 165
Flanagan, J. R. 22
Flanagan, O. 75
Flavell, E. R. 45, 282
Flavell, J. H. 45, 282
Fleck, J. 267
Fodor, J. A. 131, 237
Fogassi, L. 13, 19, 23, 30–32, 51, 93
Foltz, P. W. 260
Fonda, S. 266
Forrester, M. A. 300
Foundas, A. L. 138
Fowler, C. A. 97
Foxton, J. M. 172
Franck, N. 30
Franks, J. J. 260
Fraser, H. 98
Frege, G. 180, 217, 219–227
Frey, S. H. 138, 143
Fries, P. 71
Frith, C. D. 22–23
Frith, U. 184
Fritsch, J. 124
Funnell, M. G. 138, 143
Fuster, J. 45, 260
G
Galantucci, B. 97, 170
Gallese, V. 13, 19, 23, 26, 31–32, 93
Garcia Calvo, A. 328
Gardner, A. 155
Gattis, M. 141
Gazzaniga, M. S. 138, 143

Gentilucci, M. 138
George, N. 53
Georgieff, N. 26
Gergely, G. 15, 59, 122
Gerken, L. 142
Gerry, V. E. 138, 143
Gervain, J. 163, 246
Gholson, B. 267
Gill, A. 97
Gillette, J. 311
Gimbel, S. 264
Gimmig, D. 53
Ginsburg, S. 276
Givon, T. 237
Gleitman, L. 193, 257, 311
Glenberg, A. M. 26, 130
Gliga, T. 59
Gobes, S. M. 99
Goffman, L. 142
Goldberg, A. 266
Goldie, P. 306
Goldinger, S. D. 259
Goldin-Meadow, S. 165–166
Goldman, H. I. 308
Golomb, J. D. 153
Gomez, J. C. 44, 61, 65, 70
Goodglass, H. 311
Goody, J. 356
Gopnik, A. 289
Grafton, S. T. 24
Grammont, F. 19, 32
Grassmann, S. 140
Graziano, M. 23
Greaves, W. 71
Green, F. 282
Greenfield, P. M. 193
Grezes, J. 100, 117
Gribble, P. L. 100
Grice, H. P. 68, 206
Grush, R. 23
Guillaume, G. 153
Gutmann, A. 40, 68
H
Haggard, P. 18, 22, 29–30, 126
Haight, J. C. 130
Hamilton, A. F. 24
Hamilton, W. D. 14
Hanus, D. 63
Hard, B. M. 103
Hare, B. 39–41, 43–44, 51–52, 57–58, 60, 63–65, 67, 70, 77
Harris, J. R. 306
Harris, P. 190, 205

Hashiya, K. 59
Hassand, R. 348
Haugeland, J. 232
Hauser, M. 212
Heilman, K. M. 28, 138
Heine, B. 303, 307
Heinrich, B. 43, 51
Henshilwood, C. 294
Herman, P. A. 257
Herr-Israel, E. 210
Heyes, C. 16–18, 26–27, 33, 39, 43
Hickok, G. 15, 33, 100
Hierro Sanchez-Pescador, J. 76
Hietanen, J. K. 28
Hockett, C. F. 104, 240
Hogan, T. 142
Hogendoorn, H. 29
Holekamp, K. E. 15
Hollis, K. L. 107
Holloway, R. L. 59
Holmes, N. P. 30
Holt, L. L. 100
Holyoak, K. J. 42, 64
Hopkins, W. D. 57, 66–67, 173
Hopper, P. K. 312, 346
Horn, L. R. 248
Horner, V. 34, 38
House, K. A. 314
Houston-Price, C. 190
Huang, S. S. 267
Huang, Y. T. 153
Huber, L. 34, 43, 51
Huguet, P. 53
Humphrey, N. 15
Hurford, J. 19, 59, 91, 223, 227, 261, 293, 303
Hurley, S. L. 15, 19, 26
Hurtado, N. 326
Hutto, D. 132
Hyde, K. L. 138
I
Ignacio, N. 64
Imada, T. 100
Imura, T. 50
Ingram, J. N. 22
Inhelder, B. 53, 269
Intskirveli, I. 19, 32
Ionica, C. 17, 50–51
Israel, M. 258
Iverson, J. M. 139
Iwamoto, K. 71

J
Jablonka, E. 276
Jackendoff, R. 261, 266, 300, 303–304, 308
Jakobson, R. 308, 317, 344
Janda, L. A. 260
Janet, P. 94, 230
Jarrett, N. 49
Jarrold, Ch. 155
Jarvik, M. E. 279–281
Jeannerod, M. 26, 30
Jellema, T. 32, 41
Jenkins, J. R. 257
Jerison, H. 205
Jespersen, O. 223
Jezzini, A. 19, 32
Johnson, C. 258
Johnson, H. 18
Johnson, M. H. 50
Johnson-Laird, P. N. 352
Jolly, A. 15
Jon Shah, N. 72
Jones, D. 353
Jones, K. 311
Jones, S. S. 19
Jordan, M. I. 23
Jung-Beeman, M. 267
Jungmann, T. 124
K
Kaminski, J. 40, 58
Kammers, M. P. M. 29
Kano, F. 58
Kaplan, D. 357
Kappes, J. 152
Karmiloff-Smith, A. 269
Keysers, C. 15–20, 24, 26–27, 31–32, 35, 93
Kharitonova, M. 321
Killeen, P. R. 97
King, J. E. 306
Kintsch, W. 260
Kiraly, I. 122
Kita, S. 165
Klein, B. 311
Klein, J. T. 51
Kloo, D. 267
Kluender, K. R. 97
Knight, C. 205
Knoblich, G. 79
Kobayashi, H. 49, 59
Koenig, M. A. 205
Kohler, E. 19, 31–32, 93
Koschmann, T. 214

Koshima, S. 49, 59
Kotrschal, K. 40
Kouider, S. 161
Kounios, J. 267
Kraskov, A. 29
Krifka, M. 275, 303
Kuhl, P. K. 97–98, 100
Kuhn, S. 234
Kuteva, T. 303, 307
L
La Freniere, P. 248
LaBounty, J. 64
Lacohee, H. 176, 178, 273–274, 291
Laham, D. 260
Laibson, D. I. 75
Lakoff, G. 227, 231, 264
Landau, B. 257
Landauer, T. K. 260, 264
Landis, T. 92
Lane, J. D. 64
Lane, S. M. 267
Langacker, R. W. 175, 224, 259, 268, 352
Langer, S. K. 111
Latash, M. L. 23
Leary, M. R. 60
Leavens, D. A. 66–67, 173
Leber-Marin, J. 348
Legerstee, M. 62
Lehmann, H. 58, 60
Leighton, J. 39
Lemon, R. N. 29
Leslie, A. M. 114, 179–181
Levelt, W. J. M. 339
Levinson, S. C. 339
Lewis, M. 60
Liberman, A. 97, 99, 142
Liebenberg, L. 293
Lieberman, P. 152
Lieven, E. 252, 258, 289
Liszkowski, U. 62, 111
Litchfield, C. A. 34, 38
Locke, J. L. 122–123
Longa, V. 60
Lorenz, K. Z. 22, 87, 106–107, 114–115
Lorenzo, G. 60
Lotman, Y. 266
Lotto, A. J. 100
Lozano, S. C. 103
Lucy, J. A. 357
Luka, B. 268
Lukowski, A. F. 130

Luria, A. R. 126, 153, 167, 265, 267, 286, 294, 322, 328, 357
M
MacLean, K. 62
MacNeilage, P. F. 308
Mahoney, C. R. 356
Malle, B. F. 237
Marcos, H. 348
Markowitsch, H. J. 288, 291
Marler, P. 96, 98–99
Marshall, J. 138
Marshall-Pescini, S. 34, 38
Masangkay, Z. S. 45
Massaro, D. W. 100
Matsuzawa, T. 50
Mattar, A. A. 100
Mattingly, I. G. 97
Maurer, D. 92
McCarrell, N. S. 260
McCluskey, K. A. 45
McCune, L. 121, 169, 210, 290
McIntyre, C. W. 45
Mead, G. H. 355
Meguerditchian, A. 57, 173
Mehler, J. 98, 163, 246
Melis, A. P. 63
Meltzoff, A. N. 18–19, 51, 122, 289
Metcalfe, J. 75
Metzing, C. 260
Meyer, M. 317
Miall, R. C. 127
Miklosi, A. 40, 155
Miller, J. D. 97
Mintz, T. H. 260
Mischel, W. 75
Mitchell, P. 176, 178, 273–274, 291
Mitchell, R. W. 56
Mitra, P. P. 197
Moeschler, J. 292–293
Molinaro, N. 266
Moll, H. 44, 73, 126
Molnar, P. 19
Monaghan, P. 260
Monjauze, C. 153
Mooney, R. 95, 99
Moore, C. 60, 71, 78
Moore, M. K. 18–19, 51
Moore, R. 171
Moore, T. 23
Morgenstern, A. 348
Morton, E. S. 155
Morton, J. 50

Muller, H. J. 73
Munakata, Y. 321
Myin, E. 105, 232
Myowa-Yamakoshi, M. 50
N
Nadel, J. 87
Nagel, T. 79, 355
Nagy, E. 19
Nagy, W. E. 257
Namy, L. L. 110, 147
Navon, D. 55
Nelson, C. A. 98, 130
Nelson, K. 258
Newmeyer, F. J. 342
Newport, E. L. 98
Nielsen, M. 88
Ninio, A. 167, 193, 232, 259, 310
Nissen, H. W. 73–74
Nitsch, K. 260
Noh, E. J. 318
Nowicki, S. 95, 99
Nunes, T. 293
Nuñez, M. 248
O
O’Regan, J. K. 22
O’Grady-Batch, L. 138
O’Regan, K. 105, 232
Oates, T. 233
Obhi, S. S. 314
Oden, D. L. 305
Ohala, J. J. 333
Ohms, V. R. 97
Olson, D. R. 222, 298, 356
Olson, S. L. 64
Onishi, K. H. 179
Oppenheim, G. M. 153, 259
Origgi, G. 209
Osvath, M. 108
Owings, D. H. 155
Oztop, O. E. 15
Ozyurek, A. 165
P
Pacton, S. 259
Panfilov, V. Z. 240
Papineau, D. 109
Pares, J. 348
Parkinson, J. A. 109
Partington, A. S. 260
Pascalis, O. 98
Paukner, A. 17, 19, 21, 50–51
Penn, D. C. 42, 64

Pepperberg, I. M. 51
Peretz, I. 138
Perner, J. 178–180, 267, 278
Perrett, D. I. 15–20, 24, 27–28, 32, 35, 41
Perruchet, P. 259
Peschke, C. 152
Peskin, J. 248
Peters, S. 95, 99
Pettenati, P. 137
Piaget, J. 53, 98–99, 113–115, 126, 131, 181, 233, 286, 292, 323, 328, 343
Piatelli-Palmarini, M. 213
Pierce, K. A. 267
Pierrehumbert, J. B. 259
Piller, A. G. 153
Pine, J. M. 289
Pinker, S. 213
Pirow, R. 40, 68
Pisoni, D. B. 267
Platt, M. L. 51
Ploog, D. 94, 163, 173, 192
Plooij, F. 65
Plunkett, K. 190
Plyusnina, I. 64
Poizner, H. 138
Polgardi, R. 40
Port, R. 152
Povinelli, D. J. 39, 41–43, 51, 58, 64, 292
Prabhu, G. 29
Prather, J. F. 95, 99
Premack, A. 183, 193, 251
Premack, D. 183, 193, 251, 279–281
Proctor, J. 62
Progovac, L. 170, 244, 300
Pustejovsky, J. 266
Q
Quallo, M. M. 29
R
Racine, T. P. 355
Ray, E. 18
Reboul, A. 216, 324
Recanati, F. 276
Reddy, V. 45, 50, 69, 184
Regolin, L. 50
Reynolds, P. C. 79, 83
Richtsmeier, P. T. 142
Riedel, J. 40

Riggs, K. J. 278
Risjord, M. 206
Riviere, A. 248
Riviere, L. 172
Rizzi, L. 213
Rizzolatti, G. 13, 19, 23, 26, 30–33, 35, 91, 93
Roberts, A. C. 109
Robinson, E. J. 216
Rochat, M. 19, 32
Rochat, P. 57, 60
Roediger III, H. L. 130
Rohlfing, K. J. 124
Romanzetti, S. 72
Ronnqvist, L. 245
Rosa-Salva, O. 50
Rowland, C. F. 258, 289
Rowlands, M. 222
Rozzi, S. 32
Ruffman, T. 179–180
Ruggiero, A. 19, 51
Rumelhart, D. 23
Russell, J. 278
Russon, A. E. 34, 38, 141–142
Ryle, G. 181, 310
S
Saar, S. 197
Sabater-Pi, J. 70
Sabbagh, M. A. 77
Sacks, H. 214
Sackur, J. 161
Saffran, J. R. 98
Salazar-Orvig, A. 348
Salomo, D. 258
Samuel, A. G. 152
Sanchez de Zavala, V. 113, 280, 328
Sapir, E. 175, 224
Sarria, E. 61, 70
Sartori, L. 125
Savage-Rumbaugh, S. 71, 81
Saxe, G. B. 293
Saylor, M. M. 111
Scaife, M. 49
Schafer, M. 111
Schepina, O. 64
Schilbach, L. 72
Schloegl, C. 40
Schmitt, R. M. 15
Schogler, B. 245
Scholz, B. C. 106
Schooler, J. W. 267
Schwartz, M. F. 259

Scott, L. S. 98
Scott-Phillips, T. C. 92
Searle, J. 132, 222, 229–230
Sebanz, N. 79
Senghas, A. 165
Seyfarth, R. M. 67, 187, 291
Shallcross, W. L. 124
Shanker, S. G. 71, 81
Shankweiler, D. 97
Sheets-Johnstone, M. 139
Shepherd, Samantha 29
Shepherd, Stephen V. 51
Siegal, M. 176
Simion, F. 50
Simon, H. 150
Simpson, A. 278
Sims-Knight, J. 45
Singh, L. 142, 175
Sinigaglia, C. 33, 35
Skinner, B. F. 17
Slater, P. J. B. 193
Smith, J. D. 323
Smith, J. H. 193
Smith, K. 144
Smith, L. B. 21, 115
Smith, P. 155
Smulovitch, E. 138
Snedeker, J. 153
Solovyev, V. D. 260
Southgate, V. 62, 166, 184, 334
Spaulding, S. 77
Speares, J. 258
Spence, Ch. 30
Spengler, S. 15
Sperber, D. 318, 334, 338
Sperry, R. W. 22
Stamatopoulou, D. 146
Stamenov, M. I. 13
Staniloiu, A. 288, 291
Steels, L. 297
Stefanini, S. 137
Stein, M. L. 257
Stepika, A. 64
Sternberg, R. J. 257
Steyvers, M. 260
Stowe, M. 43
Strawson, P. F. 226
Strother, L. 314
Studdert-Kennedy, M. 97
Suchow, J. W. 281
Suddendorf, T. 80, 88, 288
Summerfield, C. 22

Suomi, S. J. 17, 19–21, 50–51
Sutter, M. 75
Sutton, J. 339
Sweetser, E. 215

T
Taglialatela, J. 71, 173
Tallerman, M. 312–313, 341
Tallis, T. 171
Tanaka, M. 50
Taulu, S. 100
Taylor, C. 23
Taylor, H. A. 356
Taylor, T. J. 71, 81
Tchernichovski, O. 197
ten Cate, C. 97
Tenenbaum, J. B. 260
Tepest, R. 72
Teufel, C. 40, 68
Thacker, A. 138
Thayer, B. R. J. 145
Theakston, A. L. 289
Thelen, E. 21, 115
Thier, P. 30
Thomas, N. J. T. 117–118, 232
Thompson, R. K. R. 305
Thornton, C. 269
Tomasello, M. 39–41, 43–44, 50–52, 57–58, 60, 63–64, 67–68, 70–71, 73, 77, 82, 111, 126, 140, 169, 173, 197, 258–259, 290, 303, 306
Tomonaga, M. 50, 58
Topal, J. 40, 155
Tranel, D. 311
Traugott, E. C. 303
Trevarthen, C. 17, 49, 51, 245
Trut, L. 64
Tsakiris, M. 29
Tucker, M. 23
Tuller, L. 153
Turk-Browne, N. B. 153
Turvey, M. T. 97
Tversky, B. 103
U
Udell, M. A. R. 71
Umilta, M. A. 19, 31–32, 93
Uriagereka, J. 213, 223, 305

V
Valentine, T. 265
Vallortigara, G. 50
van Eijck, J. 227
Van Heijningen, C. A. A. 97
van Maanen, C. 62, 166, 334
van Rooijen, J. 41
van Schaik, C. P. 63
Vauclair, J. 57, 143, 173
Vaughn, B. E. 45
Vea, J. 70
Veneziano, E. 210
Vermeulen, N. 94
Verstraten, F. A. J. 29
Vespignani, F. 266
Vigliocco, G. 260, 263
Vihman, M. M. 99
Vinson, D. 263
Visalberghi, E. 19, 21, 51
Vogeley, K. 72
Vogt, P. 258
Volterra, V. 49, 62, 137, 156
Volz, K. G. 75
von Cramon, D. Y. 75
von Grunau, M. 52
Von Holst, E. 22
Vonk, J. 41
Vosgerau, G. 132
Voss, J. 22
Vygotsky, L. S. 61–62, 70–71, 120, 163, 210, 228, 233–235, 239, 244, 257, 283–284, 286, 290, 299, 306, 322, 328, 348, 357
W
Wagner, U. 277
Walsh, V. 16
Wang, H. 197
Warkentin, V. 87

Warneken, F. 63
Webb, B. 23
Wellman, H. M. 64, 294
West, S. A. 155
Wettstein, H. 222
Whalen, D. H. 97, 142
Whalen, P. 73
Wheelwright, S. 50
Whiten, A. 34, 38, 79–80, 126
Wiebe, S. A. 130
Wierzbicka, A. 237
Wilhelm, I. 277
Williams, R. F. 170
Williamson, C. 70
Wilms, M. 72
Wilson, D. 318, 334, 338
Wittgenstein, L. 261
Wohlschlager, A. 141
Woll, B. 138
Wolpert, D. M. 22, 126–127
Wonnacott, E. 144
Woodward, A. L. 24
Wrangham, R. 64
Wrede, B. 124
Wynn, T. 79, 152
Wynne, C. D. L. 71
Wysocki, K. 257
Y
Yamaguchi, M. K. 50
Yip, M. J. 163
Yule, G. 258
Z
Zahavi, D. 57
Zandbergen, M. A. 99
Zedelius, C. M., Veling, H. & Aarts, H. 108
Zeedyk, M. S. 60
Zentall, T. R. 19
Zhang, Y. 100
Ziegler, W. 152
Zuidema, W. 194–195
Zwickel, J. 73

Subject index
A
Allocentrism: 339, 341, 343–346
‘Altercentrism’: 359
Altruism (See also Cooperation): 62–63, 66; kin _: 83, 155; reciprocal _: 83, 155
Anaphora: 154, 308, 313
Arbitrariness: 140, 143–146, 150, 156, 173, 189–190
Attributive description: 320–321
Autism: 49, 154
Axis (right/left _): 55–57, 80, 120
B
Bees: 211
Bird song learning: 99; quotation in _: 211; directional cultural ratchet in _: 197; no clear addressee’s role in _: 96; mirroring and _: 95
Bleaching: _ and protodeclarative: 175; _ and the subject term: 223; _ and linguistic progress: 263; _ and echo interrogation: 328
C
Cataphora: 313
Causality: 292, 311
Cleft-sentences: 241–242, 357
Communicative intention (understanding of utterers’ _): 69, 73–74
Compositionality: _ and perception: 161, 252; genesis of _: 164; beliefs of the hearer and compositionality: 207; _ and mentalese: 237; _ and technical abilities: 78
Consciousness: 108, 161, 236, 261; animal _: 275, 281; stream of _: 357
Cooperation: _ and pointing: 63–64, 67–68, 73; _ and individual selection: 154; emotional-cooperative function: 312
Creative problem-solving: 234, 261, 267, 269, 320, 348
Creoles: 312
D
Deception: See Lying
Distality: 103, 275
Donkey-sentences: 348
E
Expectation: 21–22, 107–110, 115, 126, 131–132
Extended (or external) mind (See also external memory): 305; _ and prostheses: 132, 306
F
Forward models: 23, 27, 127
Fossils: linguistic _: 300, 304
G
‘gone’: 289–290, 295
Gossip: 206
H
Helen Keller’s spelling: 146
Hemisphere (hemispheric specialisation): _ and inversion of the right/left axis: 56; _ and movements adapted to the model: 120, 141; _ and pantomimes: 138; _ and chimpanzees’ gestural communication: 173; _ and intonation: 174, 245
Holophrase: syntax and child’s _: 165; protodeclarative _ and linguistic learning: 166; conative _: 307; choral _: 192, 329; Holophrastic Era: 168–169, 176, 183, 205, 224, 252, 270, 286, 297, 305, 309
I
Imitation of movements: development of _: 16; neonatal imitation: 17; useful and useless imitation: 38; _ and culture: 197
Implication (versus explicitation): 186, 326, 350–355
Innatism (innate syntax): 131, 163, 177–181, 252, 305
Inner speech: 100, 118, 145–148, 151–153, 195, 233–235, 283–289, 299, 357
Insects (See also bees): 275
Intonation: 94, 138, 141–142, 148, 172–176, 191, 200–201, 224, 240–246, 257–258, 300, 307, 327, 333–334, 357; _ in lullabies and motherese: 85
L
Lying: 230, 247–248
M
Meno paradox: 107, 243, 320
Mental time travel: 75, 288, 291
‘Mentalese’: 131, 176, 237, 245, 275
‘Motionese’: 124, 170
Mutualism: See Cooperation
N
Negative predication: 221–222
Nicaraguan deaf people: 165, 169
‘No’: 210–211, 214–216, 318
Numerical ability (number): 213–214, 293–295, 354
P
Pantomime: 114–115, 124, 137–138, 143, 146, 173; _ and communication with foreigners or deaf people: 156
Pedagogy: 171; catch a baby’s attention: 85, 193; _ and speech episodes made by adults to children: 89, 322; _ and motionese: 124
Phatic communicative function: 205, 206
Phoneme (phonemic): 97–98, 141, 146, 150–154, 168
Pidgin: 165, 169, 304
Polylith: 78
Pregrammatical syntax: 161, 213, 214, 300, 321
Primatocentrism: 20, 51
Q
Quotation: 208–214, 299, 341, 347
R
Ratchet effect: 197
Recursivity: 211–214, 341–342
Redescription: 269
Relevance (Gricean _): 338
S
Schizophrenia: See Voices
Second-person: 183–184, 186, 214, 249–250, 274, 341, 343–345, 347, 356
Self-conscious emotions: 60, 69, 72, 83, 130
Self-perceptible (self-visible or self-audible) movements: 14–16, 33, 38, 93, 245; _ and some (incorrectly called) ‘opaque actions’: 18; _ and information about their proper fulfilment: 24; _ and linguistic codes: 145–146, 172
Self-regulation of attention: 75, 131, 234
Sign Language: 145–146
Subitising: 214, 293, 354
Subordination: 338–339
Syncategorematic: 261–262, 313, 342
Synsemantic: 241–242, 339, 350, 356–357
T
Tense: 101–102, 273, 276–278, 341; mental time travel and _: 288; absolute and relative _: 343
Throwing: 245
Tracks: 291–293, 295, 299, 355
Tropism: plant _: 275
V
Verb ‘to say’: 250, 278, 322, 337–338, 343
Verb ‘to believe’: 250–251
Vervets: 187, 291
Voices (schizophrenia): 145, 151
Vygotskian Principle: 61, 120, 163, 210, 234, 244, 290, 348
W
Working memory: 152–153
Writing: 222, 228–229, 305, 356–358