Interdisciplinary Anthropology: Continuing Evolution of Man


Author: Wolfgang Welsch | Wolf J. Singer | André Wunder


Interdisciplinary Anthropology
Continuing Evolution of Man

Wolfgang Welsch, Wolf Singer, André Wunder
Editors

Editors

Prof. Dr. Wolfgang Welsch, Universität Jena, Institut für Philosophie, Zwätzengasse 9, 07743 Jena, Germany, [email protected]

Prof. Dr. Wolf Singer, MPI für Hirnforschung, Abt. Neurophysiologie, Deutschordenstr. 46, 60528 Frankfurt/Main, Germany, [email protected]

André Wunder, M.A., Universität Zürich, Philosophisches Seminar, Zürichbergstrasse 43, CH-8044 Zürich, [email protected]

ISBN 978-3-642-11667-4
e-ISBN 978-3-642-11668-1
DOI 10.1007/978-3-642-11668-1

Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2011924896

© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover photo: Michael K. Nichols/getty images
Cover design: deblik, Berlin

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

This volume is the result of a research project entitled “Evolutionary Continuity – Human Specifics – The Possibility of Objective Knowledge” that was carried out by representatives of six academic disciplines (evolutionary biology, evolutionary anthropology, brain research, cognitive neuroscience, cognitive psychology, and philosophy) over a period of three-and-a-half years starting July 1, 2006, and ending December 31, 2009. The starting point for the project was the newly emerging riddle of human uniqueness. Formerly, people believed it possible to determine which features distinguish humans from other animals. Rationality, i.e., the possession of mind, reason, language, logical thinking, etc., was thought to be the unique characteristic of human beings. This is precisely what the old definition of the human as animal rationale suggested: only human beings possess rationality and this sets them apart from all other creatures. But the results of scientific research fundamentally questioned this view in recent decades. With regard to the dimensions of rationality (possession of concepts, arithmetic, reasoning, etc.), it was found that they not only exist in us humans, but that at least early forms can be found in our close and distant relatives in the animal world. Not a single element of rationality is really exclusive to humans. For example, all mammals are capable of elementary categorizations; pigeons are experts in abstraction and generalization; chimpanzees and bonobos do not only understand causal relationships in the physical world but are also able to understand what their conspecifics think; finally, chimpanzees and orangutans are able to act on the basis of prior reasoning. Certainly, most of these skills are more perfectly developed in us than in our relatives. Yet, they are – and this precisely is the new insight – in no way exclusive to humans. Rather, our rationality constitutes an advancement of animal rationality. 
Alarmed by these results and in order to adhere to the exclusivity of humans, many attempts were made to come up with other human specifics. However, all of the alternatives turned out to be untenable in the light of recent research. The making and use of tools, for example, are common in the animal world; aesthetic judgment can already be observed in animals; the same applies to altruism, or to walking upright,

grasping hand, premature birth, and neoteny. Even sadism can be found sporadically among our closer relatives. In short, nothing in humans can be considered an absolute novelty, spontaneously occurring when humans appeared on the earth. Rather, we have to see these traits as advancements of prehumanly existing characteristics. On the other hand, it goes without saying that we humans are quite extraordinary beings doing things without simile in the animal kingdom. No species among the higher organisms is so widely spread all over the world, constructs cathedrals, surfs the web, and engages in space travel. Only humans have developed poetry, philosophy, science, and technology. We humans clearly differ from other creatures in our achievements. The common denominator for all of these distinctly human accomplishments is “culture.” Humans are cultural beings par excellence and that is what renders humankind distinct from any other species. Of course, certain preforms of culture can be found in the animal world as well: from the formation of colonies over sophisticated forms of communication up to the invention of tools. In chimpanzees, we can even observe cultural diversity between different populations as one population might use different cultural practices than another but in very similar contexts. Yet, what animal culture (even in chimpanzees) lacks is cumulative cultural development, the ongoing procession of developments in which all achievements constantly form the basis for further steps. This is typical of humans, and this has brought about the gigantic cultural evolution that so obviously distinguishes humans from their fellow beings. Hence, this is the situation: though the uniqueness of human beings is undisputable, all explanations for this fact successively got lost in recent decades. There is no special factor that could explain the particularities of human existence. 
Rather, all human skills derive from a continuous relation to prehuman skills, that is to say elements that have been developed earlier in phylogeny and have been inherited therefrom. But starting from abilities that are anything but special, how could the particularity of human beings have come into being? This is the modern riddle of human uniqueness. The only possible explanation is that our uniqueness must have emerged from our evolutionary heritage. Since in human evolution our ancestors had to start with the same endowment as our closest relatives, it obviously is the case that in hominization the use of this heredity must have acquired a direction considerably different from that of our animal companions – which finally led to the impressive achievements of cultural evolution. Our ancestors must have been seized by a special dynamic development or used their endowment in a specific way that the uniqueness of humankind emerged and animal-like humans became fully fledged human beings. This was the issue underlying the project. Starting from this point, the following research questions were formulated: How strong is evolutionary continuity in human beings? How can we understand that it gave way to cultural discontinuity? Which aspect of cultural existence is really unique to humans? Can the possibility of objective knowledge be seen as an (admittedly extreme) case in point? These research questions were first developed by Prof. Dr. Wolfgang Welsch (Friedrich-Schiller-University, Jena). To realize the project he invited five research

partners: Dr. Julia Fischer (German Primate Center, Göttingen), Dr. Hannes Rakoczy (Max Planck Institute for Evolutionary Anthropology, Leipzig), Prof. Dr. Wolf Singer (Max Planck Institute for Brain Research, Frankfurt/Main), Dr. Ricarda I. Schubotz (Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, later Max Planck Institute for Neurological Research, Köln), and Prof. Dr. Rainer Mausfeld (Institute of Psychology at Christian Albrecht University of Kiel). Jointly, six areas of research were defined that addressed different aspects especially productive with regard to the overall question. The results of three-and-a-half years of research are now presented in the six chapters of this volume. They document a combination of meticulous empirical studies with theoretical and metatheoretical thinking. The final Overview (Forster/Welsch) summarizes the results once more with regard to the leading research questions.¹

It is and always has been the persistent conviction of all authors that the ship of research has to pass between the Scylla of a simply naturalistic reductionism and the Charybdis of an extravagant supranaturalism, sailing past a one-sided orientation toward merely physicalist and neurobiological issues on the one hand and an ignorant rejection of empirical research results on the other, to finally enter the open sea of evolutionary enlightenment. We hope that this volume will help us to take the ship forward some distance and that it presents aspects apt to determine our future understanding of evolution and of humankind’s position in it.

Finally, we would like to add some words of thanks. We are very much obliged to the German Federal Ministry of Education and Research that has financed this project and the German Aerospace Center that has proved to be a helpful and competent partner in all phases of research.
Furthermore, we would like to thank Friedrich-Schiller-University, Jena, which has given us the opportunity to introduce the project in a lecture series in the winter semester 2006/2007 and also to present the results in a closing conference in December 2009. Thanks are also due to the Springer publishing company for its spontaneous interest in the project and for the careful and accurate design of the volume. Finally, we would like to thank all other research partners and their staff (Julia Fischer, Maria Golde, Rainer Mausfeld, Reinhard Niederée, Hannes Rakoczy, Elisabeth Scheiner, Ricarda I. Schubotz, Christian Spahn, Peter Uhlhaas, and Emily Wyman) for their dedicated cooperation, as well as the various honorable international colleagues (Merlin Donald, Christopher Frith, Ruth Millikan, Joëlle Proust, and Evan Thompson) for their contribution to the discussions. Last but not least, we are very much obliged to Michael Forster who, using his stupendous understanding of latest results in

¹ With regard to the general question of how animal-like humans became really human, see Wolfgang Welsch’s explanations concerning the origin of human uniqueness during the protocultural period (starting 2.5 million years ago, when the evolution of humankind took up momentum, and lasting until 40,000 years ago, when the takeoff of cultural evolution took place): “Das Rätsel der menschlichen Besonderheit,” in: Jahrbuch 2009 der Deutschen Akademie der Naturforscher Leopoldina (Halle/Saale), LEOPOLDINA (R.3) 55 (2010).

research on the one hand and his excellent philosophical reflections on the other, led us back onto productive paths time and again and also drew up the closing overview.

Berlin (Germany), August 2010

Wolfgang Welsch, Wolf Singer, André Wunder

The project underlying this report was funded by the German Federal Ministry of Education and Research under the project funding reference number 01GWS055-01GWS060. The responsibility for the content of this publication lies with the authors.

Contents

Brain Evolution and Cognition: Psychosis as Evolutionary Cost for Complexity and Cognitive Abilities in Humans
Peter J. Uhlhaas and Wolf Singer

Intrinsic Multiperspectivity: Conceptual Forms and the Functional Architecture of the Perceptual System
Rainer Mausfeld

Prospects of Objective Knowledge
Christian Spahn

Long-Term Planning and Prediction: Visiting a Construction Site in the Human Brain
Ricarda I. Schubotz

Emotion Expression: The Evolutionary Heritage in the Human Voice
Elisabeth Scheiner and Julia Fischer

Social Conventions, Institutions, and Human Uniqueness: Lessons from Children and Chimpanzees
Emily Wyman and Hannes Rakoczy

The Continuity of Evolution and the Special Character of Humans: Concluding Overview
Michael Forster and Wolfgang Welsch

Index

Contributors

Julia Fischer, Cognitive Ethology Lab, German Primate Center, Kellnerweg 4, 37077 Göttingen, Germany, [email protected]

Michael Forster, Department of Philosophy, University of Chicago, Stuart Hall 203, Chicago, IL, USA, [email protected]

Rainer Mausfeld, Department of Psychology, Christian Albrechts University Kiel, Olshausenstr. 62, 24098 Kiel, Germany, [email protected]

Hannes Rakoczy, Institute of Psychology & Courant Research Centre “Evolution of Social Behavior”, University of Göttingen, 37077 Göttingen, Germany, [email protected]

Elisabeth Scheiner, Cognitive Ethology Lab, German Primate Center, Kellnerweg 4, 37077 Göttingen, Germany, [email protected]

Ricarda I. Schubotz, Max Planck Institute for Neurological Research, Gleueler Str. 50, 50931 Köln, Germany, [email protected]

Wolf Singer, Department of Neurophysiology, Max Planck Institute for Brain Research, Deutschordenstr. 46, 60528 Frankfurt am Main, Germany; Frankfurt Institute for Advanced Studies, Johann Wolfgang Goethe-Universität, Ruth-Moufang-Str. 1, 60438 Frankfurt am Main, Germany, [email protected]

Christian Spahn, Department of Philosophy, College of Humanities, Keimyung University, 100 Sindang-Dong, Dalseo-Gu, Daegu 704-701, South Korea, christian. [email protected]

Peter J. Uhlhaas, Department of Neurophysiology, Max Planck Institute for Brain Research, Deutschordenstr. 46, 60528 Frankfurt am Main, Germany, [email protected]

Wolfgang Welsch, Institute of Philosophy, Friedrich-Schiller-University Jena, Zwätzengasse 9, 07743 Jena, Germany, [email protected]

Emily Wyman, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany, [email protected]


Brain Evolution and Cognition: Psychosis as Evolutionary Cost for Complexity and Cognitive Abilities in Humans

Peter J. Uhlhaas and Wolf Singer

Abstract

Cognitive functions correlate with the organization and complexity of neural networks. During the evolution of mammalian brains, basic algorithms for neural computations have largely remained unchanged, while between-species comparisons reveal marked differences in the volume of neocortex. Accordingly, the specific cognitive functions found in humans need to be considered as the product of the iteration of basic cortical algorithms. In humans, one characteristic feature of cortical organization is the addition of strategically important areas that serve as nodes for additional interactions between phylogenetically conserved brain regions. These novel processing structures serve multimodal integration and the generation of metarepresentations. The novel cognitive functions that have emerged from this increase in complexity comprise multiperspectivity, creativity, language, and theory of mind. We propose that certain mental disorders, such as schizophrenia, are a consequence of this evolutionary trend towards complexity, whereby the increasing prevalence of self-referential internal computations enhances the risk of specific disturbances in higher cognitive functions. These pathological phenomena of human cognition can thus be considered as specific side effects of the evolution of human capabilities.

P.J. Uhlhaas (*), Department of Neurophysiology, Max Planck Institute for Brain Research, Deutschordenstr. 46, 60528 Frankfurt am Main, Germany, e-mail: [email protected]

W. Singer, Department of Neurophysiology, Max Planck Institute for Brain Research, Deutschordenstr. 46, 60528 Frankfurt am Main, Germany, and Frankfurt Institute for Advanced Studies, Johann Wolfgang Goethe-Universität, Ruth-Moufang-Str. 1, 60438 Frankfurt am Main, Germany, e-mail: [email protected]

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_1, © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

With the advances in the cognitive neurosciences in the last two decades, the prospect for a unified framework for understanding mind and brain has increased tremendously. Numerous studies have shown close correlations and also causal relationships between neural events and mental phenomena that clearly support the view that the human mind is an emergent property of neural events, leaving little room for ontological dualism (Singer, in press). Accordingly, this evidence suggests that human mental operations as well as their dysfunction are intimately tied to the architecture and function of the biological hardware that supports cognition and action.

While these data clearly place the mind within nature, there are several issues that are still largely unresolved. For example, what are the unique capabilities that characterize the human mind? It has long been assumed that there is a clear distinction between the supposedly “unique” cognitive abilities of humans and the cognitive functions in “lower” organisms. Yet, recent research suggests that certain abilities that were formerly considered specific to humans, such as theory of mind, shared attention, and use of symbols, can also be found in primates and other species such as birds, albeit in rudimentary forms (Prior et al. 2008; Tomasello et al. 2003).

In this chapter, we attempt to identify cognitive processes that are specific to humans by drawing on studies that have examined the structure, function, and organization of nervous systems in organisms of different complexity. In the second step, we relate these differences to cognitive processes that may be considered characteristic of Homo sapiens. Finally, we argue that these specific human functions predispose for certain malfunctions that manifest themselves predominantly in neuropsychiatric disorders, such as schizophrenia.

2 Evolution and Cortical Circuits

The relationship between evolution and the organization and function of central nervous systems had already received attention from Charles Darwin. In the Descent of Man (Darwin 1871, p. 10), he suggests that “It is notorious that man is constructed on the same general type or model with other mammals. All the bones in his skeleton can be compared with corresponding bones in a monkey, bat, or seal. So it is with his muscles, nerves, blood-vessels and internal viscera. The brain, the most important of all the organs, follows the same law as shown by Huxley and other anatomists”. According to Darwin, the origin of the human brain has to be sought in the precursors of Homo sapiens and there is an evolutionary continuity in terms of organization and structure of the human brain. In the history of evolutionary neuroscience, this perspective has not always been the dominant approach with some positions emphasizing species differences in

brain organization while others have subscribed to the opposite view [for a review see Striedter (2005)]. The perspective we adopt in this chapter is to emphasize that the basic building blocks of the human brain are evolutionarily conserved, suggesting that the principles and mechanisms for the encoding, processing, and transmission of information in cortical circuits have largely remained unchanged. In contrast, there are important differences in the complexity and organization of cortical circuits that have led to unique properties of the human brain.

2.1 Basic Building Blocks of Cortical Circuits: Evidence for Evolutionary Continuity

The transmission of information in neural circuits of the vertebrate brain depends mainly on the generation and propagation of action potentials (APs) that represent the fundamental unit of neural coding. A brief survey reveals that the transmission of signals via APs is a universal feature of all vertebrate nervous systems. In addition, APs are present in invertebrates, such as insects, and even in simple nervous systems, such as that of Caenorhabditis elegans (Mellem et al. 2008). However, in all systems, and in particular in simple organisms, signals are also exchanged between nerve cells by direct coupling and by voltage-dependent release of transmitter without APs. The crucial advantage of AP coding is that signals can be propagated over longer distances without attenuation. Important electrical properties of the AP, such as the resting potential, the size, and duration of the AP, have largely remained constant across species (Bullock and Horridge 1965), suggesting that APs represent an energy efficient way of transmitting information in nervous systems. Similarities extend to the mechanisms underlying intercellular communication. Synapses convert electrical signals into chemical messages that diffuse across a gap, the synaptic cleft, and then are converted into de- or hyperpolarizing synaptic potentials (EPSPs or IPSPs). This type of electro-chemical transmission is a universal feature of vertebrate nervous systems and has possibly evolved around 900 million years ago (Hedges et al. 2006). Synaptic signalling can also be found in invertebrates with the difference that signalling complexity is somewhat lower, while the overall structure and function of synaptic mechanisms are preserved (Ryan and Grant 2009). Similarly, the neurotransmitters that convey signals across the synaptic cleft have largely remained unchanged during evolution [for a review see Venter et al. (1988)]. 
For example, acetylcholine, serotonin, and catecholamines are not only found in the animal kingdom but also in plants. However, differences between species exist with respect to the expression patterns of neurotransmitter receptors, the prevalence of certain cell types and the composition of ion channels in their membrane. For example, the N-methyl-D-aspartic acid (NMDA) receptor differs in vertebrates and invertebrates. It is a specialized glutamate receptor that acts as a coincidence detector and plays a crucial role not only in signal transmission but also in use-dependent synaptic plasticity and

memory formation. Ryan et al. (2008) showed that NMDA receptors in vertebrates possess a specific NR2 subunit that is not present in invertebrates. In addition, there is evidence that specific neuron types vary with species. The von Economo neuron (VEN) is a large bipolar neuron that was first described in humans and in certain great apes (Allman et al. 2010). As these cells occur in the frontal and anterior cingulate cortex, VENs have been related to social and emotional functions. However, more recent work has shown that VENs can also be found in elephants and whales (Butti et al. 2009). These data indicate that selective pressures have led to the evolution of this type of neuron in phylogenetically unrelated groups. However, other more common types of neurons, such as pyramidal cells, stellate cells, and the different populations of inhibitory interneurons, have remained highly conserved in vertebrates (Kaas 2010).

Similarities across species extend to properties of network activity, such as neural oscillations. Neural oscillations are a fundamental mechanism for enabling coordinated activity in nervous systems (Uhlhaas et al. 2009). They occur in different frequencies from 1 to 400 Hz and are present in simple organisms, such as the olfactory systems of insects (Stopfer et al. 1997), as well as in all cortices of vertebrates (Buzsaki 2006). Moreover, the mechanisms generating neural oscillations, such as the pacemaker neurons and specific networks that couple inhibitory GABA (γ-aminobutyric acid)-ergic interneurons with each other and with excitatory glutamatergic neurons, are highly preserved.

In conclusion, while certain differences in cellular structures and transmitter systems exist between species, it appears that overall the “basic” building blocks that are fundamental for the coding and transmission of information in nervous systems have remained highly conserved across species and in particular among vertebrates.
This suggests that unique cognitive capabilities are not the result of differences in these parameters.

3 Changes in Brain Size and Organization During Evolution

3.1 Does Size Matter?

While nervous systems in vertebrates exhibit a rather similar organization across species, there are major differences in size and in particular in the organization of cortical circuits. Across mammalian species, brain size varies by a factor of 100,000 (Herculano-Houzel 2009), indicating that the size of the brain may be a critical determinant of cognitive abilities. This is supported by the fact that species with the largest brains, such as cetaceans and primates, display a greater behavioural repertoire than species with smaller brains (Marino 2002). However, the relation between brain size and cognitive abilities is not straightforward. For example, elephants have a similar brain size as humans, yet lack several of the higher cognitive functions. Accordingly, it has been proposed that

what matters is brain size relative to body mass. Jerison (1973) developed the encephalization quotient (EQ), which is defined as the ratio of the actual brain weight over the expected brain weight given the size of the animal (EQ = w(brain)/Ew(brain)). Thus, an EQ of 1 indicates that the brain mass matches the expected value; an EQ > 1 means that the brain size in that species is larger than expected given its body weight. Indeed, humans are characterized by the largest EQ in the animal kingdom with a value of approximately 7–8, while chimpanzees, the closest living primate relative, have an EQ of only 2.8 (Jerison 1973). Accordingly, the human brain also contains the largest number of neurons among primates, around 100 billion.

Several factors have been proposed for the disproportional enlargement of the human brain. The first major increase in brain size in hominid evolution occurred possibly when bipedal apes diverged from other apes about 6 million years ago and these increases were probably related to diet (Leonard and Robertson 1994). The second major increase is associated with the emergence of Homo sapiens about 100,000 years ago, whose brain has reached the current volume of 1,200–1,800 cm³ (Lee and Wolpoff 2003). Dunbar (1993) has proposed a quantitative relationship between brain size and social group size. According to his perspective, one possible role of brain enlargement could be to increase the competence for social negotiations, for the adjustment of appropriate social behaviours and for the generation of historical memories of contextualized behaviour of particular individuals. It has been argued that these differentiated cognitive abilities may require more complex networks and as a consequence more neurons and connections and hence larger brains.
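The EQ definition above can be made concrete with a short calculation. The following is only a sketch: the chapter does not state how the expected brain weight is obtained, so the allometric form E = 0.12 · P^(2/3) (masses in grams), commonly attributed to Jerison (1973), is an assumption here, and the brain and body masses are hypothetical round numbers chosen for illustration.

```python
def expected_brain_mass_g(body_mass_g: float, k: float = 0.12,
                          exponent: float = 2 / 3) -> float:
    """Expected brain mass (g) for a mammal of the given body mass (g).

    Assumption: allometric form E = k * P**exponent with k = 0.12 and
    exponent = 2/3; the chapter itself does not specify this formula.
    """
    return k * body_mass_g ** exponent


def encephalization_quotient(brain_mass_g: float, body_mass_g: float) -> float:
    """EQ = actual brain mass divided by expected brain mass."""
    return brain_mass_g / expected_brain_mass_g(body_mass_g)


# Hypothetical round-number inputs, for illustration only.
human_eq = encephalization_quotient(brain_mass_g=1350.0, body_mass_g=65_000.0)
print(f"Illustrative human EQ: {human_eq:.1f}")
```

With these illustrative inputs the quotient comes out close to 7, in line with the range quoted in the text; different choices of the constant k or the exponent would shift the value.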

3.2 Changes in Structure and Connectivity

The evolutionary increase in brain size is mainly due to a volume increase of layered structures that evolved fairly recently, such as the neocortex, the hippocampus, and the cerebellum. The neocortex, one of the distinguishing features of mammalian brains, consists of a six-layered cellular sheet composed of pyramidal cells and interneurons that are arranged in horizontal layers, which in turn show a modular substructure of vertical columns whose neurons share distinct afferent and efferent connections and have similar functional properties (Rakic 2009). Evidence suggests that the neocortex evolved from a trilaminar reptilian precursor by adding several cellular layers (Reiner 1993). Cortical circuits are fundamentally related to all aspects of higher cognitive and executive processes. In humans, this structure is disproportionally larger than in any other species and makes up 90% of the overall size of the brain, suggesting that the enormous expansion of neocortex may be causally related to the evolution of the specific cognitive abilities that characterize humans. The expansion in size is due to the addition of novel, cytoarchitectonically distinguishable cortical areas and the myriads of connections that link these areas

with the phylogenetically ancient cortical and subcortical structures. In early mammals, only 20–25 cortical areas can be distinguished. In contrast, the human neocortex has been estimated to contain ~150 cortical areas (Kaas 2010). Reasons for the increased parcellation of human neocortex into functionally specialized modules are related to the need to maintain effective connectivity patterns. As the cortex grows in size, more neurons have to communicate with one another. Since the dynamic range of neurons is limited, fully connected networks become impossible because they lead to a combinatorial explosion of input. One solution is to implement a small-world architecture that represents an optimal compromise between nearest neighbor and strategic long-range connections. Such parcellated networks comprising nodes of variable sizes can carry out local computations and through long-distance connections can self-organize towards globally ordered states (Sporns et al. 2004). Comparative evidence suggests that small-world architectures are found in various primate brains as well as in other mammals, suggesting a general design principle of cortical networks (Striedter 2005). Although the human neocortex contains the largest number of cortical areas, there is little evidence that any of these areas are unique to Homo sapiens. The frontal and especially the prefrontal cortex are considered to be crucial for higher cognitive processes, yet even rodents have a prefrontal cortex, albeit in rudimentary form. In humans, the frontal lobes are the single largest partition of the cortex and relative to other primates have expanded disproportionally (Semendeferi et al. 2002). Furthermore, cross-species comparisons of the effects of lesions have shown that in humans frontal lesions entrain a wide range of severe cognitive deficits, whereas in cats, for example, there is little change of overt behaviour (Mesulam 1998). 
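The wiring argument above, that full connectivity becomes untenable as neuron numbers grow, can be illustrated by counting links. This is a toy sketch, not a model from the chapter: the even split into modules, the module count, and the number of long-range links per module are all hypothetical parameters, and the construction is a simple modular layout rather than a formal small-world graph.

```python
def full_connections(n_neurons: int) -> int:
    """Undirected pairwise links if every neuron contacted every other one."""
    return n_neurons * (n_neurons - 1) // 2


def modular_connections(n_neurons: int, n_modules: int,
                        long_range_per_module: int) -> int:
    """Links in a toy parcellated network (hypothetical layout).

    Neurons are split evenly into modules that are fully connected
    internally; each module adds a fixed number of long-range links.
    """
    per_module = n_neurons // n_modules
    local = n_modules * full_connections(per_module)
    long_range = n_modules * long_range_per_module
    return local + long_range


for n in (1_000, 10_000, 100_000):
    full = full_connections(n)
    modular = modular_connections(n, n_modules=100, long_range_per_module=50)
    print(f"n={n:>7}: full={full:>13,}  modular={modular:>11,}")
```

In this sketch the fully connected count grows with the square of the neuron number, whereas the modular layout grows far more slowly for a fixed number of modules, which is the qualitative point the paragraph makes about parcellation.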
The increase in the size of neocortex goes along with a shift in the prevalence of intrinsic over extrinsic connections. Most of the connections a cortical neuron receives come from other cortical neurons rather than from subcortical relays and sensory organs. In the human cerebral cortex, 90% of connections are established with other neocortical pyramidal cells, i.e., not input or sensory neurons (Buzsaki 2006). This architecture results in an immense number of reentrant loops that is reflected in the fact that over 80% of the pathways leaving the cortex are directed to other areas of the cortex. The addition of new areas and the concomitant increase in neuron numbers and connections necessarily also led to an expansion of white matter that harbors cortico-cortical long-range connections. Thus, in complex brains, there is a disproportionate increase in long-range connections, with white matter volume increasing as the four-thirds power of grey matter volume during the course of evolution (Zhang and Sejnowski 2000). This developmental trend is most pronounced in regions that are crucial for human-specific cognitive functions, such as language. One example is the arcuate fasciculus that links auditory areas in the temporal lobe (Wernicke) with executive areas in the frontal lobe (Broca). Thus, in humans, but not in chimpanzees or macaques, frontal cortices are strongly connected via the arcuate fasciculus with the left medial temporal gyrus and inferior temporal gyrus, areas that are critical for language (Rilling et al. 2008). These findings highlight the

Brain Evolution and Cognition: Psychosis as Evolutionary Cost for Complexity

7

possibility that the evolution of language in humans is closely associated with modifications of cortical areas and pathways in temporal and frontal regions that are associated with linguistic functions.
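The four-thirds scaling between white and grey matter mentioned in this section can be written out as a short worked relation. This is an illustrative sketch only: the symbols W (white matter volume), G (grey matter volume), and the proportionality constant c are notation assumed here and are not taken from Zhang and Sejnowski (2000).

```latex
W = c\,G^{4/3}
\quad\Longrightarrow\quad
\frac{W}{G} = c\,G^{1/3},
\qquad
\frac{W(2G)}{W(G)} = 2^{4/3} \approx 2.52
```

Under this assumed form, doubling grey matter volume multiplies white matter volume by roughly 2.5, so the ratio of white to grey matter itself grows with brain size.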

4 Evolution of Cortical Circuits: Consequences for Human Cognition

4.1 Structure Defines Function

One critical question is whether specific human cognitive abilities can be derived from the more complex wiring of cortical circuits. The most likely explanation for the most highly developed cognitive functions in humans is indeed the stunning increase in neuron numbers and the ensuing complexity of the networks. Two features distinguish the computations carried out in the human neocortex, and to a lesser degree in nonhuman primates: (1) the number of synaptic steps or nodes intervening between sensory and executive organs and (2) the number of cross-connections between parallel sensory–motor streams that deal with different sensory modalities and target different effector organs. In simple brains, there are few steps between sensing and acting, leading to stereotyped stimulus–response cascades. Basic learning rules underlying these “simple” behavioural repertoires represent another example of a highly conserved evolutionary heritage. Thus, the computational and physiological implementations of learning mechanisms in simple organisms, such as Aplysia, a sea slug with a nervous system of ~20,000 neurons, show numerous similarities with those of more complex nervous systems. One of the major innovations in the nervous systems of advanced mammals is the increase in behavioural flexibility. This can be directly related to the expansion of the synaptic bridge that links sensation to action and cognition, which is furthermore accompanied by the introduction of numerous parallel lines of communication that allow for information exchange between processing streams (Mesulam 1998). As a result, the synaptic volume dedicated to intermediary processing has increased disproportionately in advanced primates and cetaceans, allowing for responses that can be varied dynamically in a context- and goal-dependent way.

4.2 Multiperspectivity, Pretence, and Theory of Mind

The increasing number of neuronal steps during evolution between sensory registration and cognition has led to profound alterations in the relationship between sensory input and cognition. In cognitive terms, the increasingly complex wiring of the neocortex can be conceptualized as resulting in the increased influence of internal states over incoming sensory data, allowing for state- and context-dependent interpretations of stimuli. Accordingly, one key property of the human mind is multiperspectivity, which represents a fundamental feature of human cognition (see Mausfeld 2010). Depending on prior knowledge, expectancy, and context, sensory inputs evoke different interpretations. This flexibility is the basis of alternating perception of ambiguous figures, pretend play, metaphors, and allegories. Although this remarkable ability can lead to conflicting interpretations, it is constitutive for human perception. Multiperspectivity also underlies the ability to keep counterfactual worlds distinct, such as when watching a theatre production or a movie. Developmental data suggest that the earliest forms of “mental double book keeping” emerge between 1 and 2 years of age and can be linked to pretend play (Nichols and Stich 2000). An example of pretend play is a child's transformation of a banana into a telephone: the banana is held up to the ear and the child “uses” it to convey a message (Leslie 1987; Piaget 1962). In all human cultures, children engage in pretend play, which seems to be a specifically human trait (Gomez 2008). A defining feature of pretend play is the ability to represent objects and actions symbolically. Interestingly, the emergence of pretend play has been linked to other cognitive abilities that are potentially specific to humans, such as language and theory of mind, suggesting that the mechanisms that allow the “decoupling” (Leslie 1987) of an object, i.e., the banana, from its present context are related to the development of a general-purpose mechanism that involves a separate mental workspace, a Possible World Box (Nichols and Stich 2000). 
Accordingly, we suggest that the development of a separate mental workspace is the direct consequence of increasing interactions between different sensory systems on the one hand and the addition of high-level, transmodal processing structures on the other. Crossmodal interactions allow one to detect the same in the seemingly different and, hence, are the basis for abstractions, and iteration of cognitive operations in the added cortical areas allows for the generation of metarepresentations. Both abilities constitute defining features of human cognition, such as multiperspectivity, pretend play, the use of symbols, creativity, and ultimately language.

5 Encephalization, Evolution, and Psychosis

“The growing consciousness is a danger and a disease”. F. Nietzsche

While the continued iteration of basic algorithms has led to “unique” cognitive abilities in Homo sapiens, there is also evidence that the trend towards increased encephalisation and complexity of cortical networks may be associated with unique costs. Schizophrenia is a mental disorder that is fundamentally characterized by a loss of contact with reality and a preponderance of self-generated, self-contained cognitive processes that ultimately manifest themselves in the defining features of psychosis, namely delusions and hallucinations. In the following section, we propose that certain symptoms in schizophrenia can be considered an evolutionary by-product of human-specific cognitive abilities.

5.1 Schizophrenia: A Decline in Mental Functions?

Throughout the history of psychiatry, schizophrenia has been predominantly conceived as a loss of reason, highlighting a general decline of mental functioning. Kraepelin (1899) even conceptualized the disorder as a dementia with an early onset (dementia praecox). This perspective is supported by a large body of neuropsychological data that show “deficits” in a wide range of functions, such as perception, executive functions, processing speed, attention, and working memory (Heinrichs 2001). Interestingly, some of the dysfunctions in schizophrenia patients can also lead to certain improvements in performance. Under certain experimental conditions, patients outperform healthy controls, e.g., by being less susceptible to perceptual illusions (Uhlhaas et al. 2006). However, recent work in phenomenological psychopathology suggests a different perspective. Sass (1992) and Sass and Parnas (2003) challenge the prevailing view that schizophrenia impairs mainly cognitive processes and propose that the disorder is characterized essentially by an accentuation of specifically human traits of conscious processing. Specifically, Sass has argued that schizophrenia results from an exaggeration of reflexive forms of awareness (Sass 1992). Hyperreflexive forms of awareness and their importance for understanding alterations of conscious experience in schizophrenia can be demonstrated in relation to perceptual disturbances (see Uhlhaas and Mishara 2007). During normal perceptual experience, the subject and the perceptual field do not appear as two opposing poles. On the contrary, the phenomenal self and the phenomenal world appear as a Gestalt through which the subject is prereflectively orientated towards the world (prereflective intentionality), which serves as the basis for action and cognition (Merleau-Ponty 1962). 
Importantly, this mode of experience is always given, that is, it is highly automatic as we do not need to question our immersion in the world through an explicit epistemic acquisition. In schizophrenia, this nonreflexive, automatic attunement to the world is weakened and as a result, consciousness itself becomes the focus of attention. Schizophrenia patients complain frequently that those aspects of experience that normally constitute the background of experience, such as perceptual details, invade their experience so that the focus of attention shifts to the phenomenal experience itself (Uhlhaas and Mishara 2007). This basic alteration in the relationship between self and world in schizophrenia has been captured in several formulations, such as a “loss of vital contact with reality” (Minkowski 1927) or “loss of natural self-evidence” (Blankenburg 2001). Accordingly, the basic deficit in schizophrenia may not reside in a reduction of higher cognitive functions, but rather in alterations of fundamental processes that link the self with the world and serve as the background of our mental life. The consequences are perplexity and a questioning of fundamental aspects of mental processes, as demonstrated by this patient's description: “I only saw fragments: a few people, a dairy, a dreary house. To be quite correct, I cannot say that I did see all that, because these objects seemed altered from the usual. They did not stand together in an overall context, and I saw them as meaningless details. My impressions did not flow as they normally do. If I had not continuously reminded myself where I was going, I would just as gladly have stood still somewhere” (Matussek 1987, p. 92). A further investigation of the alterations in conscious experience in schizophrenia reveals that in addition to the reduction in automatic, prereflective modes of experience there is a corresponding increase in those cognitive abilities that may be considered central to the human cognitive system, such as multiperspectivity. For Sass (1992), consciousness in schizophrenia is characterized by “...a fundamental failure to stay anchored within a single frame of reference, perspective or orientation. Often this involves a shift among conceptual levels, including hyperabstract as well as hyperconcrete perspectives”. Hyperreflexive forms of awareness are not only a feature of schizophrenia, but show intriguing links with modernist art in which, in a similar way, there is an uncertainty and multiplicity of perspectives. The self-referential aspect of conscious experience in schizophrenia is also demonstrated by the prevalence of metaphysical and religious themes. As patients lose contact with normal life, there is an increased tendency to contemplate the meaning of things and profound philosophical issues, clearly suggesting that schizophrenia does not merely involve a “loss of reason”. 
In addition, hyperreflexive forms of awareness may also be one cause of the most striking aspects of psychotic experience, namely the delusions and hallucinations that represent the most obvious “break” from reality. While the patient interrogates his inner experiences, the mystery of subjectivity becomes transformed into explanations. When a person intensely scrutinizes the stream of consciousness, the very act of observing changes, for example, the relationship between the body and the self, as the body is transformed into an object that appears distinct. Similarly, such scrutiny yields little evidence for one's own identity, innerness, and volition, which are implicitly given in a prereflective mode of intentionality. In schizophrenia, the quality of this intense mode of introspection is dramatically increased with the result that the “own body sensation” is not only experienced as distinct from the self but even translocated to a separate mechanistic process that interferes with the self, “the influencing machine”, a typical theme of delusional thinking in schizophrenia (Sass 1992). Similarly, auditory hallucinations can be conceived of as an extension of the inner dialog that we undergo during everyday experience. Previous experimental studies have shown, for example, that the content of hallucinatory experiences can originate from subvocal rehearsal that normally operates outside awareness (Frith and Done 1988). Thus, one explanation for hallucinatory experiences in schizophrenia is the false attribution of a self-generated mental act to an outside source, possibly because of an abnormally increased focus on inner experiences. As a result, cognitive processes that normally operate automatically become conscious and move into the focus of the inner gaze (Sass 1992).

5.2 Neural Correlates of Psychosis

The phenomenological perspective suggests that schizophrenia may be associated with heightened, self-referential cognitive processes. In addition, recent anatomical and functional brain imaging studies have found evidence that abnormal internally generated activity and an abnormal organization of cortical networks may lie at the core of psychotic phenomena. Functional imaging studies have revealed increased neuronal activation, as assessed by blood oxygenation level-dependent (BOLD) imaging, in primary sensory areas during both visual and auditory hallucinations (Dierks et al. 1999; Oertel et al. 2007). These findings suggest that circuits that are normally involved in the processing of externally generated signals become captured and activated by reentrant loops from higher-order areas. This hyperactivation of early sensory areas can explain the vividness of hallucinatory experiences. Hyperactivation of primary sensory areas is associated with increased local connectivity. There is a positive correlation between hyperconnectivity and productive symptoms, such as auditory hallucinations (Hubl et al. 2004). One interpretation of this finding is that hyperconnectivity between higher- and lower-order cortical areas favours backpropagation to the respective primary sensory cortices of activity that is generated in higher sensory areas during visual and auditory imagery, thus generating activation patterns that resemble those induced by sensory stimulation. Recent data by our group (Sritharan et al. submitted) provide strong support for this hypothesis. We investigated the network architecture of white matter fiber pathways in 15 patients with schizophrenia and 15 healthy control participants. The main finding of the study is that schizophrenia patients exhibited significantly more long-range projections relative to control participants, suggesting that cortico-cortical connectivity is significantly increased. 
The relevance of this aberrant organization of cortical networks in schizophrenia was demonstrated through a correlation with the occurrence of positive symptoms, especially delusions. These findings suggest that schizophrenia can no longer be considered a disorder that is mainly associated with reduced neural activity, particularly in frontal cortical areas, and with reduced connectivity, as has been emphasized by the majority of theories. Instead, the more recent data clearly point to the role of increased internal communication and self-generated experiences that are likely a consequence of hyperconnectivity of cortico-cortical pathways that pass through white matter. These anatomical and functional correlates may underlie the difficulty in distinguishing between internal and external stimuli, which is the defining feature of psychotic phenomena, namely the break with reality.

5.3 Psychosis as an Extreme Manifestation in a Continuum of Brain Functions

Several lines of evidence suggest that psychotic phenomena are not found exclusively in the context of the clinically defined syndrome but are rather common and compatible with normal brain functioning. Recent epidemiological data suggest that psychotic symptoms are common in the general population. Estimates suggest a lifetime prevalence of psychotic symptoms between 4 and 28.4% (Nuevo et al. 2010), with hallucinations occurring in approximately 5–15% of the general population (Johns and van Os 2001). Further evidence for a continuum between normal brain functions and psychotic phenomena comes from studies that investigated the effects of sensory deprivation. Reducing the amount of exteroceptive stimuli frequently leads to hallucinatory experiences, in particular in the visual and auditory domain, while olfactory, kinaesthetic, and tactile sensations are less common (Goodman 1982). Visual hallucinations include a range of sensations, from light to complex objects and scenes, which cannot be controlled in terms of content, appearance, and duration. These findings suggest that cortical networks have an inherent tendency to generate activity patterns independently of sensory inputs and further emphasize the importance of self-generated activity patterns for normal brain functions. One of these roles is in all likelihood the generation of predictions, or priors, on the basis of which sensory signals are evaluated and interpreted. Similar phenomena occur during dreaming, which has been compared by Jung to psychosis (Jung 1909). Llinás and Paré (1991) have pointed out that physiologically there are only minor differences between dreaming and wakefulness. Both states of consciousness go along with activated cortical states that are regulated by ascending projections from subcortical structures, such as mes- and diencephalic nuclei, which receive only minor inputs from sensory organs. 
Accordingly, the main difference between dreaming and perceiving during wakefulness lies in the weight given to signals provided by sensory afferents that are relayed through specific thalamic nuclei.

5.4 Adaptive Value of Psychotic Phenomena

The prevalence of psychotic phenomena suggests that the underlying physiological and anatomical correlates represent a common property of cortical networks. This notion is supported by the fact that the prevalence rates are similar across cultures and apparently constant over time (Saha et al. 2005). Because the fertility of schizophrenia patients is reduced, this constancy might indicate that the underlying processes that give rise to the clinical phenotype may have adaptive and even reproductive advantages.


Previous work has suggested that the evolution of language in Homo sapiens has been a predisposing factor for the development of schizophrenia. Crow (2008) has argued that the brain’s susceptibility to the epigenetic shaping of its connectome, reflected so succinctly in the diversity of languages, is also the basis for the genetic predisposition to psychosis. However, this theory has remained controversial. Instead, it is more likely that the evolutionary increase in the complexity of cortical networks has led to a novel and more versatile form of information processing, of which the disposition for language is only one manifestation among many. One of the most consistent correlations between the phenotype and cognitive abilities is the relationship with creativity. Eysenck (1995) has argued that individuals with elevated psychotic personality traits show a specific cognitive style characterized by “overinclusive thinking” and thought processes that involve a wider conception of relevance than is conventional. This disposition gives rise to original and novel ideas, the cornerstone of creativity. Interestingly, creativity also shares certain properties with pretend play as both processes require establishing novel relations between hitherto unrelated contents. Carruthers (2002) has proposed that childhood pretence may have evolved to practice and enhance adult forms of creativity. A large body of experimental work supports the link between psychosis and creativity through the following findings: (1) Individuals with either a genetic predisposition for schizophrenia or psychometrically defined elevated levels of mild psychotic symptoms perform consistently better on tasks that assess creative and divergent thinking (Nelson and Rawlings 2010). (2) Visual artists have higher levels of psychoticism relative to nonartists and the degree of success is associated with psychoticism (Götz and Götz 1979). 
(3) There is emerging evidence for a link between risk genes for schizophrenia and creativity. Kéri (2009) studied the impact of a polymorphism of the promoter region of neuregulin 1, a candidate gene for psychosis, on creativity in healthy volunteers. Intriguingly, the highest creative achievements and creative-thinking scores were found in individuals who carried a neuregulin polymorphism, which has been related to psychosis risk (Kao et al. 2010). However, the relationship between creativity and psychosis may not hold for the clinical syndrome as the empirical evidence for enhanced creative abilities in schizophrenia patients has been mixed (Nelson and Rawlings 2010). Thus, it has been proposed that the schizophrenia spectrum tends to display an “inverted-U” relationship with creativity, which is most pronounced in schizophrenia spectrum disorders, whereas the clinical syndrome is characterized by an attenuated relationship.

6 Conclusions

The current data suggest that the evolution of cortical networks in Homo sapiens has led to novel cognitive abilities, whose essential property may lie in the ability to construct elaborate representations of input–output relations that have only a weak resemblance to the original sensory information. Phenomena that result from this cognitive architecture are multiperspectivity, language, pretend play, theory of mind, and creativity, which can be considered human specific. These cognitive abilities are in all likelihood the consequence of increased connectivity among subsystems and the addition of strategic nodes for the generation of metarepresentations of highly integrated and abstract cognitive contents that are unique properties of human brains. Interestingly, the evolution of cortical areas and circuits that gave rise to the amazing computational powers of the human brain has followed a “conservative” approach. There is little evidence for discontinuities either with respect to the rules governing the coding and transmission of information or with respect to the anatomical hardware and its functional properties. Accordingly, the innovations in the cognitive architecture that characterize Homo sapiens and the other primates are the consequence of iterations of basic algorithms of cortical processing. In our view, these findings clearly argue for a perspective that considers the human mind as a product of biological evolution that is characterized by a striking degree of continuity. In defending this position, we wish to emphasize, however, that the human mind cannot be reduced to an emergent function of the underlying biological hardware alone because of the deep cultural embedding of the human species. Homo sapiens promoted cultural evolution, but the social (immaterial) realities created in this process in turn influenced, through epigenetic shaping, the functions of human brains and the contents they have to deal with. While these developments led to a dramatic increase in the cognitive abilities of the human brain, they clearly also had their price and negative consequences that highlight the precarious balance of our mental functions. On the one hand, they are rendered extremely efficient by our ability to imagine and create, but at the same time these acquisitions can also lead to a retreat into our inner world at the expense of a complete break with reality. We have tried to make the case that the same developmental trends that underlie our uniqueness may also be the cause of schizophrenia. While some of the arguments presented are preliminary and require further testing, we nonetheless believe that an approach that views schizophrenia not merely as a “dementia” but as an accentuation of mental processes that may reflect the cornerstones of our cognitive architecture may yield novel insights and perspectives on this mysterious disorder that has long been conceptualized as “incomprehensible” and fundamentally different from “normal” mental life.

References

Allman JM, Tetreault NA, Hakeem AY, Manaye KF, Semendeferi K, Erwin JM, Park S, Goubert V, Hof PR (2010) The von Economo neurons in frontoinsular and anterior cingulate cortex in great apes and humans. Brain Struct Funct 214:495–517
Blankenburg W (2001) First steps toward a ‘psychopathology of common sense’. Philos Psychiatr Psychol 8:303–315


Bullock TH, Horridge GA (1965) Structure and function in the nervous systems of invertebrates. W. H. Freeman, San Francisco
Butti C, Sherwood CC, Hakeem AY, Allman JM, Hof PR (2009) Total number and volume of Von Economo neurons in the cerebral cortex of cetaceans. J Comp Neurol 515:243–259
Buzsaki G (2006) Rhythms of the brain. Oxford University Press, New York
Carruthers P (2002) Human creativity: its evolution, its cognitive basis, and its connections with childhood pretence. Br J Philos Sci 53:1–25
Crow TJ (2008) The ‘big bang’ theory of the origin of psychosis and the faculty of language. Schizophr Res 102:31–52
Darwin C (1871) The descent of man, and selection in relation to sex. John Murray, London
Dierks T, Linden DE, Jandl M, Formisano E, Goebel R, Lanfermann H, Singer W (1999) Activation of Heschl’s gyrus during auditory hallucinations. Neuron 22:615–621
Dunbar RI (1993) Coevolution of neocortical size, group size and language in humans. Behav Brain Sci 11:681–735
Eysenck HJ (1995) Genius: the natural history of human creativity. Cambridge University Press, New York
Frith CD, Done DJ (1988) Towards a neuropsychology of schizophrenia. Br J Psychiatry 153:437–443
Gomez JC (2008) The evolution of pretence: from intentional availability to intentional nonexistence. Mind Lang 23:586–606
Goodman AL (1982) Neurophysiological and psychopharmacological approaches to sensory deprivation phenomena. Prog Neuropsychopharmacol Biol Psychiatry 6:95–110
Götz KO, Götz K (1979) Personality characteristics of professional artists. Percept Mot Skills 49:919–924
Hedges SB, Dudley J, Kumar S (2006) TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics 22:2971–2972
Heinrichs RW (2001) In search of madness: schizophrenia and neuroscience. Oxford University Press, New York
Herculano-Houzel S (2009) The human brain in numbers: a linearly scaled-up primate brain. Front Hum Neurosci 3:31
Hubl D, Koenig T, Strik W, Federspiel A, Kreis R, Boesch C, Maier SE, Schroth G, Lovblad K, Dierks T (2004) Pathways that make voices: white matter changes in auditory hallucinations. Arch Gen Psychiatry 61:658–668
Jerison HJ (1973) Evolution of the brain and intelligence. Academic Press, New York
Johns LC, van Os J (2001) The continuity of psychotic experiences in the general population. Clin Psychol Rev 21:1125–1141
Jung CG (1909) The psychology of dementia praecox. The Journal of Nervous and Mental Disease Publishing Company, New York
Kaas JH (2010) Cortical circuits: consistency and variability across cortical areas and species. In: von der Malsburg C, Phillips WA, Singer W (eds) Dynamic coordination in the brain: from neurons to mind. MIT, Cambridge, pp 25–35
Kao WT, Wang Y, Kleinman JE, Lipska BK, Hyde TM, Weinberger DR, Law AJ (2010) Common genetic variation in Neuregulin 3 (NRG3) influences risk for schizophrenia and impacts NRG3 expression in human brain. Proc Natl Acad Sci USA 107:15619–15624
Kéri S (2009) Genes for psychosis and creativity: a promoter polymorphism of the neuregulin 1 gene is related to creativity in people with high intellectual achievement. Psychol Sci 20:1070–1073
Kraepelin E (1899) Ein Lehrbuch für Studierende und Ärzte, 6th edn. J.A. Barth, Leipzig
Lee S-H, Wolpoff MH (2003) The pattern of evolution in pleistocene human brain size. Paleobiology 29:186–196
Leonard WR, Robertson ML (1994) Evolutionary perspectives on human nutrition: the influence of brain and body size on diet and metabolism. Am J Hum Biol 6:77–88
Leslie A (1987) Pretense and representation: the origins of “theory of mind”. Psychol Rev 94:412–426


Llinás RR, Paré D (1991) Of dreaming and wakefulness. Neuroscience 44:521–535
Marino L (2002) Convergence of complex cognitive abilities in cetaceans and primates. Brain Behav Evol 59:21–32
Matussek P (1987) Studies in delusional perception (translated and condensed). In: Cutting J, Sheppard M (eds) Clinical roots of the schizophrenia concept. Translations of seminal European contributions on schizophrenia. Cambridge University Press, Cambridge, pp 87–103
Mausfeld R (2010) Intrinsic multiperspectivity: conceptual forms and the functional architecture of the perceptual system
Mellem JE, Brockie PJ, Madsen DM, Maricq AV (2008) Action potentials contribute to neuronal signaling in C. elegans. Nat Neurosci 11:865–867
Merleau-Ponty M (1962) Phenomenology of perception. Routledge and Kegan Paul, London
Mesulam MM (1998) From sensation to cognition. Brain 121:1013–1052
Minkowski E (1927) La Schizophrenie: Psychopathologie des schizoides et des schizophrenes. Payot, Paris
Nelson B, Rawlings D (2010) Relating schizotypy and personality to the phenomenology of creativity. Schizophr Bull 36:388–399
Nichols S, Stich S (2000) A cognitive theory of pretense. Cognition 74:115–147
Nuevo R, Chatterji S, Verdes E, Naidoo N, Arango C, Ayuso-Mateos JL (2010) The continuum of psychotic symptoms in the general population: a cross-national study. Schizophr Bull. doi:10.1093/schbul/sbq099
Oertel V, Rotarska-Jagiela A, van de Ven VG, Haenschel C, Maurer K, Linden DE (2007) Visual hallucinations in schizophrenia investigated with functional magnetic resonance imaging. Psychiatry Res 156:269–273
Piaget J (1962) Play, dreams, and imitation in childhood. Norton, New York
Prior H, Schwarz A, Güntürkün O (2008) Mirror-induced behavior in the magpie (Pica pica): evidence of self-recognition. PLoS Biol 8:e202
Rakic P (2009) Evolution of the neocortex: a perspective from developmental biology. Nat Rev Neurosci 10:724–735
Reiner A (1993) Neurotransmitter organization and connections of turtle cortex: implications for the evolution of mammalian isocortex. Comp Biochem Physiol 104A:735–748
Rilling JK, Glasser MF, Preuss TM, Ma X, Zhao T, Hu X, Behrens TE (2008) The evolution of the arcuate fasciculus revealed with comparative DTI. Nat Neurosci 11:426–428
Ryan TJ, Grant SG (2009) The origin and evolution of synapses. Nat Rev Neurosci 10:701–712
Ryan TJ, Emes RD, Grant SG, Komiyama NH (2008) Evolution of NMDA receptor cytoplasmic interaction domains: implications for organisation of synaptic signalling complexes. BMC Neurosci 9:6
Saha S, Chant D, Welham J, McGrath J (2005) A systematic review of the prevalence of schizophrenia. PLoS Med 2:e141
Sass LA (1992) Madness and modernism. Insanity in the light of modern art, literature and thought. Basic Books, New York
Sass LA, Parnas J (2003) Schizophrenia, consciousness, and the self. Schizophr Bull 29:427–444
Semendeferi K, Lu A, Schenker N, Damasio H (2002) Humans and great apes share a large frontal cortex. Nat Neurosci 5:272–276
Singer W (in press) Bewusstsein und freier Wille. In: Gross P, Bonhoeffer T (eds) Zukunft Gehirn. Verlag C.H. Beck, München
Sporns O, Chialvo DR, Kaiser M, Hilgetag CC (2004) Organization, development and function of complex brain networks. Trends Cogn Sci 8(9):418–425
Sritharan S, Kaiser M, Rotarska-Jagiela A, Singer W, Uhlhaas PJ (in preparation) Abnormal long range connectivity in schizophrenia: a graph theoretical analysis of DTI data
Stopfer M, Bhagavan S, Smith BH, Laurent G (1997) Impaired odour discrimination on desynchronization of odour-encoding neural assemblies. Nature 390:70–74
Striedter GF (2005) Principles of brain evolution. Sinauer Associates, Sunderland, MA

Brain Evolution and Cognition: Psychosis as Evolutionary Cost for Complexity

Tomasello M, Call J, Hare B (2003) Chimpanzees understand psychological states – the question is which ones and to what extent. Trends Cogn Sci 7:153–156
Uhlhaas PJ, Mishara A (2007) Perceptual anomalies in schizophrenia: integrating phenomenology and cognitive neuroscience. Schizophr Bull 33:142–156
Uhlhaas PJ, Phillips WA, Mitchell G, Silverstein SM (2006) Perceptual grouping in chronic schizophrenia. Psychiatr Res 145:105–117
Uhlhaas PJ, Pipa G, Lima B, Melloni L, Neuenschwander S, Nikolić D, Singer W (2009) Neural synchrony in cortical networks: history, concept and current status. Front Integr Neurosci 3:17
Venter JC, di Porzio U, Robinson DA, Shreeve SM, Lai J, Kerlavage AR, Fracek SP Jr, Lentes KU, Fraser CM (1988) Evolution of neurotransmitter receptor systems. Prog Neurobiol 30:105–169
Zhang K, Sejnowski TJ (2000) A universal scaling law between gray matter and white matter of cerebral cortex. Proc Natl Acad Sci USA 97:5621–5626

Intrinsic Multiperspectivity: Conceptual Forms and the Functional Architecture of the Perceptual System

Rainer Mausfeld

Abstract It is a characteristic feature of our mental make-up that the same perceptual input situation can simultaneously elicit conflicting mental perspectives. This ability pervades our perceptual and cognitive domains. Striking examples are the dual character of pictures in picture perception, pretend play, or the ability to employ metaphors and allegories. I argue that traditional approaches, beyond being inadequate on principle grounds, are theoretically ill equipped to deal with these achievements. I then outline a theoretical perspective that has emerged from a theoretical convergence of perceptual psychology, ethology, linguistics, and developmental research. On the basis of this framework, I argue that corresponding achievements are brought forth by a specific type of functional architecture whose core features are as follows: (1) a perceptual system that is biologically furnished with a rich system of conceptual forms, (2) a triggering relation between the sensory input and conceptual forms by which the same sensory input can be exploited by different types or systems of conceptual forms, and (3) computational principles for handling semantically underspecified conceptual forms. Characteristic features of the proposed theoretical framework are pointed out using the Heider–Simmel phenomenon as an example.

This chapter draws on material published in Mausfeld (2010a, b).

R. Mausfeld, Department of Psychology, Christian Albrechts University Kiel, Olshausenstr. 62, 24098 Kiel, Germany; e-mail: [email protected]

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_2, © Springer-Verlag Berlin Heidelberg 2011

1 Multiperspectivity

The phenomenon in question can be vividly illustrated by a self-portrait of the French painter and caricaturist Alfred Le Petit, as depicted in Fig. 1. This painting visualises multiperspectivity in the literal sense of providing, within a single integral situation, multiple views of the same object as seen from different vantage points. At the same time, it allegorically illustrates a much more abstract feature


of our mental make-up, namely our mental capacity to employ different “mental perspectives” in a given situation of sensory stimulation. For instance, if we mentally focus on the concrete painting hanging on a wall (or being reproduced in a book), the painted canvas is the salient object of perception. If, in another mental perspective, we focus on the depicted scene, we are mentally reading through the canvas, as it were, and regard it simply as a medium by which the depicted scene is conveyed to us as the salient object of perception. While the second mental perspective usually imposes itself more strongly than the first one, we have no problems adopting each of them and switching between the two.

Each mental perspective constitutes a specific way of organising the sensory input in terms of meaningful categories and generates its own mental objects with their proprietary attributes. Hence, the assignments of attributes, such as colour, material qualities, depth or the degree of “realness”, and their values, are subordinated to the organisation of the input in terms of perceptual objects. Each mental perspective is associated with and governed by its own specific forms of internal causal analyses. In the first type of mental perspective, these internal causal analyses notably pertain to the kind of operations to which a perceptual object of the type “canvas” is amenable. Accordingly, the four heads seen in Fig. 1 do not differ as to their type of “realness”, namely being painted flat figures on a canvas. In the second mental perspective, these internal causal analyses pertain to a great variety of operations that are associated with the organisation of objects in three-dimensional perceptual space. Accordingly, each of the four portrayed heads in Fig. 1 is associated with its own type and degree of “realness”.

The painting shown in Fig. 1 exemplifies that the same situation of sensory stimulation can elicit in us quite different and even conflicting mental perspectives on what it is that we perceive. This capacity to employ different mental perspectives on what the senses offer to us is so blatantly obvious that we simply take it for granted

Fig. 1 Alfred Le Petit, Autoportrait, 1893


in everyday life. It is an all-pervading property of the way we are designed, and we are perpetually and effortlessly engaged in corresponding mental activities. Because it is so pervasive, it is, in the context of scientific enquiry, extraordinarily difficult for us to direct our attention to it and to find out on which specific features of our mental make-up it is based. However, once we begin scrutinising it, we will quickly become aware that the human capacity to engender, in a given situation, different mental perspectives is a uniquely powerful feature of our mind whose consequences for our perceptual and cognitive achievements one can hardly overestimate. Most importantly, this capacity provides a pre-eminent pillar for our competence to generate culture.

Although this capacity permeates all of our mental activity and thus remains largely inconspicuous, there are some situations that more easily draw our attention to it. For instance, in theatre or cinema, we perceive what happens on the stage or on the screen as segments of events going on in the world while at the same time being aware of actually watching a drama or a movie. With respect to a broad range of aspects, we can employ a mental double book-keeping, as it were, and simultaneously have different spatial, temporal, emotional, and other types of perspectives at our disposal. In the “duplication of space and time that occurs in theatrical representation”, Michotte (1960/1991, p 191f.) noted, “the space of the scene seems to be the space in which the represented events are actually taking, or have taken, place and yet it is also continuous with the space of the theatre itself. Similarly for time also, instants, intervals, and successions for the spectators belong primarily to the events they are watching, but they are left nevertheless in their own present”.

The ability to perform this mental double book-keeping emerges quite early in our cognitive development. It notably manifests itself in the pretend play of children (Leslie 1987; cf. also Wyman and Rakoczy, this volume), for which there is experimental evidence for children as young as 1–2 years old. Theatre and pretend play provide striking phenomena in which the availability of simultaneous mental perspectives almost becomes conspicuous at the surface of our phenomenal experience. They prototypically exemplify what William James has recognised as the hallmark of our mental life, namely that our “mind is at every stage a theatre of simultaneous possibilities” (James 1890/1983, p 277).

Pretend play and other more conspicuous cases of a mental double book-keeping have prompted the question whether they can be explained by a common type of architectural principles of our mind. At the same time these striking phenomena have often impeded the recognition of the extent to which the capacity for simultaneous mental perspectives pervades all our perceptual achievements. The effects of this capacity are much harder to notice in apparently much more elementary situations. We can, for instance, perceive railroad tracks that recede from the observer towards the horizon both as being parallel and, in another mental perspective, as converging. We can perceive four points that are geometrically arranged as forming a square both as a concrete instance of a square and, in another mental perspective, as being four isolated points. We can perceive a circle that is partly occluded by another circle both as two integral circles lying at different depth levels, where the nearer one occludes part of the farther one, and,


in another mental perspective, as two differently shaped objects lying side by side in a plane. We can perceive some location on a white wall at which a reddish spotlight is directed both as appearing white and, in another mental perspective, as appearing reddish.

Traditionally, such phenomena have been treated under headings such as “proximal mode”, “amodal completion”, or “constancy phenomena”. They have usually been assumed to predominantly result from highly special and often modality-specific mechanisms rather than from a common core of more abstract and general principles of our perceptual architecture. It is far from obvious, and, needless to say, an empirical question whether these different classes of phenomena express, to some theoretically interesting extent, a common abstract property of our mental machinery. However, the conjecture that they do seems to me warranted, on both theoretical and empirical grounds. These and a range of other phenomena share, or so I believe, at their core some structural aspects that point to much deeper principles of the conceptual and computational organisation of our perceptual system; more specifically to structural and dynamic aspects of the conceptual forms with which our perceptual system is biologically endowed and which provide the given “data format” for its internal computations.

Before I venture, in the last section of this paper, some corresponding conjectures, I will try, in Sects. 2 and 3, to identify some of the obstacles and misconceptions that have impeded an appropriate theoretical treatment of achievements expressing forms of multiperspectivity. In Sect. 4, I outline a general theoretical framework that appears to me promising, theoretically and empirically, for dealing with perception in general, and with certain phenomena of multiperspectivity that are grounded on specific design properties of the perceptual system, in particular. To bring out these cases, I introduce the notion of “intrinsic multiperspectivity”. In Sect. 5, I will state more precisely and exemplify what I mean by “intrinsic multiperspectivity”, and then venture, in Sect. 6, a theoretical conjecture as to its architectural and computational foundation.

Phenomena and perceptual achievements such as the ones mentioned before raise theoretical problems that are overwhelmingly complex and deep. Therefore, it is hardly surprising that we still lack agreement on even the most basic issues. Opinions are already deeply divided on what can be considered core phenomena that can be assumed to serve as a promising empirical basis for the sharp idealisations on which any successful explanatory account of perception has to be based. As analyses proceed, agreement about how to address the corresponding phenomena in theoretical terms will become even less likely, given the fact that our theoretical understanding of apparently much simpler achievements, such as the constitution of “perceptual objects”, is rather thin (actually, this fundamental issue is almost entirely bypassed in prevailing approaches). Nevertheless, phenomena pertaining to simultaneous conflicting mental perspectives in perception cannot be regarded simply as side issues. Any theory of perception that neglects phenomena of multiperspectivity would be of marginal theoretical interest only. It poses, however, a particular challenge for perception theory that the variety of achievements that in our ordinary experience and in our ordinary discourse testify to some form of multiperspectivity can hardly be expected to rest on a single coherent set of


principles. Rather such phenomena most likely mirror a variety of different aspects of the functional and computational architecture of our mind and involve complex interactions of various subsystems, such as the perceptual system proper, systems for imagination, higher-order interpretative systems, and language. The question then arises whether there are subclasses of such phenomena that can be assumed, on an appropriate level of idealisation, to result from specific internal properties and design principles of the perceptual system proper, understood as that modular component of our mental architecture on which our perceptual achievements are based. On the basis of an appropriate conception of the functional architecture of our perceptual system, a core of common principles becomes recognisable that yield phenomena of multiperspectivity as an essential consequence. In a later section, I will outline a corresponding theoretical framework that appears to me warranted on theoretical and empirical grounds and deal with ensuing principles that conjecturally generate phenomena of multiperspectivity. However, beforehand, I will turn the spotlight on the main obstacles that have fettered the development of an appropriate theoretical account of perception in general and have disposed perceptual psychology to pay little heed to phenomena of multiperspectivity.

2 Common-Sense Intuitions on Perception

Our common-sense conceptions¹ of perception can be characterised as a blending of three different but related premises. The first premise pertains to what is regarded as the core achievement of perception, namely conveying a “truthful” representation of the external world. The second premise is intimately related to the first one and pertains to the properties and attributes of perception, which, according to our ordinary intuitions, are regarded to result from adaptations to our environment. The third premise pertains to the unit of analysis in terms of which we conceptualise perception in ordinary discourse, namely in terms of the unit of a person.² I will briefly deal with these three aspects in turn.

1 In speaking of common-sense conceptions of perception, I, in the present context, understand the term in the broadest possible way, namely as the diverse modes in which we conceive of perceptual phenomena and the process of perception itself in all contexts other than that of the natural sciences. This usage comprises not only those concepts and ways of world-making which underlie, as part of our biological endowment, our ordinary discourse about the world and our acts of perceiving – sometimes referred to as “folk physics” and “folk psychology” – but also derived concepts and notions pertaining to perceptual issues that have been developed for other purposes than those of the natural sciences, whether technological, philosophical, or of any other kind.

2 Common-sense-based intuitions about perception are furthermore marked by a preference for “concretistic” conceptions of explanation in terms of tangible objects and concrete mechanisms – hence their fascination with neuroreductionist “explanations” and their dislike of more abstract accounts (cf. Mausfeld 2010c).

2.1 Perception and “Reality”

According to our ordinary conception of perception, the senses convey to us an appropriate picture of the external world. Figure 2 illustrates, by the example of visual perception, how we basically conceive of the process of perception in our ordinary intuitions. At the core of our ordinary conception of perception thus is some kind of a (culturally refined) naïve realism, that is, the idea that the world, as it really and independently of an observer is, is mirrored in perception. We are convinced that perception basically works the way it phenomenally appears to us and that – apart from a few “perceptual illusions” – we perceive the world as it really is. Hence, we are strongly inclined to take the categories of our perceptual experiences as categories of the external world.³ On this account, perception is regarded as an entirely conspicuous process. At the same time, we are, of course, willing to accept some refinements and subtleties that are compelled by physics, which in turn claims to describe the world as it really is. This, however, does not diminish our conviction that our perception of the world is – apart from some subjective qualities and a few illusions – veridical. At the core of our common-sense conception of perception is what has often been referred to as naïve realism, i.e. the idea that the world, as it really and independently of an observer is, is mirrored in perception. While naïve realism founders already in the face of the most elementary scientific facts, say, about the properties of our sense

Fig. 2 The pictorial or similarity conception of perception

3 The predisposition to take, in Kant’s words, mere ideas for things in themselves, is the distinguishing mark of all of our mental activity. Kant thought of it as a “transcendental illusion”. The transcendental illusion is the propensity to “take a subjective necessity of a connection of our concepts ... for an objective necessity in the determination of things in themselves” (Critique of pure reason, A297/B354). Due to this propensity, the influence of which cannot be remedied by intellectual insight into it, we inevitably tend to mistake our own mental categories to hold “objectively” (cf. Grier 2001).


organs,⁴ it captures some of our basic intuitions of the mental activity of perceiving, namely being in direct touch with a mind-independent world. The inappropriateness of our ordinary conception of perception becomes already obvious when we compare the world as it is represented in perception with the world as physics describes it.⁵ The kind of entities that populate our perceptual world are not the kind of abstract entities that populate the explanatory accounts of physics. There are only two ways of characterising what we suppose to be objects in the world, and thus of assigning a meaning to the attribute “real”: One way is marked out by theoretical physics, which considers all those entities as objects of the world, and thus as real, that figure in its currently best explanatory accounts. The second way is delineated by our perceptual capacities and considers as objects of the world, and thus as “real”, all the entities that can show up in our phenomenal experience. These ways are actually, as a result of our cultural development, inextricably interwoven in our scientifically and culturally moulded common sense.

In the context of perception theory, it is crucial to be aware that we do not have (aside from theoretical physics) a way of characterising the “objects in the world” that is independent from our perceptual capacities. Thus, we cannot refer to objects in the world in a way that does not crucially depend on the output of the perceptual system. What we regard as “reality” accordingly is a product of our perceptual system. This already indicates that our concept of “reality” is the product of the specific perceptual and mental apparatus with which we are equipped. The fact that we have to presume sufficiently strong structural relations between the kind of reality as depicted by physics and the kind of reality as apparent in perception must not seduce us to mistake the meaningful categories of our mind for the way in which we describe the external world as independent of an observer.⁶

4 Most significantly, the overwhelming part of the physical energy pattern that impinges on the organism is not perceived or used for biological purposes; we can, for instance, perceive neither the plane of polarisation nor the direction of magnetic fields.

5 Note that even in cases in which our linguistic vocabulary seems to suggest that our perceptual categories have a direct counterpart in the physical world, such as in the case of “illumination” or “surface”, a closer examination reveals that physically defined categories and perceptually defined categories do not coincide and that, with respect to the functioning of the perceptual system, the occurrence of the former is neither necessary nor sufficient for the occurrence of the latter. The way we perceptually segment our world does not conform to the way we divide up things in the world when we are doing physics, an observation which Ludlow (2003, p 150) terms a “type mismatch”.

6 Historically, our inclination to pursue our theoretical enquiries in such a way that they comply with common-sense intuitions is revealingly exemplified by the propagation of the notion of representation in perception theory. The widespread acceptance of this term, which has yielded tremendous philosophical confusion, historically seems to have been enhanced by the French translations of Descartes’ profound and deep analyses of perception. Descartes used, e.g. in his Third Meditation, the Latin verbs “exhibeo” and “repraesento” interchangeably, in the sense of something that is internally presented to the mind. In seventeenth-century French both Latin verbs were translated by “representer”, so that the meaning of the term shifted, due to the impact of common-sense intuitions, from denoting a predominantly mind-internal presentation to denoting a mind-world relation. While Descartes had an internalist meaning in mind, the French translation tacitly was guided by our ordinary preference for an externalist interpretation of the term.

2.2 Can the Structure of Perception be Derived from Adaptational Needs?

Our conviction that the goal of the senses is to convey to us a basically “truthful” account of the external world predisposes us to regard the specific properties and features of perception as predominantly formed by our biological adaptive needs. In perceptual psychology, corresponding intuitions find their expression in adaptational accounts of perception according to which the core properties of perception result from the biological adaptive needs to appropriately couple the organism to the external world. Because such adaptational intuitions derive from our common-sense conceptions, they unsurprisingly have a long history in perceptual psychology and still dominate, at least implicitly, the field. Spencer (1855, p 583) already presumed “that there exist in the nervous system certain pre-established relations answering to relations in the environment”. He postulated a “continuous adjustment of internal relations to external relations”. The structure of the mind is, according to Spencer, the “result from experiences continued for numberless generations”, whereby the “uniform and frequent of these experiences have been successively bequeathed” in the process of evolution. James (1890, p 1222) lauded this as a “brilliant and seductive statement” that “doubtless includes a good deal of truth”. According to James, however, it founders: “when the details are scrutinised, many of them will be seen to be inexplicable in this simple way”.

It is a matter of course that perception must structurally mirror or at least not contradict biologically relevant aspects of the external world. This, however, is hardly an insight but rather simply rephrases from a functional point of view the kind of mental phenomena that have been singled out as an object of enquiry. From it, not much follows about the specific properties of the perceptual system. Even if the output of the perceptual system did not in a single case mirror the true manner of being of the external world (however we might conceive of it), the perceptual system still could ensure a coupling of the organism to biologically relevant structural aspects of the external world.⁷ To be sure, the perceptual system, as other computational or non-computational biological systems, has, in its evolutionary development, taken advantage of external physical regularities. From this, however, it neither follows that its core principles and its conceptual structure can be understood from these external regularities nor that considerations about adaptive purposes or about the “proper” external objects of perception play an explanatory role in accounts of the internal principles by which the perceptual system generates its outputs on the basis of specific inputs.⁸

7 This insight also finds its expression in Helmholtz’ sign theory (and in previous sign conceptions, notably Descartes’).

8 As in the case of other biological systems, an understanding of the internal functioning of the perceptual system does not rest on a diachronic analysis of its selectional history. Explanatory accounts of perception furthermore do not rest on considerations of which physical entities should be regarded as the “true” or “proper” antecedents of the sensory input (aside from heuristic purposes and our ordinary or meta-theoretical talk, in which such inquiries are inevitably embedded) (cf. Mausfeld 2002).

Furthermore, what we regard as a relevant regularity already depends on the specific nature of the system whose properties we intend to explain from those regularities. We can formulate literally infinitely many physical regularities, i.e. relations on sets of physico-mathematical entities that remain invariant under certain sets of transformations, of any degree of “unnaturalness”, under which the perceptual system has evolved. There is no a priori notion of organism-relevant physical regularities. The nature of the organism determines which regions of the parameter space of the physical world are regarded as an environment. What constitutes a regularity depends on the structure of the organism under scrutiny, such as its size, the spatial and temporal integration properties of receptors and other neural structures, the properties of its memory, and its conceptual capacities.

Nevertheless, the idea that the nature of our perceptual system is formed by a “continuous adjustment of internal relations to external relations” is mostly regarded not as a hypothesis about the relation between specific external regularities and internal properties of the perceptual system but as a kind of self-evident truth. It then expresses the conviction that the fundamental properties of the perceptual system can be adequately accounted for by adaptational factors and that no essential explanatory importance needs to be attached to factors such as internal physical, architectural, or computational constraints in the evolutionary development of our mental architecture. Such a conviction again derives its apparent plausibility from common-sense intuitions. It sharply conflicts with the wealth of empirical evidence that has been gathered in Gestalt psychology and other areas of cognitive science, ethology, and evolutionary biology, an issue to which I will return in a later section. Here, it may suffice to mention that the principles underlying perception must not only be appropriate with respect to relevant properties of the external world but also be functionally adequate, i.e. they have to fit into the entire perceptual and cognitive architecture.

2.3 The Unit of Analysis

The detrimental impact of common-sense conceptions on perception theory stems not only from intuitions about the “truthfulness” and adaptive usefulness of perception but is as much due to our ordinary intuitions about the architecture of our mind. In fact, the entire idea of “architecture of our mind” plays almost no role in our ordinary thinking about mental phenomena. Common sense tells us that it is our integral self which is in direct contact with the world, and no brain, no intermediate substrate and no properties of whatever happens in the body between the sensory stimulation and the percept figure in its ordinary accounts. In our ordinary conceptions of perception, we discount all the processes that occur between the distal


causes and the percept. Accordingly, we are convinced that we are in direct contact with the world. However, the impression of the unity of our mental life and the conviction that our mind is an integral whole are core – and functionally most important – achievements of our brain. These achievements cannot be taken to convey to us a conception of the underlying functional architecture. In fact, the theoretical picture of the functional architecture of our mind that is emerging in cognitive science, and whose origin can be traced back to Plato and Aristotle, stands in marked contrast to our ordinary conceptions. According to our theoretical insights, we can compare our mind with a huge orchestra in which a plenitude of different instruments act together to yield all our mental achievements. What we can consciously experience is only the sound of the orchestra of our mental faculties working as a whole. We have, however, no experiential access by which we could identify which instruments make which contributions and the precise way in which they act together.

Because ideas about a “functional architecture of our mind” practically have no place in our ordinary conceptions of perception, our ordinary conceptions cannot but regard the integral person as their unit of analysis. Common-sense-based conceptions of perception have no need to distinguish between the contributions of different subsystems, and thus between the output of a specific subsystem and the potential uses it is put to by other systems. Accordingly, they tend to identify the output of a specific system, namely the perceptual system, with the results of the functioning of the entire orchestra of mental subsystems, including interpretative ones used for the pragmatics of referring. Due to the fact that in our acts of perceiving, we, i.e. a person, refer to things in the world, reference and ensuing notions play a crucial role in our ordinary conception of perception.

Scientific enquiries of perception, however, have to pursue a different path, which is dictated by their specific explanatory purposes. Perception theory regards as its unit of analysis a specific subsystem of our mind and attempts to identify the abstract internal principles on which its achievements are based by abstracting away from the contributions of other systems of the mind.⁹ Once we regard, in perception theory, a specific subsystem, namely the perceptual system, as the unit of analysis, the notion of reference becomes void of meaning. Consequently, the notion of “reference” and related notions such as “perceptual error”, “veridicality”, or “proper function” have no place in explanatory accounts of the functioning of the perceptual system, and there are no explanatory lacunae in perception theory to be

9 It is a matter of course that the assumption that the perceptual system qualifies as a subsystem of the brain that can, by standard methodological practices of idealisation and abstraction, be studied in isolation in no way implies the denial of dependencies with other systems. With respect to rational enquiry, the question is not how in reality things are related to each other. The nature and functioning of the perceptual system is related to and dependent on various aspects of reality such as its phylogenetic development, on the metabolic system, the immune system, or a great variety of other internal computational systems, or on the physics of the brain. The question rather is what constitutes an appropriate level of idealisation for successful explanatory frameworks of perception.

Intrinsic Multiperspectivity: Conceptual Forms and the Functional Architecture


filled by introducing these notions. In particular, the notion of “perceptual error”, which is a distinguishing element of our ordinary discourse about perception, is of no avail for perception theory (cf. Mausfeld 2002). As Helmholtz (1855, p 100) already emphasised, there are no “errors” with respect to the perceptual system’s own principles: “The senses cannot deceive us, they work according to their established immutable laws and cannot do otherwise. It is us who are mistaken in our apprehension of the sensory perception”. Notions such as “error” or “reference” pertain to a different level of analysis, e.g. that of a person, not to the level of a specific subsystem.10 The preoccupation with “perceptual illusions”, which is still characteristic of large fields of perceptual psychology and philosophy of perception, indicates the extent to which perceptual research is still driven by common-sense conceptions of perception. Notions of “perceptual errors” have greatly impeded theoretical progress in perception theory and fall behind clarifications that were achieved some 100 years ago and have since been expressed explicitly, again and again, by Helmholtz, Hering, Koffka, Köhler, and many other perceptual psychologists. Outside the context of scientific inquiry, the intuitions that characterise our ordinary conception of perception are, needless to say, appropriate and functional. Our ordinary intuition that perception ties us directly to the categories of the world is itself a core achievement of our brain. It becomes deeply misleading only when we employ it in perception theory. Once we decide to theoretically understand, in the context of the natural sciences, the abstract principles by which a specific biological system, the brain, generates its meaningful categorisations and brings forth its perceptual achievements, we have to follow the same kind of methodological principles that we pursue in the case of other objects of nature.
This stance, however, which is metaphorically illustrated in Fig. 3, appears highly alien and unmotivated from the standpoint of our ordinary intuitions about perception, and thus poses almost insurmountable difficulties for developing appropriate theoretical accounts of perception. With respect to physics, it is well known that its entire history can be read as a continuous quarrel with common-sense intuitions. What holds for physics holds even more strongly for perception theory. We are held captive by the way perception appears to us. Accordingly, pictorial and similarity-based conceptions of perception are so deeply entrenched in our conception of the world and our

10 Most of the currently prevailing philosophical accounts of perception assume that percepts have truth conditions and that the “meaning” of a percept should be conceived in terms of its alleged reference to the external world, i.e. in terms of the conditions under which a corresponding proposition is true. Such accounts rest on some kind of local mapping conception of perception and remain within the confines of common-sense conceptualisations of perception. Whatever their philosophical merits may be, they are of no interest to perception theory.


R. Mausfeld

Fig. 3 The target of enquiry for perception theory: the perceptual system regarded as a biological object

interaction with it that it is hardly surprising that they exercise a continuous impact on perception research.11 With respect to mental phenomena, we are convinced that our ordinary intuitions about the functioning of the mind are basically correct and that we have some kind of privileged access to the way perception works. However, it is an essential part of the functioning of our brain that it does not allow us immediate introspective insights into its principles. It is part and parcel of the functional architecture of our brain that it almost completely hides the functioning of the perceptual system

11 Also in philosophy of perception, conceptions prevail that are based on philosophically sophisticated varieties of the intuitions underlying naïve realism, such as “critical realisms”, or “scientifically informed realism”. They are often accompanied by some kind of metaphysical materialism, epistemological reductionism, and the idea that the “meaning” of a percept is determined by its reference to the external world. In contradistinction to other varieties of philosophical realism, structural realism (Russell 1927, cf. also Maxwell 1970; Worrall 1989; Ainsworth 2009) radically dispenses with any kind of realist common-sense intuitions, while still preserving our natural preferences for a realist conception of the world. Structural realism is based on the theoretical observation that from the structure of our perceptions we can “infer a great deal as to the structure of the physical world, but not as to its intrinsic character” (Russell 1927, p 400), and argues that our epistemic access to the external world is restricted to its structural aspects. Such an account is consonant with sign-conceptions of perception. If correspondingly also our epistemic access to the nature of perception is confined to purely structural aspects, qualia by their very nature will escape a deeper explanation in terms of the conceptual frameworks of natural science. Non-structural aspects of perception would then be as much beyond the limits of scientific inquiry as are the non-structural aspects of the physical world, i.e. the “thing in itself”. Descartes noticed this and emphasised that the character of qualia, such as the sensation of redness, is not intelligible for us by the rational means by which we investigate the causal structure of the nonexperiential world. Qualia are, in Descartes’ formulation, “ordained by nature”. Nature has chosen that “certain corporeal events should be “flagged” for us in certain way”. But “there is, for Descartes, no scientific explanation of the flagging system which actually obtains” (Cottingham 1986, p 140).


from our conscious experience. What we consciously perceive is only the end product of the perceptual system. Only by the standard methodological procedures of the natural sciences can we attempt to identify the principles on which perceptual achievements are based.

3 The Standard Model of Perception

The development of an appropriate theoretical account of the internal principles that underlie our perceptual achievements is, unsurprisingly and almost inevitably, impeded by the obstructive influences of common-sense intuitions. Variants of naïve realism and ensuing conceptions still prevail as tacit background assumptions in perceptual psychology, albeit often concealed by technically sophisticated formulations. These conceptions have allied themselves with the idea, originating from sensory physiology, that the sense modalities can be considered as the natural starting point and as the natural units of analysis for perceptual psychology. As a result, the nature of the perceptual system has predominantly been carved up along the lines of sensory input channels, and perceptual psychology has been organised in terms of elementary perceptual attributes. Corresponding ideas have condensed in what has become the Standard Model of Perception, as depicted in Fig. 4. The Standard Model of Perception, in its many guises, is grounded on a distinction between sensations, as the “raw material” of experience, on the one hand, and perceptions, which are typically conceived of as referring to objects in the external world, on the other. According to this model, the process of perception can essentially be described as successive stages of “information” processing by which the sensory input is transformed into meaningful categories and distinctions, namely the “perceptions”, whose meanings derive from what they refer to in the external world. More specifically, it is presumed that the sensory input provides information about elementary sensory qualities, such as colour and motion, and that by so-called lower-level processes these elementary qualities are glued together by some associative or inferential machinery.12 The remaining gap to the percept is then allegedly bridged by what is usually referred to as “cognitive processes” or “higher-order” processes, which are assumed to account for everything that cannot be explained by the so-called lower-level processes. The Standard Model presumes that the biologically given conceptual endowment underlying perception is confined to a rather thin set of sensory concepts from which all other concepts subsequently have to be derived by some inductive or associative machinery.13 The Standard Model is already deeply flawed on conceptual grounds. Its key conceptual flaw is that it is based, in line with our common-sense intuition about perception, on a fatal conflation by which the output of the system under scrutiny is mistaken for its input. The Standard Model surreptitiously borrows semantic distinctions, such as “perceptual object”, “surfaces”, “shadows”, or “illumination”, from the output of the perceptual system and uses them for a description of the input (particularly when the Standard Model is supplemented by an inverse optics approach of “recovery of world structure”). By its very conceptualisation of perception, the Standard Model dodges an essential task of perceptual research, namely the identification of the internal conceptual structure of perception and the symbolic objects to which the computational procedures of the perceptual system apply. Characteristically, the Standard Model shirks the explanatory task of coming to terms with the most fundamental theoretical notion, namely that of a “perceptual object”, by deriving this notion from an allegedly “corresponding” notion of an “external world object” and by placing the explanatory burden on experience and some inferential machinery.

Fig. 4 The standard model of perception

The Standard Model trivialises what can be regarded as the Fundamental Problem of Perception Theory, namely the identification of the principles by which the perceptual system generates, given a specific physical spatio-temporal energy pattern as input, an output that is organised in terms of semantic categories. Accordingly, the problem of perceptual semantics has not even been recognised by the Standard Model

12 Corresponding ideas are, however, profoundly inadequate already on conceptual grounds (as has also been emphasised by Gestalt psychology). Insights into what can be achieved by inductive procedures made clear that no general inductive machinery, however powerful, can derive from the sensory input, and thus, more generally, from experience, the kind of internal conceptual structure that is explanatorily required (unless it is already itself based on a conceptual structure as powerful as the one to be inferred). However powerful the inductive machinery is assumed to be, there is no way to arrive at symbolic objects that are logically more powerful in the sense that their structure is not expressible in the logical language in which we describe the bases of the inductive procedure (cf. Fodor 1980). Since essentially no perceptual object or attribute of our perceptual system is definable in the logical language by which we describe the physical energy pattern that constitutes the sensory input, the internal structure underlying perceptual meaning cannot be attained by inductive procedures (unless we surreptitiously describe the input in terms of the yet-to-be-explained output categories).

13 Apart from being unmotivated from the point of view of evolutionary biology, this presumption does not derive from empirical evidence but rather expresses a deeply entrenched epistemological prejudice, namely the empiricist conception of the mind, which itself can be regarded as a philosophically refined expression of our ordinary pre-conceptions about mental achievements.


as a serious theoretical challenge because it is concealed precisely by one of the eminent achievements of our brain, namely the externalisation of its own semantic categories into what we regard as the external world.14 In the history of perceptual psychology, the strongest critique of the Standard Model was advanced by Gestalt psychologists, on the basis of accumulating empirical evidence. They furthermore recognised that the Standard Model’s emphasis on issues of processing results from mistaking the explanatory task of neurophysiology for the explanatory task of perceptual psychology, and thus from conflating different levels of analysis. However, the Standard Model, despite its utter conceptual and empirical inadequacy, still prevails in perceptual research, where it is mostly regarded as a truism (cf. Mausfeld 2002, 2003a, 2010b).15 Fortunately, more appropriate conceptions have emerged during recent decades, in a convergence of empirical findings and theoretical ideas from ethology, research with newborns, perceptual psychology, and cognitive science. These conceptions, which were foreshadowed by Gestalt psychology and theoretical insights during the seventeenth century, provide a theoretical framework for perception that radically deviates from the Standard Model. Already at the beginning of the last century, the empirical and theoretical evidence in support of corresponding ideas was enormously rich. But only after advances in the computational sciences provided a

14 The conceptual defects of the conceptions underlying the Standard Model had already been clearly identified in the seventeenth century. By corresponding enquiries, it became evident that the problem of perceptual meaning cannot be resolved by deferring the explanatory duty to the sensory information. Even the core notion of a “perceptual object” cannot be derived, by whatever mathematical machinery, from the sensory input. Hume was well aware of this problem and noted, in his Treatise of Human Nature (Book 1, Part IV, sec II), that the senses “give us no notion of continu’d existence, because they cannot operate beyond the extent, in which they really operate. They as little produce the opinion of a distinct existence, because they neither can offer it to the mind as represented, nor as original. . . We may, therefore, conclude with certainty, that the opinion of a continu’d and of a distinct existence never arises from the senses.” Therefore, it has generally been noticed since the seventeenth century that there is an explanatory gap to be filled.

15 The Standard Model is deeply rooted in the behavioristic tradition. One can hardly overestimate the impact and longevity of behavioristic thinking. Although it is concealed by the cognitivist jargon it has adopted, its hallmarks are easily discernible, namely an underlying empiricist conception of mind, an inductivist and empiricist epistemology, and a bias favouring investigations of input–output relations at the expense of inquiries into internal principles. Miller (1979) observed two decades after the beginning of the so-called cognitive revolution: “What seems to have happened is that many experimental psychologists who were studying human learning, perception, or thinking began to call themselves cognitive psychologists without changing in any obvious way what they had always been thinking and doing – as if they suddenly discovered they had been speaking cognitive psychology all their lives. So our victory may have been more modest than the written record would have led you to believe”. In a similar vein, Fodor (2003) noticed: “In fact, though, practically all experimental psychologists and philosophers of mind continue to be behaviourists of one kind or other. They have just ceased to notice that they are”. Even worse, the impact of behaviourist pre-conceptions is mostly not regarded as a flaw but claimed as a virtue. Roediger (2004) gives an unusually explicit statement of this attitude: “Behaviorism is alive and most of us are behaviorists”.


new conceptual apparatus, could these ideas be taken up and further explored in a fruitful manner. Although the theoretical picture that is emerging from corresponding enquiries is still very skeletal, it has already yielded intriguing results with respect to a range of significant phenomena and has suggested novel and fruitful questions about the internal architecture of perception. It is conceptually and empirically well supported and has considerably gained in explanatory width and depth. Our confidence in the appropriateness of the emerging theoretical framework furthermore is, as always in the natural sciences, fortified by the fact that it is yielded by a theoretical convergence of quite different disciplines. All the same, these developments have not yet gained wider acceptance because they are grossly at variance with our common-sense conceptions about perception.

4 Fundamental Aspects of the Functional Architecture and of the Conceptual Forms Underlying Perception

The Fundamental Problem of Perception Theory originates from the fact that the output of the perceptual system, namely meaningful categories, is vastly underdetermined, as it were, by the sensory input, namely physico-geometric energy patterns. Approaches that are in line with the ideas underlying the Standard Model tend to gravely underrate the explanatory gap involved because they confound output and input categories. As a result, these approaches underestimate the quantity and complexity of the biologically given conceptual endowment of the perceptual system explanatorily required to account for its achievements. They rather exhibit a preference for ascribing to the system only a meagre set of sensory concepts.16 While corresponding ideas are suggested by common-sense intuitions, they prove to be profoundly inadequate in the context of the Fundamental Problem of Perception Theory. The theoretical framework that has been emerging during the last decades from a convergence of different disciplines acknowledges, in line with a growing body of experimental evidence, that we can only deal in an explanatorily adequate way with the Fundamental Problem of Perception Theory when we are willing to ascribe to the perceptual system a biologically given conceptual structure that is rich enough to account for its core perceptual achievements. Of course, the theoretical picture whose contours are presently emerging on the horizon is still faint and inevitably speculative. I cannot evaluate here the abundantly rich empirical evidence that speaks in favour of this picture. Since my agenda here is primarily programmatic in

16 If we regard the perceptual system, in a rather loose sense, as a computational system, we have to explicitly specify the data formats on which computational processes are based. Approaches that follow the lines of the Standard Model of Perception tend to confine the type of data formats that are part of the biological endowment of the perceptual system to those that can be defined in terms of elementary sensory features.


character, I will attempt to venture some ideas about the nature of conceptual forms that appear to me theoretically motivated and empirically warranted on the basis of what can be distilled from experimental findings in various domains. Ethology, perceptual psychology, and research with newborns and babies have provided, notably in more recent studies, a rich and fruitful basis for theoretical speculations about the nature of the conceptual forms underlying perception.17 The available evidence suggests that the perceptual system is biologically endowed with an exceedingly rich set of complex conceptual forms in terms of which we perceive our “external world”, such as “surface”, “physical object”, “intentional object”, “event”, “potential actor”, “self”, or “other person”, with associated attributes, such as “colour”, “shape”, or “emotional state”, and their appropriate relations, such as “causation”, “intention”, or, in the case of “event”, “precedence”, “beginning”, “end”, or “subsequence”. These conceptual forms cannot be derived, by whatever mechanisms of inductive inference, from the elementary sensory predicates delivered by the senses. We thus have to assume that the perceptual system is biologically endowed with a rich conceptual structure in terms of which the signs delivered by the senses are exploited. At the core of this theoretical picture is the idea that, in more complex organisms, the sensory input serves as a kind of sign for the activation of biologically given conceptual forms, which determine the data format of the computational processes involved.
The conceptual forms, say for “surface”, “food”, “enemy”, or “tool”, cannot be reduced to or inductively derived from the sensory input but have to be regarded as part of the specific biological endowment of the organism under scrutiny.18 In order to account for the relation between the sensory input and the irreducible and complex perceptual concepts that constitute the output of the perceptual system, it is useful to distinguish at least two levels of abstraction19 with respect to the functional architecture of the perceptual system, namely a

17 These speculations can only be regarded as fruitful to the extent that they contribute to a core task of perception theory, namely the development of an explanatory framework of sufficient explanatory depth and explanatory width for a range of basic phenomena. Due to this requirement, regarded as a matter of course in the natural sciences, any confidence in corresponding theoretical speculations will not be based on isolated experimental findings but rather on a deeper convergence of quite different disciplines on a joint theoretical perspective.

18 It is important to be aware that these linguistic appellations of the conceptual forms of the Perceptual System are only makeshift descriptions of non-linguistic entities (which are, of course, further shaped by the properties of subsequent linguistic and interpretative systems).

19 Employing different levels of abstraction does not eo ipso imply any ontological commitments as to the physical or biological organisation of the system under scrutiny. However, additional arguments, e.g. from evolutionary biology, can be advanced in favour of the assumption that there are in fact different and to some interesting degree modular subsystems involved. Therefore, I identify the required levels of abstraction with actual (idealised) subsystems (and the logical language by which we describe the functioning of a system with the logical language in which the actual operations of the system can be expressed).


Fig. 5 Functional architecture of the perceptual system

Sensory System, on the one hand, and Perceptual System, on the other hand (as illustrated in Fig. 5).20 The Sensory System, understood in the technical sense used here, deals with the transduction of physical energy into neural codes and their subsequent transformations into codes that are “readable” by and fulfil the structural and computational needs of the Perceptual System; we can refer to these codes as “cues” or “signs”.21 For each given sensory input, all codes that are computable from this sensory input by the available computational apparatus and that can be “read” by the Perceptual System are simultaneously provided at the interface to the Perceptual System. Thus, the Sensory System can be understood as providing a specific subset of the (infinitely many) relations etc. of the sensory input, namely that subset that can be computationally exploited by the Perceptual System in terms of its conceptual forms. The coding properties of the Sensory System are highly

20 I keep the term “perceptual system” for loosely referring to the entire modular system of perception, which includes the Sensory System as well as the Perceptual System as characterised here. It is important to note that the distinction between a Sensory System and a Perceptual System as employed here is entirely different in character from the sensation–perception distinction as introduced by Spencer, James, Wundt, or Helmholtz, which refers to an alleged hierarchy of processing stages, in terms of the same data formats, by which the sensory input is transformed into “perceptions”. On this highly influential distinction, sensations are regarded as the “raw material” of experience, while perceptions are typically conceived of as referring to objects in the external world. Sensations and perceptions were generally held to “shade gradually into each other, being one and all products of the same psychological machinery of association” (William James 1890/1983, Ch. XIX), or of some other inferential machinery.

21 Note that for the generation of “signs” by the Sensory System, it is irrelevant whether the sensory input has been causally generated by a real object, by a picture of this object on a computer screen or by an appropriate stimulation of nerve cells. Hence, the “external objects” in terms of which we describe, in our ordinary discourse, the external world, do not figure in explanatory accounts of perception.


idiosyncratic,22 with respect to our physical descriptions of the external world, and often strongly deviate from our ordinary intuitions and expectations. The distinguishing feature of the Sensory System is that the data formats on which its computations are based can be entirely defined in terms of the elementary physico-geometrical language in which we describe the geometrically organised sensory input, i.e. in terms of lines, gradients, texture, contour, and luminance. The Perceptual System, on the other hand, contains the rich set of complex conceptual forms (such as “perceptual object”, “surface”, “food”, “enemy”, “tool”, and “causal event”) that are not definable in terms of the conceptual forms of the Sensory System or reducible to them. These conceptual forms thus define the entities or abstract objects for the internal computations of the Perceptual System. They are the basis for our “perceptual ontology” and thus constitute the realm of “perceptual objects”. We can conceive of the Perceptual System as a self-contained system of “perceptual knowledge”, an “integrated complex represented in the mind” which pertains to “a specific domain of potential fact, a kind of mentally constructed world” (to borrow Chomsky’s 1981, p 6, description of a different subsystem of the mind). We can refer to what is coded in the structure of its conceptual forms as the “internal semantics” of perception. According to the architectural conception proposed here, core aspects of most types of the explanatorily required built-in “knowledge” assigned to “core knowledge systems” by Spelke (2000) and Spelke and Kinzler (2007) or to “conceptual-intentional systems” in the minimalist programme of linguistics have to be imputed to the Perceptual System already.
At the core of this theoretical conception is the idea that, in more complex organisms, perception is based on an internal computational organisation in terms of abstract data types or conceptual forms; and that the sensory input serves as a kind of sign for the activation of these biologically

22 With respect to sensory systems, no one will doubt the obvious fact that their coding properties cannot simply be derived from alleged adaptive requirements. For instance, we have no problems in accepting the idea that the qualitative experience of colours, say the experience of a certain red, and also their structural relations, say the fact that red is perceptually nearer to blue than to green, cannot be derived, by however sophisticated mathematical transformations, from the sensory input or from external physical regularities. Rather, we regard these properties as being essentially codetermined by internal physical and neural constraints, i.e. by constraints that are idiosyncratic and contingent with respect to any adaptive coupling of the organism to its environment. Oddly enough, the same considerations receive less credibility in the case of more complex properties of our perceptual system. However, the structure of the conceptual forms, on which the computations of the perceptual system are based, will, in its evolutionary development, most likely be essentially co-determined by constraints that stem from the fact that only certain physical and computational channels were open as feasible evolutionary paths. The conceptual forms of the perceptual system must not only be adequate with respect to the external world (however one understands such a requirement); they must also be computationally adequate, i.e. they have to fit into the entire computational architecture. Any attempt to understand the nature of the conceptual forms of the perceptual system merely from the perspective of an adaptive coupling to the physical world will most certainly be doomed to fail. Apart from the fact that it is based on a misguided and defective research heuristics, it grossly underestimates the rich internal constraints that arise within the computational architecture of the biological system under scrutiny.


given conceptual forms, which constitute the data formats of the computational processes involved. This conception is metaphorically illustrated in Fig. 6. The output of the Perceptual System consists of structured entities that are generated by its proprietary computational machinery and that represent legible instructions for subsequent systems by which interpretations in terms of external world properties are rendered. The instructions provided by the Perceptual System are exploited and “interpreted” by subsequent interpretative systems (which include belief systems and systems comprising non-perceptual knowledge about the world), systems for imaginations, or systems by which the phenomenal percept is generated. Thus, the Perceptual System provides, at its interfaces to higher-order interpretative systems, options for exploiting activated conceptual forms in terms of the conceptual forms and computational requirements of these higher-order systems. A demarcation between the Perceptual System and higher-order interpretative systems is much more difficult to achieve than between the Perceptual System and the Sensory System. Hence, the line of demarcation depends to some extent on theoretical and meta-theoretical preferences. At any rate, operations that enable us to refer to properties of our external world and thus the computational and conceptual apparatus for externalising, as it were, the conceptual forms of the Perceptual System have to be assigned to higher-order interpretative systems rather than to the Perceptual System itself. The conceptual forms of the Perceptual System can be understood as those data formats underlying its computations that are “visible” at its interfaces with subsequent systems and thus can be “read” by those systems. However, the output of the Perceptual System cannot be expected to be fully expressible in terms of higher-order interpretative and phenomenal systems. Rather, it can be used by other

Fig. 6 Perception as triggering of biologically given conceptual forms


internal systems, such as ones subserving motorial purposes, and remain partly phenomenally silent, or un-interpreted by higher-order interpretative systems. The conceptual forms of the Perceptual System can be investigated with respect to various aspects. I will briefly address structural aspects, i.e. aspects of their logical form, their dynamical role in structure-building computational processes within the Perceptual System, and the question whether such conceptual forms can be assigned some biological plausibility from an evolutionary point of view.
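
The two-level architecture described in this section can be illustrated by a deliberately crude sketch. It is not a model of any actual perceptual mechanism. All names in it (`Cue`, `sensory_system`, `perceptual_system`, `FORM_TRIGGERS`) and the threshold of 0.2 are hypothetical choices made only for illustration. The one point it is meant to show is that the conceptual forms are given in advance and are merely activated by sensory codes, rather than being derived from them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Cue:
    """A sensory code ("sign"), defined purely in physico-geometric terms."""
    name: str
    value: float


def sensory_system(luminance: List[float]) -> List[Cue]:
    """Transform a 1-D luminance pattern into cues readable by the Perceptual System."""
    if not luminance:
        raise ValueError("luminance pattern must not be empty")
    # Absolute differences between neighbouring luminance samples.
    steps = [abs(b - a) for a, b in zip(luminance, luminance[1:])]
    return [
        Cue("mean_luminance", sum(luminance) / len(luminance)),
        Cue("max_gradient", max(steps, default=0.0)),
    ]


# Hypothetical trigger conditions. The conceptual forms themselves are given in
# advance; the cues only select which of them is activated.
FORM_TRIGGERS: Dict[str, Callable[[Dict[str, float]], bool]] = {
    "surface": lambda cues: cues["max_gradient"] < 0.2,
    "physical object": lambda cues: cues["max_gradient"] >= 0.2,
}


def perceptual_system(cues: List[Cue]) -> List[str]:
    """Return the names of the conceptual forms activated by the cues."""
    readable = {cue.name: cue.value for cue in cues}
    return [name for name, trigger in FORM_TRIGGERS.items() if trigger(readable)]


uniform_patch = perceptual_system(sensory_system([0.5, 0.5, 0.5]))
sharp_step = perceptual_system(sensory_system([0.1, 0.1, 0.9]))
```

In this sketch, the vocabulary of `sensory_system` contains only luminance and gradients, whereas labels such as “surface” occur only on the side of `perceptual_system`. Nothing in the cue vocabulary defines these labels, which mirrors, in a toy setting, the claim that output categories are not definable in the language of the input.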

4.1 Structural Aspects of Conceptual Forms

As to the structural aspects, I have to confine myself to a few more intuitive remarks guided by conceptual considerations because our current understanding is so poor as to make any attempt at a more formal characterisation premature (this situation is exacerbated by the problem that, for well-known reasons, conceptual forms cannot be defined by the extensional notion of a set of objects). On the assumption that the biological system under scrutiny can be regarded as a computational one, we can conceive of the conceptual forms of the Perceptual System as abstract structures, each of which has its own proprietary types of parameters, relations, and transformations that govern its relation to both other conceptual forms and sensory codes. The values of the free variables of a conceptual form in general will not be – and, for subsequent computational processes, need not be – exhaustively specified by the activating input. Empirical evidence furthermore indicates that the conceptual forms of the Perceptual System can be classified into different types. Types are sets of computational objects with uniform behaviour which usually can, in the meta-language we employ, be named (e.g. “surface”). Types are associated with constants, operators, variables, and function symbols by which they code the constraints on the computational processes associated with certain “perceptual objects”. For instance, computational processes associated with “perceptual objects” of the type “living animal” pertain to internal causal analyses in terms of “hidden powers” and “essences” of those types of objects; computational processes associated with “perceptual objects” of the type “mykind” pertain to causal analyses in terms of “mental states”. The hierarchy of types constitutes the built-in knowledge about entities, situations, actions, perspectives etc. of our perceptual world. Conceptual forms and their associated computational processes provide a “grammar of vision”, as it were.
“Employing this “grammar of vision” – largely innate – higher animals are able to “read” from retinal images even hidden features of objects, and predict their immediate future states, thus to “classify” objects according to an internal grammar, to read reality from their eyes” (Chomsky 1975, p 8). From the theoretical perspective proposed here, there is no difference of principle, with respect to the fundamental mode of operation of the perceptual system, between the perception of material properties and the perception of, say, mental states of others.
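The idea of a conceptual form as an abstract structure with typed parameters, some of which the activating input leaves open, can be loosely illustrated in code. The following is only a minimal sketch under strong simplifying assumptions: the class `ConceptualForm`, its methods, and the parameter names `orientation`, `depth`, and `reflectance` are hypothetical illustrations and are not part of the account given here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptualForm:
    """Hypothetical stand-in for a conceptual form: a named type with parameters."""

    type_name: str
    # A value of None marks a free parameter that the input has not specified.
    parameters: dict[str, Optional[float]] = field(default_factory=dict)

    def free_parameters(self) -> list[str]:
        """Names of the parameters still left open by the activating input."""
        return [name for name, value in self.parameters.items() if value is None]

    def is_fully_specified(self) -> bool:
        """True only if the input has fixed every parameter."""
        return not self.free_parameters()


# Illustrative instance of the type "surface"; the parameter names are assumptions.
surface = ConceptualForm(
    "surface", {"orientation": 30.0, "depth": None, "reflectance": None}
)
print(surface.free_parameters())  # ['depth', 'reflectance']
```

The sketch captures only the point that a form can be activated while its free variables are not exhaustively specified; it says nothing about relations or transformations between forms.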


R. Mausfeld

Conceptual forms can be regarded as semantic atoms of the internal semantics of perception. They express the core semantics of minimal meaning-bearing elements. Conceptual forms generate semantic and syntactic molecules and constitute, together with the computational apparatus of the Perceptual System, the combinatorial language of perception, as it were. The Perceptual System can be expected to have its own computational and generative principles for the structure-building processes by which systematically related patterns of activated conceptual forms are generated, in particular principles that pertain, e.g. to an evaluation metric, to the satisfaction of internal constraints, or to more global coherence constraints. For instance, perceptual phenomena suggest that it possesses a rather idiosyncratic evaluation metric (as mirrored in phenomena traditionally discussed under the heading of “cue integration”) and likewise idiosyncratic principles by which it glues together fragmented structures of activated conceptual forms into a globally more coherent structure (as witnessed, e.g. by phenomena pertaining to so-called impossible objects, or to the “dual nature” of picture perception). Conceptual forms are organised in highly hierarchical structures. They comprise, as part of their structure, relational parameters that act as computational synapses, as it were, to other computational forms and whose values define the strength of the intrinsic connections between conceptual forms. Accordingly, conceptual forms build systematically connected packages whose nature expresses the kind of built-in “knowledge” of the Perceptual System; also the activated conceptual forms offered at the interface to higher-order systems come in systematic bundles.

4.2 Biological Plausibility of Conceptual Forms

Because of the highly abstract and logically complex nature of conceptual forms, adherents to the Standard Model tend to regard them as biologically implausible entities and thus as explanatorily unwanted and dubious concepts for perception theory. However, which kind of theoretical notions are biologically plausible and which are not cannot be assessed in an a priori way but only by the explanatory success of an entire theoretical framework. Within the framework of evolutionary biology, independent lines of theoretical arguments can be discerned by which the biological emergence of abstract data types or conceptual forms can be motivated. These arguments are based on the observation that modularity, in some form or other, presumably is, at all levels of biological organisation, the basis for the evolvability of complex systems and a driving force in their evolution (e.g. Kirschner and Gerhart 1998). Accordingly, we can also expect computational systems, such as those subserving perceptual achievements, to exhibit a high degree of modularity with respect to the data structures over which the corresponding computations are defined. An evolutionarily increasing differentiation or specialisation of computational subsystems mediating between the front-end side of sensory transduction, on the one hand, and the motoric effector side, on the other, will accordingly yield an increasing number of interface problems that the system has to solve as a result of its increasing


modularisation. For organisms with the simplest types of nervous systems, we can observe that the integration of different sensory channels is kept to a minimum, as required for instant motor control and action selection. An example is Cubozoans or box jellyfish, which possess 24 eyes of 4 different types, where each type presumably serves a different adaptive function (Nilsson et al. 2005; Jacobs et al. 2007). Computationally more thorough degrees of informational or conceptual integration in organisms with multiple sets of eyes can be found at higher evolutionary levels of neural organisation (e.g. in jumping spiders, Harland and Jackson 2000). For organisms with more complex nervous systems, comparative neurobiology and ethology offer many findings that indicate that the same type of sensory input can be exploited by different computational subsystems, each having its proprietary data type, apparently without being conceptually integrated in a more thorough and comprehensive way (cf. Tinbergen’s (1951) classical example of the egg-roll behaviour of the grey goose, which presumably employs, depending on context, two disparate and apparently non-integrated concepts with different triggering conditions, namely “egg outside of the nest” vs. “egg inside the nest”). Although our current knowledge of the evolutionary emergence of abstract data types/conceptual forms (also referred to as “symbolic representations”, e.g. Gallistel 1998) is still wanting, we can infer some general features from synchronic investigations of extant species as well as from insights into the role of internal (i.e. non-adaptational) physical principles in evolutionary processes (e.g. Kirschner and Gerhart 1998; Kitano 2004; Wagner et al. 2005; Müller 2007; Goodwin 2009; Newman and Bhat 2009), principles that presumably have their analogues on computational levels of biological systems.
On the assumption that complex perceptual systems are computational systems and that modularity, with respect to corresponding data formats, is a driving force for their evolutionary differentiation, abstract data types will likely emerge as a consequence of internal computational and physical requirements of the functional organisation of the evolving system. In the course of an increasing evolutionary differentiation of the neural substrate that processes the information from various sensory channels, the system is increasingly faced with the functional demands of computationally tying together different subsystems by a joint data format that is sufficiently abstract for an integration of the information provided by different subsystems. While conceptually shallow interfaces may suffice in simple systems, in which different subsystems can be closely tied to single adaptive functions, such as recognition of prey, of mates, or of conspecific rivals, more complex systems would yield an increasing number of executive conflicts at the motorial interface unless they exhibit an appropriate degree of inferential and conceptual integration between different subsystems. In this sense, the biological tendency for an evolutionarily increasing amount of modularisation spurs and enforces, with respect to computational systems, an increase in data abstraction and hence the development of conceptual forms. Because a data type together with its associated operations can be regarded as a computational module, the availability of conceptual forms themselves can be regarded as an extreme variant of modularity. If, as seems not unreasonable to conjecture, conceptual forms are evolutionarily brought forth by internal computational requirements, their specific forms are not


solely constrained by the adaptive requirement of coupling the organism as an entirety to its environment. Rather, the specific structure of the conceptual forms underlying perception will most likely be essentially co-determined and shaped, within the apparently rather broad latitude of design options that are left open by global adaptational restrictions, by powerful internal constraints that arise during the evolutionary development of a system of this complexity. Accordingly, the conceptual forms on which the computations of the Perceptual System are based have their own properties, which can be rather surprising when viewed exclusively from the perspective of an adaptive coupling to the external world. In fact, the most complex perceptual achievements, such as seeing invisible properties of objects (e.g. material qualities), intentional properties of objects (e.g. tools), or mental states of others, only became possible by evolutionarily decoupling the output of the Perceptual System from the information provided by the sensory codes and by furnishing the Perceptual System with its proprietary types of abstract data types or conceptual forms.

4.3 Dynamic Aspects of Conceptual Forms

Finally, I address some dynamical aspects of the structure-building computational processes of the Perceptual System. The computations within the Perceptual System are instigated, via a triggering function between the Sensory System and the Perceptual System, by the sensory codes caused by a given sensory input.23 We can intuitively think of the triggering functions as an interface function that takes specific sensory codes as an argument and calls conceptual forms24 (cf. Fig. 5).

23 Corresponding intuitions had already been formulated in the seventeenth century. They found their elaborate expression notably by Cudworth (1731), who conjectured that conceptual forms “are excited and awakened occasionally from the appulse of outward objects knocking at the doors of our senses” and that “sense is but the offering and presenting of some object to the mind, to give it an occasion to exercise its own inward activity upon”. The Perceptual System “is enabled as occasion serves and outward objects invite, gradually and successively to unfold and display it self in a vital manner, by framing intelligible ideas or conceptions within itself of whatsoever hath any entity or cogitability” (cf. Mausfeld 2002).

24 The explanatory need for a triggering function as a function mediating between computational systems of different logical expressive power has been clearly recognised by Descartes. He was aware that any reference to “external objects” in sign conceptions of perception would lead, within scientific enquiry, to conceptual incoherences. In his attempts to provide, in entirely naturalistic terms, an explanatory framework for bridging the explanatory gap associated with the Fundamental Problem of Perception, Descartes (1642/1985) formulated a purely internalistic version of a sign theory of perception. Yolton (1984, 1996) referred to Descartes’ conception as “inverse sign relation”, because for Descartes, the physical motion, as expressed by neural activity, is the sign, and what is signified is what is expressed in the percept (cf. Gaukroger 1990, p 24). For Descartes, there are “two reactions operating in perception: the causal, physiological reaction and the signification reaction” (Yolton 1996, p 74). The significatory or semantic relation “replaces the causal relation between physical motion and ideas, but the representing relation goes, as it were, outward from awareness” (Yolton 1996, p 190).
Descartes thus recognised the explanatory need for postulating a “semantic relation” in perception. “The connection between the signs and innate


This intuition, however, is not entirely appropriate because it neglects the dynamic bidirectional aspect of this function by which conceptual forms search, as it were, for satisfying conditions of sensory codes. On the theoretical account proposed here, the sensory codes serve a twofold function. They activate, via the triggering function, appropriate conceptual forms and thus determine the potential data formats in terms of which input properties are to be exploited. Furthermore, they assign concrete values to the free parameters of the activated conceptual forms. Such a description, though, is much too static to capture, in an empirically adequate way, the computational achievements of the Perceptual System. Actually, a satisfactory theoretical account of the structure-building processes within the Perceptual System, by which outputs are generated that are legible to higher-order systems, has to provide something in its explanatory toolkit by which we can cope with the dynamic aspects and the “creative forces” (von Szily 1921) of perception. These aspects pervade the entire realm of perceptual phenomena, although they rarely become phenomenally salient. We can, however, notice them in a more perspicuous way in phenomena pertaining, for instance, to context-dependent reorganisations of part–whole relationships, or to re-conceptualisations of “perceptual objects” and their “parts”, and in all situations in which ambiguities or perceptual vagueness occur. Any explanation of such phenomena requires the availability of special computational means and of conceptual forms with structural properties that promote and boost these dynamic properties. Corresponding intuitions about dynamic conceptual forms stretch back to Descartes, Humboldt (with his emphasis on active structure-building “energia”), and Cassirer. 
Despite the overwhelmingly rich empirical evidence fostering such intuitions, we are still lacking appropriate scientific conceptions for capturing these “active”, abductive and dynamic elements of perception in a more precise and explanatorily adequate way. The general theoretical perspective adopted here, which focuses on conceptual forms and their triggering functions, prompts us to derive from experimental findings and phenomenological observations some preliminary speculations about potential structural aspects of conceptual forms that yield or support these highly context-sensitive dynamic properties of the Perceptual System. A structural aspect that suggests itself is the option to leave values of parameters undetermined by the input (or to only constrain them to regions of their associated parameter space). It is the presence of un-interpretable elements that makes conceptual forms “computationally active” (in the sense that they are probing, as it were, the available sensory cues), and they remain active as long as they contain such elements demanding

ideas is clearly more intimate than any causal connection would be, for it is the innate ideas that make the signs what they are, whereas effects can never make causes what they are” (Gaukroger 1990, p 25). Descartes’ intuitions about this “semantic relation” between what is provided by the senses, on the one hand, and the conceptual forms or “ideas” that yield meaningfully organised percepts, on the other, were far ahead of anything that could be expressed in terms of the conceptual apparatus available at his time. Unsurprisingly, Descartes’ vacillating and tentative usage of terms, due to which he is notoriously hard to interpret, mirrors this lack of an appropriate conceptual framework, as provided much later by computation theory.


satisfaction. If a triggering function were to yield an activation pattern of conceptual forms that completely specifies all values of parameters and satisfies all internal constraints of the Perceptual System, no further structure-building would be required. If, however, constraints are violated or undetermined values of parameters remain, further structure-building by computational processes is required, involving appropriate evaluation functions. Experimental findings (for striking cases see, e.g. Kellman and Spelke 1983; Craton 1996; Tse 1999) as well as phenomenological observations indicate that the Perceptual System seems to routinely operate semantically with underdetermined conceptual forms. There is every indication that this structural feature is a general property of conceptual forms that extends to subsequent systems. Consider, for example, systems by which phenomenal percepts are generated. From a naïve perspective, one could be tempted to assume that these systems assign a semantically fully and uniquely determined phenomenal percept to the output of the Perceptual System. In fact, however, the ubiquitous phenomena pertaining to vagueness, indeterminacy, or uncertainty indicate that the underlying conceptual forms inherit the underspecifications, with respect to a given input, of the conceptual forms of the Perceptual System. The output of the Perceptual System, at its interfaces to subsequent systems, thus need not be semantically determinate or unique but only “good enough” for the semantic needs of the subsequent systems. With respect to the possibilities of lacking specificity and semantic determinateness, it is useful to roughly distinguish, among the entire spectrum of types of underspecification, global and local aspects. Global aspects of underspecification pertain to an entire perceptual “interpretation”, as determined by a systematically related bundle of conceptual forms.
Local aspects pertain to features of individual abstract objects that belong to an activated systematic bundle of conceptual forms. Accordingly, we can speak of globally underspecified (bundles of) conceptual forms and locally underspecified conceptual forms. Local underspecifications, in otherwise globally determined “interpretations”, pervade almost all aspects of perception; mostly, the underspecification is never completely resolved or only resolved in the context of specific tasks. In contrast, global types of underspecification are not as prevalent and seem to require the kind of input situations that are associated with phenomena of multiperspectivity. All forms of underspecification convey constitutive computational advantages to the perceptual system. By postponing disambiguation to higher-order interpretative systems, the Perceptual System can increase its global stability with respect to the superordinate “interpretations” provided at its interfaces to subsequent systems. This protects the system from settling, under insufficient or “impoverished” input situations, on some definite “interpretation” that would have to be changed to an entirely different “interpretation” following a small variation in the input. In addition, and independent of issues of handling impoverished input situations, underspecified conceptual forms boost the potency of generative processes and enhance the conceptual versatility of the Perceptual System. By routinely operating with underspecified conceptual forms, computations can be performed on different


levels of semantic granularity. The computational advantages of the capability to perform semantic operations over underspecified conceptual forms have been, albeit in different theoretical frameworks, profitably explored in computational semantics (e.g. Pinkal 1996; Pustejovsky 1998). The theoretical picture underlying this way of framing core issues of perception evidently differs considerably from prevailing ones. It reverses the traditional perspective and approaches perception rigorously from within outwards, as it were. Needless to say that in the light of what is currently at all well understood about fundamental principles underlying perception, the framework adumbrated is inevitably highly speculative. Yet, it appears to me well motivated in the light of forceful theoretical arguments that can be advanced in favour of it, as well as in the face of a great variety of impressive empirical evidence supporting it. Presently, however, only the contours of a theoretical picture have become faintly visible due to the current convergence of different disciplines. More specific ideas and hypotheses can only be ventured in the course of enquiries that systematically explore the purview and fertility of this theoretical framework. Here, I have to content myself with intuitively illustrating the potential fecundity of the proposed theoretical stance. While traditional accounts of perception are fundamentally ill equipped to deal with aspects of meaning, multiperspectivity, and perceptual vagueness, attributing them essentially to “higher-order” cognitive systems, the direction of enquiry outlined above promises a natural way of approaching them as intrinsic properties of the perceptual system proper.
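The twofold role that this section assigns to sensory codes (activating conceptual forms and assigning values to some of their free parameters) can be made a little more concrete with a toy sketch. Everything in it is a hypothetical illustration: the catalogue `FORMS`, the cue names, the parameter names, and the functions `trigger` and `needs_structure_building` are assumptions introduced only for this example. The sketch is also deliberately one-directional and static, which is precisely the simplification that the text warns against, so it illustrates the interface intuition and the resulting underspecification rather than the dynamic, bidirectional process itself.

```python
from typing import Optional

# Hypothetical catalogue: each form type lists the sensory cues that can
# activate it and the parameters it carries. All names are illustrative.
FORMS = {
    "surface": {
        "cues": {"luminance_edge", "texture_gradient"},
        "params": ["orientation", "depth"],
    },
    "agent": {
        "cues": {"self_propelled_motion"},
        "params": ["goal", "heading"],
    },
}


def trigger(sensory_codes: dict) -> dict:
    """Activate every form with a matching cue and bind the values the codes supply."""
    activated = {}
    for name, spec in FORMS.items():
        matching = spec["cues"].intersection(sensory_codes)
        if not matching:
            continue  # no cue for this form: it stays inactive
        values: dict[str, Optional[float]] = {p: None for p in spec["params"]}
        for cue in matching:
            for param, value in sensory_codes[cue].items():
                if param in values:
                    values[param] = value
        activated[name] = values
    return activated


def needs_structure_building(form_values: dict) -> bool:
    """A form with open parameters remains 'active' and demands further processing."""
    return any(value is None for value in form_values.values())


codes = {
    "texture_gradient": {"orientation": 40.0},
    "self_propelled_motion": {"heading": 90.0},
}
result = trigger(codes)
print(result["surface"])  # {'orientation': 40.0, 'depth': None}
print(needs_structure_building(result["agent"]))  # True
```

In this toy setting both activated forms remain locally underspecified, which corresponds to the case in which further structure-building would be required.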

5 Intrinsic Multiperspectivity

As mentioned in the introduction, the capacity to employ different mental perspectives with respect to the same stimulus situation pervades all of our mental activity. Once general properties of such phenomena of “multiperspectivity” are explored on a sufficient level of abstraction, it becomes obvious that cognitive science teems with corresponding phenomena, which witness, it seems to me, our cognitive capability to simultaneously handle different layers of perceptual interpretations, as it were, that are triggered by the same input. Many of the corresponding phenomena and achievements will likely be due to the complex interaction of a large class of instruments of our orchestra of mental capacities. In the present context, however, I am interested in the question whether there are subclasses of such phenomena that can be directly tied to structural and functional properties of a single subsystem, namely the Perceptual System. Therefore, I will make a tentative attempt to single out, among the huge class of phenomena that in some way or other can be related to “multiperspectivity”, a class of phenomena that appear to me as potential candidates for explanatory accounts in terms of specific design principles of the Perceptual System. My claim is that the specific design features of the Perceptual System, as outlined above, yield phenomena of “multiperspectivity” as a constitutive and


functionally crucial consequence. I will refer to types of “multiperspectivity” that are a direct consequence of properties of the Perceptual System as “intrinsic multiperspectivity”. Accordingly, phenomena witnessing “intrinsic multiperspectivity” can be explained by intrinsic properties of the Perceptual System, without substantially referring to properties of higher-order interpretative systems, systems by which the phenomenal percept is generated, or to properties of interactions of higher-order systems. “Intrinsic multiperspectivity” is a theoretical (and thus, needless to say, an abstract and highly idealised) notion rather than a phenomenological one and refers to those internal structural and computational features of the Perceptual System by which it can, for a given sensory input, simultaneously provide at its interfaces to higher systems multiple “interpretations”. These “interpretations” can be conflicting in the sense that the higher-order cognitive systems, by which meanings are assigned in terms of “external world” properties, assign mutually incoherent interpretations to them. The notion of intrinsic multiperspectivity presupposes an internal structure that is sufficiently rich to yield the corresponding phenomena. If, as presently prevailing accounts of perception presuppose, the conceptual endowment of the Perceptual System were rather thin and were basically exhausted by sensory concepts, the class of phenomena pertaining to intrinsic multiperspectivity would be empty. Rather, all forms of “multiperspectivity” would be due to, say, inferential and interpretative properties of higher-order systems. If, however, empirical evidence speaks in favour of a rich and powerful conceptual structure of the Perceptual System, the property of intrinsic multiperspectivity emerges, as I will argue below, as a natural (and functionally important) consequence of the computational relation between sensory codes and conceptual forms.
Perceptual psychology abounds with phenomena that are eligible for providing examples of intrinsic multiperspectivity, notably those that are grouped under headings such as figure-ground segmentation, bi-stability, the proximal mode, amodal completion, vagueness, or ambiguity between entire mental “frames of reference”. A well-known prototypical example is provided by phenomena that are grouped under the heading of a dual character of pictures. This notion basically refers to the phenomenon that pictures can generate an in-depth spatial impression of the scene depicted while at the same time appearing as flat two-dimensional surfaces hanging on a wall. In picture perception, we can simultaneously have the phenomenal impression of two different types of objects, each of which seems to thrive in its own autonomous spatial framework, namely, on the one hand, the picture surface as an object – with corresponding object properties such as orientation or depth – and, on the other hand, the depicted objects themselves with their idiosyncratic spatial properties and relations. We seem to have two mutually incompatible spatial representations at the same time; at least in the sense that they are available internally and we can, without any effort, switch between them. This aspect is phenomenally so conspicuous and striking that we usually do not pay much attention to it. Though such switches are correlated with depth aspects, they actually pertain to the entire perceptual organisation of the visual field and thus to attributes like shape, or shading and brightness gradients.


Gombrich (1982) made the important observation that one has to achieve the proper “mental attitude” to take full advantage of the capacity to switch back and forth between the reality of the picture as an object and the reality of the depicted objects. Michotte (1948/1991) recognised the challenge that this kind of phenomenon poses for perception theory. Although a wealth of observations pertaining to this kind of phenomenon has accrued in the literature, our theoretical understanding continues to be poor [see Mausfeld (2003b), for a more detailed account]. Here, I will use the famous Heider–Simmel demonstration (cf. Scholl and Tremoulet 2000) to illustrate how intrinsic multiperspectivity is yielded as a constitutive and functionally crucial consequence of the architectural and structural properties of the perceptual system outlined above. This demonstration seems to me particularly suited for making more conspicuous structural properties of our mental architecture that we otherwise find difficult to notice because they are an all-pervading property of the way we are designed. In the Heider–Simmel demonstration, a small and a large triangle and a small circle move against and around each other and an open rectangle (see Fig. 7 for a movie still), and observers unanimously perceive this event in terms of intentional concepts, such as chasing, looking for, hiding, conferring, being furious or frightened, etc. If, however, the movement of the figures is speeded up or slowed down, these intentional impressions deteriorate or vanish entirely. Obviously, the movement patterns of these geometrical objects suffice for eliciting a perceptual ascription of complex internal attributes pertaining to dispositional attributes of perceptual objects of the type “agent” or “mykind”.
Although in this demonstration, we undoubtedly and irresistibly perceive self-propelling objects with intentional attributes, we still are aware that we actually see only geometrical objects moving, again a kind of double book-keeping. As in the case of picture perception, one kind of percept does not vitiate the other, both can exist simultaneously as two layers of perceptual interpretation, as it were. Despite the fact that we ascribe in the Heider–Simmel demonstration anthropomorphic properties to the perceptual objects, it would never occur to us to actually interact with these objects.

Fig. 7 Movie still from the Heider and Simmel (1944) movie


As the Heider–Simmel demonstration illustrates, our perceptual system is highly sensitive to mechanical and intentional contingencies, which can activate perceptual “interpretations” that go far beyond any given sensory information. Via these contingencies, we segment our world into animate and inanimate objects and, in the case of non-mechanical contingencies or causation at a distance, attribute agency and internal intentions to a perceptual object. Perceptual objects of the type “animate object” can already be activated by appropriate temporal sequences of two-dimensional shapes (even pre-linguistic infants attribute intentionality to abstract shapes based solely upon spatio-temporal variables, e.g. Johnson 2003; Hamlin et al. 2007; Baillargeon et al. 2009). These perceptual achievements are part of our general capacity for making causal assignments and for embedding all of our experiences into various kinds of internal causal analyses.25 Due to this capacity we can visually attain aspects of “perceptual objects” that pertain to their “hidden” dispositional powers and propensities (which in the case of perceptual objects of the type “mykind” pertain to “mental states”). Empirical observations and theoretical considerations strongly suggest that the capability to mentally interact with others is part of the newborn’s biological endowment, which quickly matures to a state where the child can impute mental states to oneself and to others. This biological endowment requires the availability of appropriate conceptual forms (whose nature is still at the boundary of scientific elucidation) that have their proprietary ways of exploiting the sensory input. The Heider–Simmel demonstration illustrates that the conceptual forms involved code properties that go far “beyond” those physico-geometrical properties that can be coded by sensory codes.
The notion of intrinsic multiperspectivity as employed here is tied to a specific theoretical conception of the functional architecture underlying perceptual achievements. While it is generally agreed that a perceptual system, however conceived, can in an idealised manner be distinguished from other systems of the mind and thus can be made an object of independent enquiry, ideas about how to conceive of this system, in particular assumptions about its conceptual and computational richness, diverge vastly. If it is assumed, as reigning theoretical frameworks in perceptual psychology do, that the conceptual apparatus of the perceptual system is confined to elementary sensory concepts, no issue of intrinsic multiperspectivity could arise. On such conceptions, as they also find their expression in the Standard Model of Perception, the entire explanatory burden with respect to phenomena such as the dual nature of pictures, the Heider–Simmel demonstration, amodal completion, etc. is placed upon higher-order interpretative systems. While this could have been possible in principle, corresponding conceptions are highly inadequate on both empirical and theoretical grounds, as pointed out above.

25 All perceptions come, to use a seventeenth-century term, cum argumento causae. Accordingly, in the terminology employed here, the output of the Perceptual System is always couched in terms of internal causal analyses, whose specific forms are coded by conceptual forms of the type “event”.

Intrinsic Multiperspectivity: Conceptual Forms and the Functional Architecture


6 Intrinsic Multiperspectivity as the Activation of Multiple Layers of Underspecified Conceptual Forms

The property of intrinsic multiperspectivity will emerge as a direct consequence of the type of functional architecture roughly indicated in Fig. 2. In contrast, this property will be well-nigh absent in simpler types of nervous systems in which the same type of sensory input can be independently exploited by different subsystems and simultaneously used for different tasks, but in which the integration of different subsystems is kept slim. Intrinsic multiperspectivity only shows up in more complex systems, in which an increasing number of modular subsystems with a rich computational machinery have been evolutionarily interposed between the sensory input, on the one hand, and the behavioural effector systems, on the other hand. As a consequence of their qualification as computational systems, such systems have to be furnished with sufficiently abstract data types/conceptual forms over which their computational processes are defined. Due to the availability of abstract data types, these systems are prone to exhibit a pervasive and profound degree of inferential and conceptual integration. Because conceptual forms become increasingly decoupled from specific sensory inputs in corresponding architectures, it becomes almost unavoidable that each given sensory input yields a (partial) activation of a great variety of different types of conceptual forms. On the assumption that conceptual forms with a sufficient degree of activation are offered, at the corresponding interfaces, to higher-order systems, phenomena pertaining to intrinsic multiperspectivity are inescapable in these types of functional architecture.
While we can have some confidence, or so I believe, in the appropriateness of the general idea that the core principles of perception pertain to a triggering of conceptual forms, our understanding of the specific nature of these forms and of the associated computational mechanisms is still extremely thin. However, some more superficial aspects of the basic logic of an internal handling of underspecified conceptual forms can be illustrated by the Heider–Simmel phenomenon. As is apparent from the phenomenal percept, the same sensory input, namely that provided by the Heider–Simmel movie, is exploited by two different types of conceptual forms: one pertaining to non-living physical objects and their attributes and the other pertaining to living objects of the type “mykind” (or “agent”) and their internal attributes. Both types of conceptual forms are underspecified by this kind of sensory input. In particular, in conceptual forms of the type “mykind”, intrinsic parameters for attributes pertaining to faces, eyes, limbs etc. cannot be specified by the given input. Rather, sensory codes pertaining to form aspects actually impede an activation of these types of conceptual forms. At the same time, sensory codes pertaining to motion patterns evidently foster an activation of the type “self-propelled object”, “agent”, or “mykind”. Apparently, the Perceptual System attaches, with respect to the activation of these types of conceptual forms, more weight to the corresponding motion codes than to the form codes, in this situation. For conceptual forms of the type “non-biological physical objects”, the triggering situation reverses. Here, the sensory codes for motion aspects vitiate an activation of these conceptual


forms because the specific motion properties of the input violate internal constraints of the kinds of causal analyses associated with these types of conceptual forms (see Fig. 8 for an illustration of this triggering condition). This broad description of the triggering situation is, first of all, a specific way of framing theoretical questions that arise in the context of the Heider–Simmel phenomenon. It is, needless to say, a far cry from being an explanation of it. An explanation would have to refer to independent descriptions of the nature of the conceptual forms involved and of the specific triggering function. Furthermore, it would have to address, e.g., issues regarding the integration of sensory codes; the integration of the violations of various types of internal constraints; the computational means by which various fragments or aspects are brought together into a globally coherent pattern; and local and global evaluation functions. However, Fig. 8 makes clear that intrinsic multiperspectivity emerges as a natural feature within the assumed kind of functional architecture. The same input can simultaneously yield multiple conflicting “interpretations” at its interfaces to higher-order systems. In each of these conflicting “interpretations”, part of the input or the sensory codes remains uninterpreted (either the self-motion of the objects and the kind of trajectories, or the form of the objects that carry intentional attributes). This is illustrated by Fig. 9. While the number of simultaneous “interpretations” provided by the Perceptual System is in principle only limited by the number of systematically connected packages of conceptual forms, computational requirements (pertaining, e.g., to stability aspects) and limitations of subsequent systems (in particular attentional ones) will reduce it to a very small value, usually down to two.
Each of these “interpretations” embraces its own types of perceptual attributes; thrives in its own global framework; and is accompanied by its own degree of “realness”. The theoretical picture tentatively explored here helps to understand the important fact that (contrary to misconceptions that dominate reigning conceptions) “realness”, or rather “unrealness”, is a purely internal attribute that is assigned, on the basis of internal evaluation functions, to “interpretations” offered by the Perceptual System. The evaluation functions for assigning degrees of realness will presumably belong to higher levels at which phenomenal percepts are generated.

Fig. 8 Differential triggering of two different conceptual forms by the Heider–Simmel movie


The triggering conditions for the assignment of the global attribute “real” / “unreal” can be quite idiosyncratic with respect to our ordinary intuitions about what is real. Furthermore, the degree of perceptual salience of an “interpretation” of the Perceptual System and the degree of “realness” can be largely dissociated. In the Heider–Simmel case, the perception as “agents” with intentional attributes is more salient phenomenally; nevertheless, the “interpretation” in terms of moving geometric objects is assigned a higher degree of “realness”. Intrinsic multiperspectivity is not simply an odd by-product of the functional architecture of the perceptual system. It rather constitutes, for a system that is biologically endowed with a rich system of conceptual forms, an essential computational element for dealing with internal problems pertaining to aspects of vagueness, ambiguity, and indeterminacy. Its counterpart will most probably be found in all higher-order systems that take advantage of the output of the Perceptual System. Intrinsic multiperspectivity, in one way or the other, is a pervading property of our mental architecture. It is the

Fig. 9 Layers of conflicting perceptual “interpretation” triggered by the same sensory input


foundation pillar of a stupendous range of capacities that allow us to simultaneously take conflicting perspectives in “looking at things and thinking about the products of our minds” (Chomsky 2000, p 36), capacities whose imprints range from simplest perceptual phenomena to our capability to read or to employ metaphors or allegories.

Acknowledgement This work was supported by BMBF-grant 01GWS060 and DFG-grants MA 1025/10-3 and 1025/10-4.

References

Ainsworth PM (2009) Newman’s objection. Br J Philos Sci 60:135–171
Baillargeon R, Wu D, Yuan S, Li J, Luo Y (2009) Young infants’ expectations about self-propelled objects. In: Hood B, Santos L (eds) The origins of object knowledge. Oxford University Press, Oxford, pp 285–352
Chomsky N (1975) Reflections on language. Pantheon Books, New York, NY
Chomsky N (1981) On the representation of form and function. Linguist Rev 1:3–40
Chomsky N (2000) New horizons in the study of language and mind. Cambridge University Press, Cambridge
Cottingham J (1986) Descartes. Blackwell, Oxford
Craton LG (1996) The development of perceptual completion abilities: infants’ perception of stationary, partially occluded objects. Child Dev 67:890–904
Cudworth R (1731) A treatise concerning eternal and immutable morality. James and John Knapton, London (Reprinted 1976 by Garland, New York)
Descartes R (1642/1985) Meditations on first philosophy. In: Cottingham J, Stoothoff R, Murdoch D (eds) The philosophical writings of Descartes, vol 2. Cambridge University Press, Cambridge
Fodor JA (1980) Fixation of belief and concept acquisition. In: Piattelli-Palmarini M (ed) Language and learning: the debate between Jean Piaget and Noam Chomsky. Harvard University Press, Cambridge, MA, pp 142–149
Fodor JA (2003) Is it a bird? Problems with old and new approaches to the theory of concepts. Times Literary Supplement, January 17:3–4
Gallistel CR (1998) Symbolic processes in the brain: the case of insect navigation. In: Scarborough D, Sternberg S (eds) Methods, models and conceptual issues: an invitation to cognitive science, vol 4. MIT, Cambridge, MA, pp 1–51
Gaukroger S (1990) The background to the problem of perceptual cognition. In: Gaukroger S (ed) Arnauld: on true and false ideas. Manchester University Press, Manchester
Gombrich EH (1982) The image and the eye. Phaidon, Oxford
Goodwin B (2009) Beyond the Darwinian paradigm: understanding biological forms. In: Ruse M, Travis J (eds) Evolution: the first four billion years. Harvard University Press, Cambridge, MA, pp 299–312
Grier M (2001) Kant’s doctrine of transcendental illusion. Cambridge University Press, Cambridge
Hamlin JK, Wynn K, Bloom P (2007) Social evaluation by preverbal infants. Nature 450:557–559
Harland DP, Jackson RR (2000) ‘Eight-legged cats’ and how they see – a review of recent work on jumping spiders. Cimbebasia 16:231–240
Heider F, Simmel M (1944) An experimental study of apparent behaviour. Am J Psychol 57:243–259
Helmholtz Hv (1855) Über das Sehen des Menschen. In: Vorträge und Reden. 4. Aufl., Bd.1, 1896. Vieweg, Braunschweig
Jacobs DK, Nakanishi N, Yuan D, Camara A, Nichols SA, Hartenstein V (2007) Evolution of sensory structures in basal metazoa. Integr Comp Biol 47:712–723
James W (1890/1983) The principles of psychology. Harvard University Press, Cambridge, MA
Johnson SC (2003) Detecting agents. Philos Trans R Soc B Biol Sci 358:549–559
Kellman PJ, Spelke ES (1983) Perception of partly occluded objects in infancy. Cogn Psychol 15:483–524
Kirschner M, Gerhart J (1998) Evolvability. Proc Natl Acad Sci USA 95:8420–8427
Kitano H (2004) Biological robustness. Nat Rev Genet 5:826–837
Leslie AM (1987) Pretense and representation: the origins of ‘theory of mind’. Psychol Rev 94:412–426
Ludlow P (2003) Referential semantics for I-languages? In: Antony LM, Hornstein N (eds) Chomsky and his critics. Blackwell, Oxford, pp 140–161
Mausfeld R (2002) The physicalistic trap in perception. In: Heyer D, Mausfeld R (eds) Perception and the physical world. Wiley, Chichester, pp 75–112
Mausfeld R (2003a) ‘Colour’ as part of the format of two different perceptual primitives: the dual coding of colour. In: Mausfeld R, Heyer D (eds) Colour perception: mind and the physical world. Oxford University Press, Oxford, pp 381–430
Mausfeld R (2003b) Competing representations and the mental capacity for conjoint perspectives. In: Hecht H, Schwartz B, Atherton M (eds) Inside pictures: an interdisciplinary approach to picture perception. MIT, Cambridge, MA, pp 17–60
Mausfeld R (2010a) Intrinsic multiperspectivity: on the architectural foundations of a distinctive mental capacity. In: Frensch PA, Schwarzer R (eds) Cognition and neuropsychology: international perspectives on psychological science, vol 1. Psychology, London, pp 95–116
Mausfeld R (2010b) The perception of material qualities and the internal semantics of the perceptual system. In: Albertazzi L, van Tonder G, Vishwanath D (eds) Perception beyond inference. The information content of visual processes. MIT, Cambridge, MA, pp 159–200
Mausfeld R (2010c) Psychologie, Biologie, kognitive Neurowissenschaften. Zur gegenwärtigen Dominanz neuroreduktionistischer Positionen und zu ihren stillschweigenden Grundannahmen. Psychol Rundsch 61:180–190
Maxwell G (1970) Theories, perception, and structural realism. In: Colodny RG (ed) The nature and function of scientific theories. University of Pittsburgh Press, Pittsburgh, PA, pp 3–34
Michotte A (1948/1991) L’énigme psychologique de la perspective dans le dessin linéaire. Bulletin de la Classe des Lettres de l’Académie Royale de Belgique 34:268–288 (The psychological enigma of perspective in outline pictures. In: Thinès G, Costall A, Butterworth G (eds) (1991) Michotte’s experimental phenomenology of perception. Erlbaum, Hillsdale, NJ)
Michotte A (1960/1991) Le réel et l’irréel dans l’image. Bulletin de la Classe des Lettres de l’Académie Royale de Belgique 46:330–344 (The real and the unreal in the image. In: Thinès G, Costall A, Butterworth G (eds) (1991) Michotte’s experimental phenomenology of perception. Erlbaum, Hillsdale, NJ)
Miller GA (1979) A very personal history (occasional paper no. 1). Center for Cognitive Science, Massachusetts Institute of Technology, Cambridge, MA
Müller G (2007) Evo-devo: extending the evolutionary synthesis. Nat Rev Genet 8:943–949
Newman SA, Bhat R (2009) Dynamical patterning modules: a ‘pattern language’ for development and evolution of multicellular form. Int J Dev Biol 53:693–705
Nilsson DE, Gislén L, Coates MM, Skogh C, Garm A (2005) Advanced optics in a jellyfish eye. Nature 435:201–205
Pinkal M (1996) Radical underspecification. In: Dekker P, Stokhof M (eds) Proceedings of the 10th Amsterdam Colloquium. Institute for Logic, Language and Computation, ILLC Publications, Amsterdam, pp 587–606
Pustejovsky J (1998) The semantics of lexical underspecification. Folia Linguist 32:323–347
Roediger HL (2004) What happened to behaviorism. Presidential column, APS Observer
Russell B (1927) The analysis of matter. George Allen & Unwin, London
Scholl BJ, Tremoulet P (2000) Perceptual causality and animacy. Trends Cogn Sci 4:299–309
Spelke ES (2000) Core knowledge. Am Psychol 55:1233–1243
Spelke ES, Kinzler KD (2007) Core knowledge. Dev Sci 10(1):89–96
Spencer H (1855) The principles of psychology. Longman, Brown, Green and Longmans, London
Tinbergen N (1951) The study of instinct. Clarendon, Oxford
Tse PU (1999) Volume completion. Cogn Psychol 39:37–68
von Szily A (1921) Stereoskopische Versuche mit Schattenrissen. Gräfes Archiv für Ophthalmologie 105:964–972
Wagner GP, Mezey J, Calabretta R (2005) Natural selection and the origin of modules. In: Callebaut W, Rasskin-Gutman D (eds) Modularity: understanding the development and evolution of complex natural systems. MIT, Cambridge, MA, pp 33–49
Worrall J (1989) Structural realism: the best of both worlds? Dialectica 43:99–124
Yolton JW (1984) Perceptual acquaintance from Descartes to Reid. University of Minnesota Press, Minneapolis, MN
Yolton JW (1996) Perception and reality. A history from Descartes to Kant. Cornell University Press, Ithaca, NY

Prospects of Objective Knowledge

Christian Spahn

Abstract When looking at modern epistemology, one might come to the conclusion that the prospects for a solid realism are not so good. Two main sources for this common assessment can be identified: (1) First of all, Logical Positivism has initiated a new phase in the recent history of philosophy that was first and foremost characterized by a commitment to science. It has become evident, however, in the development of this tradition that the early hopes for a new foundation of a strong epistemic realism were soon disappointed: Tendencies toward internalism, constructivism, pragmatism, antirealism, or even cultural relativism were established. The most important steps in this development will be analyzed in the first part of the chapter: The working hypothesis in the short sketch of the historical development of epistemology is that a certain ultimately Cartesian and antievolutionary dualism plays the decisive role in the formative period of modern realism. It is this same dualism that leads from realism to more skeptical positions. By taking this Cartesian opposition for granted, modern realism again turns from empiricism to internalism or linguistic idealism. The main argument against realism will be examined and rejected: the defense of a coherent realism is impossible only if one accepts the dualism of mind and world in the first place. But in this case most arguments against realism become a powerless tautology. (2) Secondly, it will be argued that this implicit dualistic terminology also plays a decisive role in the interpretation of the results of the modern science of cognition. Many modern approaches to cognition begin with the claim that cognition is essentially a biological phenomenon. These views are therefore at first glance intrinsically antidualistic. But in this domain, too, one can trace a shift from more realistic to more constructivist or relativistic paradigms. It will be shown that it is not surprising that the “biologization of reason” has led to two radically different interpretations: a realistic one and a more constructivistic one. It will be argued that in these approaches, too, the dualistic and ultimately antievolutionary opposition is in fact decisive for the general philosophical and epistemological conclusions. The main point, however, is that any Evolutionary Epistemology simply cannot answer the basic problems concerning objectivity in knowledge. A positive solution is a vicious circle; the negative solution entails a self-contradiction. If one follows this line of thought, then it becomes evident that only one of the interpretations can be a research paradigm that is both fruitful and does not undermine itself. (3) The last part of the chapter, following Millikan (1984, 2004), Thompson (2007), Lorenz (1973), Goodson (2003), Proust (1999) and Noë (2004), therefore focuses on a very short sketch of how the combination of a philosophical realism and an evolutionary conception of the phenomenon of cognition could be spelled out in such a way that one does not fall into the traps of either dualism or self-contradiction.

C. Spahn Department of Philosophy, College of Humanities, Keimyung University, 100 Sindang-Dong, Dalseo-Gu, Daegu 704-701, South Korea e-mail: [email protected]

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_3, © Springer-Verlag Berlin Heidelberg 2011

1 Preliminary Remarks: Conceptual and Scientific Problems of Realism

When looking at modern epistemology, one might come to the conclusion that the prospects for a solid realism are not so good. Two main sources for this common assessment can be identified: First of all, Logical Positivism has initiated a new and interesting phase in the recent history of philosophy that was first and foremost characterized by a commitment to science.1 Along with this commitment comes an embrace of both realism and naturalism. It has become evident, however, in the development of this tradition that the early hopes for a new foundation of a strong epistemic realism were soon disappointed: Tendencies toward internalism, constructivism, pragmatism, antirealism, or even cultural relativism were established that pointed away from a strong realism toward more skeptical positions. The most important reason for this shift is Putnam’s pertinent claim that something like a realistic external “God’s eye view” (Putnam 1981) of the world in itself – or a “view from nowhere”, to use Nagel’s (1986) terminology – is necessarily inaccessible to us, so that any realism that entails aspects of such an externalist claim to objective knowledge is profoundly misguided. This argument of the self-contradiction in the conception of a real world “as such”, which was already discussed in German Idealism, has undergone a revival that leads to tendencies toward “internalization”, such as can be found in Putnam’s work or even more so in antirealist doctrines.2

This philosophical debate focuses on important basic conceptual questions: If one wants to defend any kind of realism that is up to date, then inevitably one must look at the most crucial steps in the recent history of epistemology. The working hypothesis in the following short sketch of the historical development of epistemology is that a certain ultimately Cartesian and antievolutionary dualism plays the decisive role in the formative period of modern realism. It is this same dualism that leads from realism to more skeptical positions. This dualistic attitude follows a long tradition in European thinking of opposing nature and mind in such a strong way that whatever is natural does not entail “mind” (God forbid!) and whatever mind is, it is not simply natural (it would be devastating if it turned out not to be this way!3). By taking this opposition for granted, modern realism again turns from empiricism to internalism or linguistic idealism.4 If one embraces this conceptual opposition and rediscovers the impossibility of abstraction from the conceptual imprinting done by our consciousness in any alleged insight about the world,5 then it is difficult to see how any objectively true insights into the real world (as it would be apart from our “imprints” or “additions”) could even be conceptualized as possible. It shall be shown in the following pages that these new arguments against realism are indeed convincing, but only insofar as one accepts the dualistic, ultimately pre-Darwinian opposition of subjectivity (mind) and objectivity (reality). If one modifies this starting assumption – following the insights of modern science and the debates of modern epistemology – toward a more up-to-date assumption of the continuity between nature and mind, then one can sketch the basic tenets of a modern epistemic realism that overcomes certain basic conceptual difficulties (see Sect. 3.2).

Secondly, this implicit dualistic terminology plays a decisive role not only in the philosophical debate on realism but also in the interpretation of the results of the modern science of cognition. In this domain, too, one can trace a shift from more realistic to more constructivist or relativistic paradigms. While, for example, the early Evolutionary Epistemology argues in favor of realism, new results and tendencies6 lead to more pragmatic, eventually subjective-idealistic or constructivist, views: Optical illusions, change blindness, etc., count as paradigmatic examples from which a call to cognitive modesty (if not even to cognitive self-humiliation due to the high degree of idiosyncratic cognitive mechanisms) is derived. This pattern of argumentation shall be examined for its implicit assumptions: could it be that here, too, it is not so much the new scientific results as such, but the implicit dualistic prejudgment that opposes nature and mind and assumes that cognition is unworldly, that plays the crucial role? It will be argued that in fact the dualistic and ultimately antievolutionary opposition is decisive for these general philosophical and epistemological conclusions. Nevertheless, the cogent criticism against Evolutionary Epistemology cannot be denied. Evolutionary Epistemology cannot decide on its own whether objectively true knowledge is achieved or even possible. Yet, because the arguments of the first part of this chapter lead to a favorable decision toward realism (just as the appeal to natural sciences in any scientifically inspired philosophy of cognition already presupposes some kind of realism, to which a complete account of the evolution of cognition must do justice), the last part of this chapter focuses on a very short sketch of how the combination of a philosophical realism and an evolutionary conception of the phenomenon of cognition could be spelled out. What would – in allusion to the fact that this paper was presented in Jena7 – an evolutionarily expanded phenomenology of spirit, i.e., a conception of the possibilities, limits, and stages of development from nonconceptual to full-fledged linguistic cognition, look like? Following Millikan (1984, 2004), Thompson (2007), Lorenz (1973), Goodson (2003), Proust (1999) and Noë (2004),8 what can we learn for an evolutionary account of cognition? What would be the most important conceptual steps one must assume to move from preconceptual to linguistic knowledge, and how would the dualism of “inside” and “outside”, of “objective input” and “subjective reconstruction”, have to be spelled out so that it renders the whole story conceptually plausible and not self-contradictory?

1 For a detailed analysis of the connection of the modern scientific world view and the main tenets of Logical Positivism, see Spahn (2007, 23ff.). For a critique of the main tenets of both scientistic and anti-scientistic epistemologies, see ibid., 48ff.
2 See Dummett (1963), a paper that started the debate about “antirealism”.
3 See Spahn and Tewes (2011) for an overview of the impact that the dualism between mind and world has in the philosophical debate about the theory of evolution. Naturalism, it seems, faces two main challenges: the qualia-problem and the problem that ethics cannot easily be naturalized.
4 Nagel (1986, esp. Chaps. V and VI) motivates his defense of a new realism with the correct observation that within recent philosophy a tendency toward idealism and internalism prevails: “There is a significant strain of idealism in contemporary philosophy, according to which what there is and how things are cannot go beyond what we could in principle think about” (Nagel 1986, 9).
5 This insight is actually almost a tautology, but we will have to deal with its bewitching consequences in more detail later.
6 Think, for example, of the criticism of a strong adaptationism and the modern emphasis on internal constraints for any evolutionary and cognitive achievement.

2 Prospects of Objective Knowledge: The Dualism of Mind and World and the Problems of Realism The main tenets of Logical Empiricism can be stated in a short way just by appealing to the denomination of this philosophical school. True knowledge stems from experience and must ultimately be verified or falsified by experience; justified knowledge must be deduced in a logically proper way from simple observed empirical facts. Ideally, all propositions of sciences and philosophy could be translated into a precise mathematical or formal language such that every proposition could in finite steps either be proven to be true or false or be revealed as “nonsensical”. The most important task for philosophy according to this view is to work on a precise axiomatization and formalization and to criticize those

7 Also the basic tenets of the reconciliation of realism and idealism follow some basic, eventually Hegelian, insights. For the debate on modern realism and skepticism in comparison to Hegel’s answer to skepticism, see Spahn (2011). 8 One might be surprised that Lorenz’ book is included. Indeed, it is “pretty old” since empirical sciences have made a lot of progress. But the conceptual order and clarity of this book is still unsurpassed.

Prospects of Objective Knowledge

59

philosophical approaches and questions that exceed experience and empirical verification or are based on a misuse of an as-of-yet not strictly formalized language.9 By applying this approach to the problem of realism, Carnap sets a new starting point for the debate by distinguishing “external” and “internal” questions (Carnap 1950). According to him the conflict between realism and idealism (questions as to whether there really are things, e.g., chairs, numbers, etc., or if we only see the world as if it would entail these entities) is an empty debate: it is a “Scheinproblem” (pseudo-problem) that is based on a misunderstanding of language (Carnap 1961 [1928]). According to him, questions concerning “existence” can only be asked within a certain conceptual scheme (or “a language”). After I have, for example, defined what a cat is supposed to be, I can answer the question “do cats exist” in the other room in an empirical way (I go and look). In the same way I can answer the question whether there exists a prime number larger than 100 through mathematical methods. Given the proper definitions (of objects or mathematical entities like numbers) and methods (looking or calculating), these internal questions about “existence” could in principle be answered. The “philosophical” question if there exist numbers or cats as such, in a nonconcept-dependent way, is for Carnap a pseudo-question simply because there is no empirical (or formal or logical) method for verifying or falsifying such an existence, or for making the circularity of this approach more obvious: it is because “existence” is understood or defined by him in a way that means “internal existence”, i.e., existence in relation to a conceptual scheme, whereby speaking of nonconceptual existence as such is meaningless. Since, via dogmatic assumption, there are only empirical or formal logical ways to answer any question, including philosophical questions, the external question becomes a pseudo-question. 
What exists does so simply in relation to (or depending on) a conceptual scheme and the choice between different conceptual schemes cannot be decided by theoretical insights but merely by pragmatic considerations (Carnap 1950, 38). The question whether our conceptual scheme “fits” or is “in accordance with” the world cannot be answered in either a positive or negative way. The claim that we can only speak about existence in a meaningful way within certain conceptual frames sets the track for a further development of the debate toward internalistic options, even if Carnap denies that and still sticks to an empirical realist position. If any meaningful use of the concept existence is bound to be internal to conceptual frames, linguistic decisions and operations, and if we cannot reasonably argue for these basic frameworks, then this must lead in the long run to somehow more idealistic or antirealist views simply because this idea itself is the main claim of any idealistic position. If by no means can we conclude “from thinking to existence”, because even the concept of “existence” is a function of conceptualization and only gives us rules for reidentifying things within certain frameworks, then how can we claim that there is a mind-independent world that gives us something like “input”? How can we maintain any realist view that deserves this name?

9 For a more detailed account, see Spahn (2007, 16–31).

C. Spahn

The answer is that the following realist positions either do not deserve that name or do not solve this problem. But how could this peculiar assumption that we could not conclude “from thinking to being” in any way ever be justified? It is evident that in the writings of Russell or Carnap, the argument is in most cases a circular one (see esp. Carnap 1931). This might lead to the suspicion that behind this assumption, we may find again the same modern antievolutionary dualism of mind and world. If our thinking were in fact alien to the world we could not conclude from it to “being” or the world, but the question is how do we know just that (namely, that it is different)? If, however, our thinking is immersed in the world, then such a conclusion from thinking to being would be possible or even natural. We come back to this point later, but first let us move on with the historical sketch.

Quine follows Carnap’s assumptions and places existence claims within the boundaries of conceptual frameworks. In his famous essay “On What There Is” (Quine 1948), he basically proclaims Carnap’s solution: By accepting a theoretical language we accept a certain ontology. The bound variables in my system of quantification define the objects that I have accepted as “existing” (Quine 1948, 1968a [1958], 1969a, b, 1981). But Quine warns us not to confuse the empirical evidence that makes the agreement on a speech act probable with empirical evidence for the existence of an object (Quine 1969a, 11). By saying this, however, he puts the cart before the horse: it is not the world that determines what kind of ontology or theory we should choose, but rather it is our theoretical decisions that constitute “what there is”. Nevertheless, Quine still wants to be a realist; therefore, we find numerous assertions that there would be an empirical reality that has an influence on the border of our theories.
But still, Quine aggravates the problem of realism enormously in comparison to Carnap, despite his attempt to pull the emergency brake by referring to an external world and its general influence upon us. While Carnap still seems to believe that within science only one framework or language will suffice and survive the logical and empirical scrutiny (Carnap 1931, 1961 [1928]), Quine radicalizes and pluralizes the theorem of conceptual internality. He believes that one cannot ignore that there are in principle many consistent but mutually exclusive ways to describe the same empirical reality. The same empirical clues could be interpreted in ontologically radically different ways depending on one’s rules of quantification, instantiation, and identity claims. It might not even be possible to distinguish the implicit differences between the ontologies of different theories due to the problem of radical translation: Do the natives that say “gavagai” refer to rabbits, undetached rabbit parts, or rabbit stages? (Quine 1960, 1969a, b). Of course, Quine concedes that if we know their quantifiers and their rules of identification, we might be able to decide whether there is a difference, but, alas, this only postpones the problem, because their term “is identical with” could just as well mean in our language “occurs together with” (Quine 1969b, 33). Furthermore, Quine thinks that from a logical point of view, it might be possible to translate one language into two different, consistent, but mutually exclusive, languages depending on the matrix of translation that we use (Quine 1969b, 55ff.). Finally, Quine’s holistic theory does not allow the simple one-to-one relation between propositions and reality or empirical evidence that Carnap was aiming at in

Prospects of Objective Knowledge

his theory of protocol statements. If, for example, an experiment falsifies a prediction it is only clear that the whole system of propositions in its specific conjunction is false, but we still do not know which basic propositions we have to give up. The main point, therefore, is that Carnap’s view of the theory-dependence of existence claims gets radicalized in many ways. One and the same reality allows not only different theoretical interpretations but even incommensurable ones. Even if Quine, as we have seen, confesses his realism again and again, the step to an explicit internalist theory and the step toward antirealism are not far apart. It is noteworthy that in Quine’s terminology, too, “the input” or the “empirical” is interpreted as coming from the “world outside”, which lies beyond the mind and propositional or mental representations; being open to radically different conceptualizations, it is in itself different from or neutral toward our theories. Theory, language, and conceptual frameworks, on the other hand, are considered to be the “subjective part”, the exclusively human addition that is eventually variable and “unworldly”. So again, we find the same questionable dualistic contravalence between concepts and reality in Quine’s approach as well.

Davidson argues in response to Quine that such a pluralization of incommensurable conceptual frames is not possible. Thereby he tries to exorcize the danger of idealism that was unwillingly or negligently evoked by Quine. Davidson wants to prove that there can ultimately only be one framework; everything that is formulated in different conceptual frames should in principle be translatable (Davidson 2006 [1973–1974], 196–208). To be more precise, after Davidson has shown this (i.e., that there can be no two radically distinct frameworks or schemes), he wants to argue that there is no dualism whatsoever between content and scheme and that we are in “direct touch with reality”.
We will come back to this remarkable idea later. In Davidson’s essay, however, it seems that this conclusion of a “direct touch” is more pulled out of a hat like a “gavagai” (or rabbit) than it is carefully developed and elaborated.10 But the main intuition, that in order to defend any realist position we have to give up the strong dualistic contravalence between mind or language, on the one hand, and the world or input, on the other hand, seems to be right. But how we can get rid of this dualistic conception, and with what better approach we can replace it, remains obscure. Even Davidson himself says that it is not clear how we can call any position an empirical realism if we give up this distinction (Davidson 2006 [1973–1974], 201). Later Davidson himself grew so frightened by Rorty’s objection that he withdrew the label “theory of correspondence” for his approach. One must agree with Rorty that it is not easy to see how you can still evoke the notion of a correspondence if you give up the distinction between the two elements that could correspond to each other (Rorty 1991, 126–150, 126ff.).

Putnam, in one of his changing assessments of the problem of realism, is the first to draw some explicit consequences from this dualistic assumption, which until now had been only implicit. Any nonnaive realism can only be an “internal realism” (Putnam 1981). We do not have any grasp of a reality as such beyond the reach of our

10 See also the very early critical essay (Rorty 1972, 3–18).

conceptualizations. His argument is well known and it renews an old basic epistemological insight: the conception of a nonconceptualized world as such is in itself a concept. To think that I grasp the realm that lies beyond my thinking is in itself an act of thinking. The bizarre aspect of the renewal of this insight is that this argumentation, which is actually an argument (with Hegel) in favor of realism (or to be more precise of objective idealism), is often understood as an argument against realism, and has become the standard argument for (subjective) idealism or constructivism. To conclude, the first part of this argument shall now be reconstructed and criticized in detail. But first let me admit that I think this argument, which says that any knowledge whatsoever is basically always knowledge of somebody and must therefore always be in accordance with his conditions of conceptualization, has such strong power because it is eventually a tautology – and one should not fight against tautologies since they are logically or conceptually necessary truths.11 In opposing certain “idealistic tendencies”, Nagel has already emphasized that if knowledge is always somebody’s knowledge (that of “a subject”), then it must be “subjective” in its form, but that does not in itself yet mean that it cannot have an objective content (Nagel 1986, 93). But the main question coming from the idealist side is: how do we know that the content is objective? It seems this merely verbal distinction of content and form does not help us much, because in order to know if the content is objective we must be able to (1) somehow detach it from our subjective mode of representation (call this the detachment theorem DT) and then (2) compare it with the real reality (let us call this the externalist conception of reality ER).
While Davidson wants to show against DT that such a separation is not possible (thereby radicalizing Quine’s holism, which would also raise strong objections against DT), the main point in the debate about Putnam’s internalism concerns ER.12 Following the basic insight of the later Wittgenstein and Putnam, it is not possible to get a “picture of the world” that really is “the world” and not “a picture”: we simply have no access to what the world would look like “as not seen by us”. Even our concept of the relation of our world and our conceptualization would, then, not be a true representation of this relation, but our model of it,13 so that any aspect of a “world as such” would always be the “world as such for us”, and is thereby assumed not to be the real world. We must use our concepts and constraints even in painting out the world beyond these constraints: to say it is different uses our notion of difference and to say it delivers input uses our notion of causality. In speaking about this world as such we cannot help but apply all those categories and conceptions that are the necessary conditions of any thought or speech act. Just postulating the existence of this outer world as a “Grenzbegriff” (limit-concept), as Kant does with his conception of the “Ding an sich” (thing in itself), or as “noise” in

11 On the other hand, Putnam, of course, wants at the same time to refute the idea that we are just brains in a vat (Putnam 1981). Just how to combine the tenets of internalism and externalism remains the crucial task.
12 See also Spahn (2010) for a critical analysis of the debate between Nagel and Davidson.
13 This point is elaborated in Rorty (1979, 89) against Putnam and Davidson: “No phenomenal terms can describe the phenomenon–noumenon relation, and we have no other terms”.

the way modern constructivists like Maturana (1980 [1970]) do, will not do the trick. It remains a self-contradiction to assume existence and causal influence of this world as such or to ascribe “nonascribability” to it. Even negative ascriptions are still (our) ascriptions.14

But does this mean that any realism will always turn out to be a disguised internalism or even idealism? This kind of argumentation may not be new and in a certain sense it is simply convincing, but it is wrong to think that this would constitute a viable argument against realism. It does so if and only if one has already accepted a dualistic contravalence of mind and world, human knowledge and objective reality. Therefore, this argumentation reduces itself to a harmless tautology: If our knowledge is not in correspondence with or does not represent the world, then our knowledge is not in correspondence with or does not represent the world. The truth of the matter is that an idealism or subjectivism does not follow from the logical (or better: semantical) impossibility of conceiving a world as such (or a neutral input) beyond our conceptual capacities, but it follows from the assumption that the real world is in its essence truly transcendent to our categories, such that it cannot be truly captured with human concepts, frameworks, or schemes. Only if one grants this assumption does one need to abstract from our conceptual ingredients, our distortions and subjectivities to get a clear view of the real world. By evoking the dualism of mind and world, or of man and nature, it is simply presupposed that concepts are subjective, unworldly, not objective and must be left behind insofar as they have their origin in the human mind, and the human mind is assumed to be opposed to nature.
If our perspective (as the term suggests) is not and cannot be an objective perspective, and if, secondly, we can never leave behind the fact that we have a perspective, then it is true that we remain “trapped” in “our merely subjective way” of seeing the world. This may, of course, be true: it may be that our conceptualizations and frameworks are altogether subjective and idiosyncratic, but the sketched arguments from ER and DT do not show it; they merely presuppose it.15 This can be shown by tentatively assuming that our concepts and frameworks could be objective, whereby we see at once that grasping the world as such with our categories would of course be possible, and the fact that we use “our” categories would not be a dilemma. This is so just because in this view the “world as such” is not defined as being in opposition or contradiction with the “world for us”; this opposition only makes sense if we already assume a gap between the possibility of conceptualizing things and the way the things are as such. The relevant contrast would then be something like “world misconstrued by us” as opposed to “world as such correctly reconstructed”. In this view ER would be totally wrong; it would be neurotic not to use the concepts we have and look for the truth only “outside of our thinking”. This attempt would amount to getting rid of your eyes by

14 This point is already at the core of the debate between Hegel and Kant about the Ding an sich, see Spahn (2004), and in relation to Nagel, Spahn (2010); a new version of this argument can be found in Hösle (1998, 1–40).
15 For a more detailed analysis of the idealist trap, see Spahn (2010).

tearing them out in order to see more clearly the undistorted world. Language, concepts, and frameworks are not “mediating” in a “distorting way” between us and the world once we give up the basic dualism.16 Even though we find hints in the modern debate in this direction of giving up this old dualism, there is not yet any elaborated theory available that does justice both to this more monistic intuition and to the fact that we can misrepresent things. A lot of questions would need to be discussed: the whole debate about reference,17 about what it is to be a concept, what it is to be a thing, and what relation of difference or identity could describe the relation of referring beyond the false alternatives of externalism or internalism. All of these questions need to be answered in detail in any modern attempt to defend a nonnaive realist position. It cannot be the task of this chapter, which aims at an overview, to get into a detailed debate of any of these questions; instead two short remarks shall shed some light on the road one might want to take starting from this antidualistic assumption. Firstly, against Nagel’s impressive attempt to defend realism (Nagel 1986), one might argue that we need more sound arguments for the capacity of grasping the world within our conceptual possibilities instead of insisting that the world as such in “a strong sense extends the reach of our mind” or concepts. A modern realism that rejects this dualistic intuition might not subscribe to Nagel’s emphasis on the realm that lies beyond our reach, and one might well defend a realism by accepting Davidson’s analysis and reinterpreting it in an objectivist way.18 It might be said that McDowell’s ambitious theory (McDowell 1996) is the most recent and most promising attempt to combine the insights of Sellars (1956) (giving up the “myth of the given”) and Davidson’s inclination to leave the dualism of scheme and content behind.
One can grant to McDowell that “the space of reason” and the “space of nature” must in a certain way be reunited even if one does not subscribe to his ultimately Kantian fashion of bridging the gap. But it remains correct that if a radical dualism of contravalence between man and world, mind and nature, can be refuted by its self-contradictory implications, then a realism that deserves the name might again have a chance in the debate. Two ways to overcome this dualism can be distinguished. One might follow the tradition of Logical Empiricism and broaden it: A conceptual or transcendental theory of meaning would not only need to get rid of nonsensical terms that are empirically empty but would also have to provide a criterion to detect and exclude those concepts (and thought experiments) that are self-contradictory.19 Maybe the idea of making a dualistic distinction between thought and reality is a candidate for

16 One crucial point is the distinction between the aspect of an “independent existence” of the world and the idea that it is “transcendent to knowledge in its structure”. Searle clearly sees the difficulties in using the dualistic metaphors of “inside” and “outside” for the relation of mind and world (Searle 1983, 37). For the senses, but not for language and thinking, he therefore likes to use the word “presentation” instead of “representation” (Searle 1983, 44).
17 Searle avoids the ontological questions concerning representation (Searle 1983, 11). Millikan (1984) tries to sketch a naturalistic answer based on biological categories.
18 See Spahn (2010).
19 A very impressive sketch of such a “transcendental semantics” can be found in Braßel (2005).

such a critical examination. Does not the view that says that the world is or could be absolutely different from our view presuppose that the concept of difference is applied correctly? Do we not need to have some knowledge about this difference and the way in which it is different (not like we see it, not conceptual but real, etc.) in order to make this claim?20 Assuming that our reasoning does not have any ontological valence is in itself an ontological claim about the relation between reason and the world and presupposes implicit or explicit assumptions about the way the world and our reasoning are structured. It might well be true that one can only argue in a circular fashion for any objectivity of our thinking, but surely one can only doubt the valence of reasoning in a self-defeating fashion. Establishing a radical dualism of “the conceptual” and “the real” presupposes in itself that this dualism is wrong; otherwise one could neither understand what this dualism is supposed to mean nor could one argue in favor of it.

In addition to this first way of conceptual clarification and transcendental reflection on the implicit assumptions of skeptical conclusions (which of course would need further elaboration and discussion), there is secondly a somewhat “inverted” but complementary approach. If this dualistic assumption cannot be defended without running into the traps of self-contradiction, then just how plausible, from an evolutionary perspective, is the insistence on the dualism between the human, unworldly mind and a nature that remains in its core inaccessible to us? Is not the interpretation of the results of evolutionary approaches pervaded by this old dualism of a “separated” nature of humans, even if it is coated in a language of humility, e.g., when one says: “Humans don’t have any special or exceptional way to the truth because any evolutionary thinking would have to deny such wishful thinking of exceptionality”?
But let us take a closer look: perhaps even this interpretation is a disguised way of still sticking to the idea of an exceptional human nature because it implies that thinking is not pervaded by nature or continuous with its objects but remains the alien, uniquely human, or special trait that lies beyond nature. This second more interdisciplinary and integrative question of balancing cognition and objectivity and avoiding one-sided approaches shall now be addressed. Should not a more detailed look at the evolutionary debates in cognitive sciences and philosophy make this dualism that was already proven to be conceptually and logically defective even more dubious?

3 The Implausibility of a Cognitive Separateness of Human Thinking from Nature

Many modern approaches to cognition begin with the claim that cognition is essentially a biological phenomenon: “Where there is life there is mind, and mind in its most articulated forms belongs to life”. This is for example E. Thompson’s view (2007, ix)

20 This idea also seems to be the driving force behind Hegel’s early discussion of ancient and new skepticism; see Spahn (2011).

following Varela and Maturana (Maturana 1980[1970]; Maturana and Varela 1980 [1970], 1992),21 but also taking up Hans Jonas’ insights (Thompson 2007, 26f.; Jonas 1966). Millikan in her defense of a teleosemantic approach to the problem of realism ascribes “language and thought” to the “biological categories” (Millikan 1984). Popper (1972, 1984) already proposed a view that all life is “problem solving”; Lorenz (1941, 1973) argued that life from its simplest forms of noncognitive adaptation via behavior and cognition up to our intellectual life must be understood as “information gathering”. The concept of cognition and knowing that is spelled out in these views is thereby already placed within the world: Knowing is linked to acting, or more specifically to surviving within a given environment. There is no need to prove the existence of an external world since the idea of an unworldly subject (ego) is regarded as completely misguided. These views are therefore intrinsically antidualistic; they start by assuming that a contravalence between world and mind or cognition does not exist. Since the categories or cognitive makeup of the “subject” are derived from a long history of repeated interactions with the world, it seems plausible to assume that cognition is not “outside of” the world but rather infused within it. Against a Kantian or subjective-idealist view, these approaches take a stance that one might call Heideggerian: Being-in-the-World (In-der-Welt-Sein) is the fundamental starting point. But is it true that this concept of cognition really supports the idea of a world-relatedness of cognitive achievements and of an objectivistic realism? Or can arguments be found even within these views that support, on the contrary, a more Cartesian doubt and suspicion that knowledge is still not objective but illusory and unworldly?
Even if these views take an evolutionist stance, we will see that it is not so easy to say what consequences for cognition should be drawn from this starting point. But first of all, one should appreciate the broadening of the concept of cognition such that it not only refers to a uniquely human rationality and world-directedness but includes many different kinds of “knowing” about the world. Of course “information” in genes, adaptation in designs [Lorenz (1941) says that the fin of a fish represents stored knowledge about the environment], covariance of inner and outer states, emotive responses to the external world, proto-conceptual and conceptual or linguistic knowledge must be distinguished. But it seems natural to follow Thompson (2007) in assuming that all these types of “Intentionality” or “aboutness” come into the world once we have replicating and interacting organisms. The basic structure of a semiotic relation or Intentionality in Searle’s sense (Searle 1983) – the basic precondition for any attribution of knowledge about the world – only comes into existence through biological relations: Something (x, for example, sucrose) counts as (r: “is perceived as, is used as, is acted upon in such a way. . .” or “has the meaning of”) something else (food, a signal, danger, etc.) for somebody (an organism) (Thompson 2007, 146).

21 See also Maturana’s famous definition that life is cognition in Maturana (1980[1970]).

Through a metabolic system that must sustain itself, a net of “meaning and valuation”22 is imposed on the world by the organism. The environment is “not just” a physico-chemical milieu; it must be regarded as being either possibly functional or dysfunctional for the given “aim” of reproduction and self-sustainment. In this sense, organisms impose “meaning” or relevance on a merely neutral realm of physical facts (Thompson 2007, 145).23 This seems to establish the basic teleonomical relations that modern theories of cognition, Intentionality, and world-relatedness need to expand, in such a way that we still understand why and in what sense we can or even have to employ them for these cases. A full history of cognition would have to both stress the common structure of aboutness and Intentionality on all levels of cognition and find the basic and essential steps that slowly transform this structure from a mere functional relation without explicit representations (embodied knowledge) via different modes of awareness and sensual inference to a full-fledged conceptual cognition. But before we can look at this point in more detail, we should first try to examine the possible consequences of such an approach of inner-world cognition for the problem of objectivity.

3.1 Constructivism and Realism in Evolutionary Epistemology

It is not surprising that this “biologization of reason” has led to two radically different interpretations: a realistic one (Lorenz 1941, 1973; Popper 1972, 1984; Vollmer 1990) and a more constructivistic one (Maturana 1980[1970]; Maturana and Varela 1980[1970], 1992). The idea of adaptation seems to suggest that the gathering of information about the environment is always checked for correctness or at least viability by natural selection (Lorenz 1941): organisms make different “hypotheses” about their environments and only those that are not completely misguided can survive. On the other hand, the fact that different organisms can survive with very different cognitive makeups seems to suggest a form of species-relativism. In the end, perhaps only viability as such (and not “truth”) is what selection presses for (Maturana 1980 [1970]). This may even include the suggestion that a distortion of reality, if it fits the needs of an organism, could be more useful than an “objective” picture. Even when Maturana later emphasizes the idea of a “structural coupling” with the environment (Maturana 2003), it remains true – as Vollmer (1990, 127f.) also concedes – that an evolutionary realism might at best only give us reasons to trust our “mesocosmic” world view. To understand scientific knowledge (which includes

22 Thompson following Varela speaks of “sense-making” (Thompson 2007, 147). Thompson in his thoughtful “philosophy of the organism” also refers to Kant’s ideas that get expanded in Hegel’s philosophy of biology; also Hegel sees the connection of the structure of organisms and the process of “sense-making” (Spahn 2007, 194; Brandom 2004).
23 Also Millikan exploits biological concepts to build her theory of Intentionality (Millikan 1984, 17–50).

knowledge about evolution) that seemingly successfully transcends the limits of our mesocosmic beliefs and prejudices, we cannot rely on an evolutionary realism. To defend science we must consider more abstract arguments for objective knowledge (like coherence and consilience, logical conclusiveness, and so forth) that specifically urge us to leave the evolutionary constraints of our intuitive knowledge and thinking behind.24

Why is it not surprising that we find these two opposing ways of interpreting the results of evolutionary epistemology? There are reasons for these two options that can be traced back to the structure of Darwin’s theory itself, and, more importantly, they are reasons that have to do with the aforementioned dualism having an impact on our interpretation. We find two opposing principles in Darwinism: the idea of selection and that of random, i.e., nondirected mutation. These principles open up a field for oscillating approaches of adaptationism, antiadaptationism, and synthetic interim solutions. At first glance, if one emphasizes adaptationism, then that seems to support Lorenz’s, Vollmer’s, or Popper’s realistic arguments. If one emphasizes the drift or “internal constraints”, then apparently the opposite assumptions about cognitive reliability follow. But of course, it is not that simple. There is, for example, a bizarre contravalence in the question as to whether something is either to be understood as an “adaptation” or as an expression of internal constraints. Of course, any property of a given organism – be it adaptive, maladaptive, or neutral – must follow internal constraints as Maturana (1980[1970]) rightly argued: viability is the first imperative of any organism that wants to be able to survive (this insight is close to a tautology).
Therefore, the question cannot be whether something is either adaptive or follows internal constraints; rather, the question is what, among all of the possible options within the given constraints, could rightly be called an adaptation. There is an obvious analogy to the debate about realism here: the question cannot be which aspects of our knowledge we owe to the inner constraints or architecture of our cognitive makeup, what is infused by our concepts, or what “comes as an addition from within”. All our knowledge is our knowledge and it is therefore shaped by the “constraints” of our cognitive apparatus. So the real question is not whether there is a true kind of knowledge outside of our “cognitive constraints” and our principles of thinking; the question is which of our cognitive endeavors are guided by merely idiosyncratic concepts and which are guided by true and objective concepts (that could not reasonably be named “constraints” even if they “come from within”). Just as a “true adaptation”, whatever that would be, could not be meaningfully conceptualized as being against or outside of the framework of given constraints, so too true knowledge could not be defined as something that lies by definition outside of our possibilities or capacities just because they are possibilities that employ “inner” mechanisms of “interpretations”. In short, the question is not whether we have internal frameworks of interpretations, but rather which of them are correct or reliable.

24 This point is also made by Hösle (1988).

Beyond this open field of interpretations due to the tension between the necessity of certain traits for survival and mutational, historical or structural contingency, there is a more profound philosophical problem. Adaptation, even if it exists, and even if it exists within the framework of given “constraints”, simply does not guarantee truth. It has already been said that any bias that distorts a true picture of the world but yields a positive survival value can be successfully selected.25 Behind this attack against the main tenet of a realistic evolutionary epistemology lies a more basic philosophical insight. There is always a difference between the traditional realistic definition of truth (i.e., “correspondence” of thought and reality) and the criteria for truth (success in a process of selection, coherence, the consensus of scientists, etc.). But if there is a gap between the definition of truth and the criteria for knowing that something is a true proposition, then one can always imagine scenarios in which the criteria are fulfilled but there is no truth. Of course, again, after stating this gap or difference, we have to be aware of misleading dualisms. Here, too, we should not ask whether something is either only a survival-guaranteeing (or simply viable) image, or (“instead of being this”) a true objective picture. It might be favorable for survival not to have too “subjective” a distortion of reality. Even interpretations according to the guidelines of “subjective relevance” (for example in the first and simple emotional or behavioral functions) should be objective. To consider something as “poisonous” or “non-poisonous” might not exhaust all of the truths there are about a given food, but it should at least be objectively correct in this given aspect (Wandschneider 1999, 82f.). Thus, it remains true that adaptation as such does not guarantee truth, but it is also true that it does not make it unlikely or even impossible.
The main point, however, is that any Evolutionary Epistemology simply cannot answer the basic problems concerning objectivity in knowledge. A positive solution is a vicious circle,26 as has been pointed out many times, and the negative solution entails a self-contradiction. The second approach, even with all of its problems, is more interesting because it expands the suspicion against uniquely human features to the realm of cognition and still operates within a very naive dualism of nature and reason, knowledge and world. Just because something is the product of internal cognitive mechanisms does not mean that it must be a construction. It could still be a reasonable re-construction that might go beyond the mere physical input, but that still relies on schemes of interpretation established by evolution. So by leaving the reduced information of the input behind and expanding it with evolution-based, learned preknowledge (or prejudices), one could still be closer to the world than by just slavishly adhering to “the given input”. In the literature about optical or other illusions, however, we very often find the opposite interpretation (see for example Maturana and Varela 1992). Take for

25 The practical relevance of perception and cognition is therefore emphasized in the “enactive approach” to cognition; see for example: Thompson (2007, 1–87), Noë (2004).
26 One presupposes the objectivity of science (i.e., a product of thought) in order to argue in favor of a trust in thinking; see Hösle (1988).


C. Spahn

example Adelson’s well-known checkerboard illusion27: it seems impossible to see that the two fields (one in the shadow of an object) have the same color. First of all we know of course that they have the same color or else we could not speak of an illusion. Any insights into constructions or illusion always presuppose that we actually know better. We must have positive knowledge about the world to call something a construction or an illusion. Secondly, it is a strange philosophical interpretation to say that the input in these cases gets falsified by an ill-operating mechanism of interpretation. Reality and construction should not be conceptually opposed in this way. The input, i.e., whatever hits the retina, is supposed to be the real world, the actual material input that is “really there.” What happens next is in the standard interpretation of perception described as a subjective processing and transformation of this objectively given material. But does it really make sense to say that the world is given to us in or by the input? Would it not be more true to say that, in fact, we only have light rays, sound waves, etc., i.e., not the world but units of information that are fragmentary and partial and that require an intelligent integration and completion? By going beyond the given, by using top-down modulation of input processing we do not necessarily go beyond the real. We envisage what could be the real case or real cause behind what seems to be present in or at our senses: Colors do appear darker in a shadow under normal circumstances; checkerboards usually have only two colors. This preknowledge surely seems to be built in our system and it does not come from the sensory data as these illusions show. It is easy to see that this is not some random predecision or prejudice, but that it makes sense to interpret this world in just this manner even if we can then be tricked by special cases or computer simulations. 
To analyze what kind of preknowledge we have might deliver some clues about the prevailing standard circumstances of our history of evolutionary selection. The list of examples of contravalent interpretations could be prolonged; most arguments against realism from illusion rely on this: the input is the “real” – the addition is the “construction”.28 Instead of pursuing this dualism, it might be more fruitful to ask just why we have this mode of interpretation and why it was selected. The most important point would then be this: If science is possible, if knowledge about illusions is possible (or presupposed as real in all of these cases), then we should be able to explain how these cognitive achievements are possible and not consider them as something that is so idiosyncratic and misleading that we cannot even understand the possibility of finding out about it. But in order to do this, we need more than empirical research; we also need conceptual clarifications and reflection about how to spell out the relation of “internal” and “external” facts. What prejudices do we already accept if we assume that mind, cognition, or the “inner processing of information” has to be understood

27 http://web.mit.edu/persci/people/adelson/checkershadow_illusion.html.
28 Vollmer (1990), in his defense of a hypothetical realism, also opposes this contravalent view and argues that there are reasons why, for example, we see more clearly within a certain light spectrum (Vollmer 1990, 118f.).


as unworldly and that the input, the physical surface reaction on sensory organs (not “the fact” behind this), has to be understood as “the reality”? It could be that the same oscillation between realism and subjective idealism that was described in the first part of this chapter also continues to go on as long as the conceptual groundwork remains unfinished. To conclude this point, however, let us ask what remains of the basic insight against realism that could be taken from these considerations of evolutionary arguments, even if a radical subjective interpretation will, of course, eventually undermine its own validity. As it has been said, the important point is to recognize that knowledge about evolutionary facts cannot solve the questions concerning realism. Instead, to appeal to what we have learned from the life sciences or from modern research already presupposes a positive assumption about the reliability of science and therefore of our cognitive abilities. It is also worth noting that any skepticism concerning our cognitive abilities also relies on a positive assumption about the reliability of science because it uses scientific results and insights to fuel this skepticism.29 But this shows that only one of the interpretations can be a research paradigm that is both fruitful and does not undermine itself. The relevant philosophical issue is therefore not to ask if something is reliable because it is an adaptation or has been positively selected. Instead, if we can find convincing epistemological arguments for realism and the reliability of cognition – whether because one can defend a philosophical realism or simply because even the acceptance of the theory of evolution and of scientific results already implies a commitment to objectivity – then the question is how is it possible given evolution that objective cognition could emerge? 
The question that any complete and not self-undermining phenomenology of the mind should answer is therefore not whether the idea of objective cognition is plausible given evolution, but how, given our ability to grasp facts about the world, such an ability could have evolved naturally without assuming any external ingredients, miracles, or divine help or interaction. What are the necessary preconditions and steps, and how is a path from idiosyncratic and partial adaptation, from emotional and survival-oriented distortion, from mesocosmic limitation, to scientific investigation and objective thinking possible? What steps of cognition do we have to assume and how can we explain the transformation from one step to the other given our biological, neurological, and evolutionary knowledge?

3.2 The Evolution of Cognition: Balancing Internalism and Realism?

The approaches of Millikan (1984, 2004), Thompson (2007), Noë (2004), Proust (1999), Goodson (2003) and Donald (2001), but also the earlier insights from Lorenz (1973), are important steps toward a comprehensive typology

29 This point that any skeptical interpretation of science relies on our cognitive ability is made against the “grand illusion” claims in modern theories of perception by Ludwig (2006).


of organic cognition “from the amoeba to Einstein”. One of the most important common insights of these views is the idea that at some point between objectivity and interaction, on the one hand, and subjectivity and representation, on the other, a gap has to develop: misrepresentation and readjustment open the space for both errors and real knowledge that goes beyond mere input–output relations that are not yet conceptual knowledge (Proust 1999). The basic question then becomes very similar to the conceptual questions in the debate between internalism and externalism: just how does this gap appear and grow and what are the more and more complex steps of bridging this gap? How, through evolution, does cognition develop between internalizations and higher reintegrations of externality? If one tries to give a very brief summary sketch of the very different approaches of Millikan, Thompson, Donald, Lorenz, and Goodson, then the rough common outline of a more detailed picture behind the differences seems to be this: First, most of these theories start with a purely functional “external” concept of knowledge or cognition. Here cognition, morphology, self-sustainment, and actions are structurally coupled. “Knowledge” is not yet an individual achievement or “learned”, but rather is just the result of an external evolutionary selection. The coupling of actions or motor programs and morphologies to external triggers and circumstances is just a product of repeated selection and interaction (Lorenz 1973; Millikan 1984). But with this coupling of motor and sensory abilities for the first time, a second kind of “knowledge” comes into the world that exceeds pure “morphological” or “bauplan” in-formation that we might find in plants or, as Lorenz (1941, 98) puts it, in the fin of a fish. The first forms of this structural coupling of motor and sensory activities are just a product of evolution: think of taxes in bacteria (Lorenz 1973, 74–78). 
Still you might call these first action patterns a kind of inner representation or “aboutness” insofar as certain triggers are considered to be (or acted upon as if they were) representing something else. The coupling of a bacterium that swims slower in a more nutritious milieu should be better causally connected to a fact in the world that, under normal circumstances, would “indicate” a nutritious milieu. (Lorenz 1973, 76). But according to Proust, we would not yet call this a real “inner representation”. Only the ability for cross-modal integration and recalibration of an organism’s own cognitive processing according to criteria of coherence could really be understood as the achievement of distal information. If there appears to be incoherence between the inputs of different senses, some organisms have the ability to evaluate this as a “misrepresentation” and are able to recalibrate their sensory cognition (Proust 1999, 51–56). This ability transcends the mere mechanical input–output coupling that is given by evolution toward a more flexible response that is guided by internal corrections and therefore by first “interpretations” and “critical internal corrections” of the input. One might even argue that this is the crucial precondition for something like a “pre-conceptual” knowledge (Proust 1999).30 To put it in a rough

30 See also Ruth Millikan’s (2004, 158f) emphasis on the ability to misrepresent things which is crucial for any real capacity for having representations.


slogan: “input-fetishism” gets left behind: it does not matter what hits the senses, but rather what “should” hit the senses if the world were coherent. This surely constitutes a new step in the deepening and overcoming of the gap between subjectivity and objectivity or reality. The next true step forward in the development of cognition (again sketching a very broad picture) would of course be achieved if the ability of recalibration becomes more and more a matter of individual modification and later individual control. This is the step from external coupling and corrections due to given internal constraints to the ability to also individually change and alter those constraints. Lorenz (1973, Chaps. V and VI) precisely depicts several different steps here. Individual modification starts with habituation, facilitation, and amplification up to “open programs”. Even here he distinguishes between inflexible (but still individually acquired!) first imprints and the increasing freedom in ways to couple triggers and actions by reinforcement, individually acquired associations, play-behavior, and learning. All of these kinds of “individual actions” are different from the first mode of sensory–motor coupling because they imply mechanisms of feedback of success and failure. A bacterium might be able to act in a way that is somewhat adequate to different “individual” situations, but one reason we would not say that it can learn is precisely because this mechanism is missing: learning in a higher sense presupposes feedback of success, i.e., it presupposes memory (Lorenz 1973, 122, 177). What we see here is the slow transition from an external phylogenetic “acquiring” and accumulation of information via natural selection to “ontogenetic” mechanisms of information gathering that in some branches of the animal kingdom become more and more important. 
All this knowledge, however strongly it relies on mechanisms of individual learning, is “action knowledge”: it remains within the “pushmepullyu logic” as it is labeled by Millikan (2004). Only if the ability to “break up” learned motor or informational programs openly and freely evolves, the ability to recombine the given programs in an open way with any given trigger or situation, do we then have a completely new level of cognitive abilities that mainly consist in leaving passive conditioning (as described by the behaviorist school) behind and replacing it more by active learning, exploring, and “trying things out”.31 If one were to name a trend in this development, we could say that it follows a path from external and strict givenness in a “bauplan”, from inflexible but clever given and selected mechanisms up to more and more space for internal reconfiguration: higher cognition must be understood as this ability to relate to its own methods of information-processing in such a way that the chain from input to output, input to representation, could be separated and be recombined. When it comes to “the inner logic” of cognition, we can identify the same transition: From a given impression we come to an evaluation and ordering of impressions according to an “inner ontology” based on knowledge of possible actions: the world is not only passively observed but gets evaluated for

31 The classic critique against behaviorism can, of course, be found in the profound analysis of Merleau-Ponty (1942), which is also extensively discussed in Thompson (2007, esp. 73–81).


its “affordances” and possibilities. One of the most important points of the idea of the “enactive approach to cognition” (Thompson 2007; Noë 2004, 1–33) or of “perception in action” is just this interaction of knowing-how and knowing-what. But this means that knowledge in this sense is always more than just perceiving what is “objectively” (i.e., physically) given on the sensory surface: it is evaluating it in the light of preknowledge and opportunities. This establishes a complex interplay of subjective modes of interpretation and the aim of objectively needing to know more about the world than what is available through the senses and the given perspective from here and now. For example, if I see certain materials, I already know or expect what they would feel like if I were to touch them; if I am not impaired by any defect and see a coin on the table, I know that it only appears to have an oval shape, but that in fact it is circular (Noë 2004, 102). By acknowledging this, the modern approaches of evolutionary epistemology and evolutionary psychology have moved beyond the old ideas of a tabula rasa cognition or of a free and flexible learning-from-scratch-by-association found in behaviorism. Human cognition, as Tooby and Cosmides famously describe this paradigmatic change in how we look at cognition, can no longer be seen as an instinct-deprived “all-purpose” computer, but rather it relies heavily on preknowledge and instincts, on cognitive preparedness for learning certain things more easily than others (Tooby and Cosmides 1992). From this starting point of an evaluation of the given by using our preknowledge we could come to a higher form of cognition by evaluating and maybe flexibly changing the preknowledge itself. Simulating actions in thought and understanding a match or mismatch between expectations and actions or results, and then using language32 and reflection, might be a way to know and get hold of one’s own evolutionary prejudices. 
This may lead to the possibility of freeing oneself from these prejudices, or instead to actually know that it is good to rely on them. Finally, since language is a social product it can be understood as a common cognitive effort or as a common interpretation of the world. As Tomasello rightly argues: a language implies a world view: I can learn and see what things mean to others and how one could or even should look at them. Language offers a normative dimension for shared Intentionality, even for the creation of social facts: We proclaim that some metal counts as money, and it has this value partly because we think and commonly believe that it has it (Tomasello 1999; Searle 1995). Through evolution, mind and cognition have to struggle to adapt to an environment, but in language and culture the mind creates its own environment according to its own criteria and evaluations: a world of culture that can only exist and carry on if it survives the critical scrutiny according to shared rational standards. According to this very brief sketch of what a phenomenology of cognition today might look like, one could say along the lines of Hegel that we eventually export the

32 Donald (2001) emphasizes among other aspects the role that language plays to get hold of the content of memory to have an even more free and flexible way to deal with things you have learned.


search for inner coherence that was guiding the struggle of internal and external aspects of cognition, the balance of the subjective and the physical or objective, to the world of culture by creating a sphere in which “the given objectivity” is itself a product of our subjectivity. The task remains to truly elaborate both on a conceptual level and on an empirical level a synthesis of what we know today in philosophy and in the cognitive sciences: about cognition, subjectivity, errors or skepticism, realism, and the success of science, even if any result will always be provisional for the present time. Any such integration can, however, only avoid undermining itself if it elaborates a clear conceptual reflection about the concepts of “inside” and “outside” and “subjectivity” and “reality” and if it does justice both to what we can know and to the fact that we already employ our cognition when we think about its evolution.

Acknowledgment I would like to thank Christian Tewes, Annett Wienmeister, and André Wunder (University of Jena) for many fruitful and inspiring discussions. I would also like to thank Duane Lacey (United Arab Emirates University) for correcting my English.

References

Brandom RB (2004) Selbstbewußtsein und Selbst-Konstitution. Die Struktur von Wünschen und Anerkennung. In: Halbig C, Quante M, Siep L (eds) Hegels Erben. Suhrkamp, Frankfurt/M, pp 46–77
Braßel B (2005) Das Programm der idealen Logik. Königshausen & Neumann, Würzburg
Carnap R (1931) Die physikalische Sprache als Universalsprache der Wissenschaft. Erkenntnis 2:432–465
Carnap R (1950) Empiricism, semantics, and ontology. Rev Int Philos 4:20–40
Carnap R (1961 [1928]) Der logische Aufbau der Welt: Scheinprobleme in der Philosophie. 2. Aufl. F. Meiner, Hamburg
Davidson D (1973–1974) On the very idea of a conceptual scheme. In: Proceedings and addresses of the American Philosophical Association, vol 47, pp 5–20, reprinted in Davidson 2006, pp 196–208
Davidson D (2006) The essential Davidson. Clarendon Press/Oxford University Press, Oxford/New York, NY
Donald M (2001) A mind so rare: the evolution of human consciousness. W.W. Norton, New York, NY
Dummett M (1963) Realism, reprinted in: Truth and other Enigmas, Harvard University Press, Cambridge, MA, 1978, pp 145–165
Goodson F (2003) The evolution and function of cognition. Lawrence Erlbaum Associates Publishers, Mahwah, NJ
Hösle V (1988) Tragweite und Grenzen der evolutionären Erkenntnistheorie. Z Allg Wissenschaftstheor 19:348–377
Hösle V (1998) Objective idealism, ethics, and politics. University of Notre Dame Press, Notre Dame, IN
Jonas H (1966) The phenomenon of life: toward a philosophical biology. Harper & Row, New York, NY
Lorenz K (1941) Kants Lehre vom Apriorischen im Lichte gegenwärtiger Biologie. In: Blätter für Deutsche Philosophie 15, S. 94–125
Lorenz K (1973) Die Rückseite des Spiegels. Piper, München


Ludwig K (2006) Is the aim of perception to provide accurate representations? In: Robert JS (ed) Contemporary debates in cognitive science. Blackwell, Malden, MA, pp 259–274
Maturana H (1980 [1970]) Biology of cognition. In: Maturana HR, Varela FJ (eds) Autopoiesis and cognition: the realization of the living. D. Reidel, Dordrecht, pp 1–18
Maturana H (2003) Autopoiesis, structural coupling and cognition: a history of these and other notions in the biology of cognition. Cybern Hum Knowing 9(3–3):5–54
Maturana HR, Varela FJ (eds) (1980 [1970]) Autopoiesis and cognition: the realization of the living. D. Reidel, Dordrecht
Maturana H, Varela FJ (1992) The tree of knowledge: the biological roots of human understanding. Shambhala/Random House, Boston, MA/New York, NY
McDowell J (1996) Mind and world: with a new introduction. Harvard University Press, Cambridge, MA
Merleau-Ponty M (1942) La structure du comportement. Presses universitaires de France, Paris
Millikan R (1984) Language, thought and other biological categories: new foundations for realism. MIT, Cambridge, MA
Millikan R (2004) The varieties of meaning. MIT, Cambridge, MA
Nagel T (1986) The view from nowhere. Oxford University Press, Oxford
Noë A (2004) Action in perception. MIT, Cambridge, MA
Popper KR (1972) Objective knowledge: an evolutionary approach. The Clarendon Press, Oxford
Popper KR (1984) Evolutionary epistemology. In: Pollard JW (ed) Evolutionary theory: paths into the future. Wiley, London
Proust J (1999) Mind, space and objectivity in non-human animals. Erkenntnis 51(1):41–58
Putnam H (1981) Reason, truth, and history. Cambridge University Press, Cambridge, MA
Quine W (1948) On what there is. Rev Metaphys 2(5):21–38
Quine W (1960) Word and object. MIT, Cambridge
Quine W (1968a [1958]) Speaking of objects. In: Proceedings and addresses of the American Philosophical Association, vol 31, pp 5–22, reprinted in Quine W (1969) Ontological relativity: and other essays. Columbia University Press, New York, NY, pp 1–25
Quine W (1969a) Ontological relativity: and other essays. Columbia University Press, New York, NY
Quine W (1969b) Ontological relativity. J Philos 65(7):185–212, reprinted in Quine W (1969) Ontological relativity: and other essays. Columbia University Press, New York, NY, pp 26–67
Quine W (1981) Theories and things. Harvard University Press, Cambridge, MA
Rorty R (1972) The world well lost. J Philos 69(19):649–665
Rorty R (1979) Transcendental arguments, self-reference, and pragmatism. In: Bieri P, Horstmann R-P, Krüger L (eds) Transcendental arguments and science. D. Reidel, Dordrecht, pp 77–103
Rorty R (1991) Objectivity, relativism, and truth. Cambridge University Press, Cambridge
Searle J (1983) Intentionality, an essay in the philosophy of mind. Cambridge University Press, Cambridge
Searle J (1995) The construction of social reality. The Free Press, New York, NY
Sellars W (1956) Empiricism and the philosophy of mind. In: Feigl H, Scriven M (eds) Minnesota studies in the philosophy of science, vol 1. University of Minnesota Press, Minneapolis, MN, pp 253–329
Spahn C (2004) Was können wir (nicht) wissen? Vernunftkritik und metaphysischer Optimismus bei Hegel, Kant und in der Gegenwart. In: Ingensiep HW, Baranzke H, Eusterschulte A (Hg) Kant-Reader, Königshausen und Neumann, Würzburg, pp 34–52
Spahn C (2007) Lebendiger Begriff, begriffenes Leben: Zur Grundlegung der Philosophie des Organischen bei G.W.F. Hegel. Königshausen und Neumann, Würzburg
Spahn C (2010) Ein anderes Argument für den Idealismus? Nagel’s Ausweichen vor der Idealismusfalle. In: Geier F et al Perspektiven philosophischer Forschung, forthcoming
Spahn C (2011) Alte, neue und ganz neue Skepsis? Hegels Begründung der Philosophie. In: Ficara E (ed) Die Begründung der Philosophie im deutschen Idealismus. Königshausen und Neumann, Würzburg (in press)


Spahn C, Tewes C (2011) Naturalismus oder Integrativer Monismus? Zur Verhältnisbestimmung von Natur und Geist. In: Köchy K, Kolmer P (eds) Gott und Natur. Philosophische Positionen zum aktuellen Streit um die Evolutionstheorie. Karl Alber Verlag, Freiburg/München, pp 141–185
Thompson E (2007) Mind in life: biology, phenomenology, and the sciences of mind. Harvard University Press, Cambridge, MA
Tomasello M (1999) The cultural origins of human cognition. Harvard University Press, Cambridge, MA
Tooby J, Cosmides L (1992) The psychological foundations of culture. In: Barkow JH, Cosmides L, Tooby J (eds) The adapted mind: evolutionary psychology and the generation of culture. Oxford University Press, New York, NY, pp 19–136
Vollmer G (1990) Evolutionäre Erkenntnistheorie: angeborene Erkenntnisstrukturen im Kontext von Biologie, Psychologie, Linguistik, Philosophie und Wissenschaftstheorie. Hirzel, Stuttgart
Wandschneider D (1999) Das Problem der Emergenz von Psychischem – im Anschluß an Hegels Theorie der Empfindung. In: Hösle V, Koslowski P, Schenk R (eds) Jahrbuch für Philosophie des Forschungsinstituts für Philosophie Hannover. Band 10, pp 69–95


Long-Term Planning and Prediction: Visiting a Construction Site in the Human Brain

Ricarda I. Schubotz

“Whatever these uniquely human functions are, they must be so subtle that even clever scientists do not see them. In other words, the clue is to look for differences between humans and nonhumans that are not obvious” (Tulving and Kim 2007, p. 334).

Abstract Long-term planning and prediction have been suggested to be human faculties that we do not share with other species. Here, evidence is reviewed for a neurofunctional gradation of prediction and planning functions that is located in the frontal lobes and reflects a more or less continuous time scale. It shows that those parts of the gradation that reflect long-term planning and prediction, particularly the frontopolar cortex, appear to mature later during ontogeny and are suggested to have evolved later during phylogeny than those underlying short-term planning and prediction. The review is followed by the report of a series of studies in human subjects that aimed to further our understanding of cognitive planning and prediction and its neural basis in the frontal lobes.

1 Introduction

Long-term planning and prediction have been suggested to be human faculties that we do not share with other species. But where does short-term end and where does long-term begin? Short-term planning and prediction refer to the preparation of movements and immediate concrete perceptual expectations, whereas long-term planning and prediction allude to the strategic modulation of behavior and the coarse anticipation of upcoming events. Nobody would deny that macaques are able to engage in highly skilled motor planning, but everybody would agree that they lack foresight in the proper meaning of the word. However, long and short are relative terms and hence are ill suited to build a clear criterion to mark off functions as being unique to humans.

R.I. Schubotz Max Planck Institute for Neurological Research, Gleueler Str. 50, 50931 Köln, Germany e-mail: [email protected]

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_4, © Springer-Verlag Berlin Heidelberg 2011


Whenever the issue of planning and prediction is addressed in cognitive neuroscience, there is talk of the frontal lobes (e.g., Ingvar 1985). It goes well beyond the scope of this chapter to recapitulate the anatomical and functional properties of the frontal lobes or to provide a solution to the diversity and heterogeneity of models and accounts dedicated to these topics. Rather, discussions are based on the following question: Are functions of planning and prediction organized within the frontal lobes along a neurofunctional gradation or hierarchy that reflects a more or less continuous time scale? If such a gradation can be identified and if long-term planning and prediction are indeed a human-specific faculty, those parts of the gradation that reflect long-term planning and prediction should mature later during ontogeny and should have evolved later during phylogeny than those underlying short-term planning and prediction. Hierarchical models of frontal lobe function have a long tradition in brain research. Originally, they are rooted in the observation of morphological characteristics of cortical layering and of connections between adjacent as well as remote cortical fields. Before turning to function, these structural foundations will be considered in the following. As a caveat, the allusion to hierarchies and hierarchical organization is immanent in many levels one encounters here. At first glance, it seems evident what is meant when we say that hierarchies describe structure in evolution, cognitive abilities, or brain networks. However, this notion can be confusing as it implies a patriarchal system, i.e., a multilevel master–slave system or monohierarchy. Actually, brain networks are more appropriately conceived of as building modules with a characteristic intrinsic and extrinsic connectivity, i.e., modular hierarchies (Passingham et al. 2002; Kaiser 2007). 
In the mammalian brain, projections are arranged in clusters of cortical areas that are closely linked among each other, but less frequently with areas in other clusters. Importantly, such structural clusters broadly agree with functional cortical subdivisions. A hierarchical cluster architecture has been suggested to provide the structural basis for the stable and diverse functional patterns observed in cortical networks, a mechanism for preventing large-scale activation as, e.g., during epileptic seizures (Kaiser 2007). When hierarchies of function are mentioned in the following, as done by many authors, this should not imply that lower levels in that system cannot do without the higher ones; typically, the opposite can be observed, i.e., if “higher” levels are active, lower levels are active as well, but not vice versa. In order to avoid ambiguous meanings of the term “hierarchy”, it will be often replaced by the reference to “gradations” in the following.

2 Morphology-Based Evidence for a Frontal Gradation

Based on the cyto- and myeloarchitectonic properties of neighboring cortical areas, and on phylogenetically as well as ontogenetically comparative studies in humans and many other mammalian species, Sanides (1962) described stepwise directional changes of cytoarchitectonic and myeloarchitectonic properties of

Long-Term Planning and Prediction: Visiting a Construction Site in the Human Brain

81

adjacent cortical fields. He took these so-called gradations (a term he adopted from his teachers, the Vogts; cf. Vogt and Vogt 1919, p. 269) to reflect the main directions of evolutionary differentiation. Among these structural, i.e., cyto- and myeloarchitectonic, gradations, he describes one running “polewards”, including a principled anterior trend in the lateral frontal cortex (Sanides 1962, p. 190). Roughly, this gradation originates from anterior cingulate fields (portions of BA 24 and BA 32) and proceeds over mesial premotor areas (supplementary motor area, SMA) and lateral premotor cortex (PMC) to dorsolateral prefrontal cortex (PFC) and terminates in frontopolar cortex (cf. Figs. 59 and 60, p. 100 of Sanides 1962). There is some evidence from the stepwise myelination of the human brain white matter that projections subserving the mesial frontopolar areas mature more slowly and later than all other areas in the brain (Lebel et al. 2008). Yet another methodological approach, in which cortical folding was estimated along the anterior–posterior axis, reveals a gradation in the degree of folding of primate brains and, in particular, a higher degree of folding of the human PFC, while the posterior third of human and ape brains shows comparable folding (Zilles et al. 1988, 1989; Toro et al. 2008). A similar folding gradation is also reflected during human ontogeny, as gyrification lasts up to 1 month longer in the anterior than in the posterior cerebral cortex (Armstrong et al. 1995). The prolonged maturation may be the cause of the particular increase in prefrontal folding, and possibly the same argument applies to observed differences in folding among primate brains. Among all frontal areas, only a closer inspection of the frontopolar cortex has yielded a telling difference between humans and the other five hominoids as well as macaques (for a broader perspective, cf. Rakic 2009; Zilles et al. 
1988; Rilling 2006; Semendeferi et al. 1997; Semendeferi and Damasio 2000). This region is anything but easy to define. It largely amounts to Brodmann area 10 in humans and to area 10 of Walker’s map (1940). For reasons of clarity, structural and macroanatomical characteristics of the frontal lobe are depicted in Fig. 1, where cytoarchitectonic maps proposed by Petrides and Pandya (2004, 1994) were projected onto anatomical images of Duvernoy (1999). It can be assumed that the notion of one frontopolar area, as often discussed in the macaque literature, is too simplistic for a description of the human configuration; thus, based on distinct connectional fingerprints, the frontopolar area can be subdivided into three subregions, a mesial, a lateral, and an orbital part (Kondo et al. 2003; Öngür et al. 2003; Petrides and Pandya 2007; Sujazow et al. 2010). The frontal pole was found to be considerably larger in humans, even in relative terms: it is twice as large as in any of the great apes (Semendeferi et al. 2001; Allman et al. 2002). Moreover, less densely packed supragranular layers II and III of this area indicate that during evolution of man, area 10 increased in intrinsic as well as extrinsic connectivity,¹ the latter mostly consisting in projections to other higher-order association areas (Semendeferi et al. 2001).

1 While area 10 in humans contains more than three times more neurons than in chimpanzees, bonobos, or orangutans, the ratio between cell bodies and neuropil in this area is much lower in humans than in any of the other great apes or gibbons. The more neuropil there is relative to cell bodies, the more space there is for connections.

Fig. 1 Structural and macroanatomical characteristics of the frontal lobe: lateral, orbital, and mesial view. Cytoarchitectonic maps proposed by Petrides and Pandya (1994) were projected onto anatomical images of Duvernoy (1999)

The dense intrinsic connectivity of this area is reflected by the remarkable number of spines and hence dendritic complexity, which exceeds that of any other cortical region in the human brain (Jacobs et al. 2001). This profile is indicative of an area that subserves high-dimensional integration of many different sources of information. As a further remarkable characteristic, area 10 receives input from a morphologically distinct type of neuron in the adjacent anterior cingulate cortex that is only found in humans and great apes (Allman et al. 2002). Allman and colleagues take these so-called spindle neurons to convey “the motivation to act” to area 10, a view that deserves and needs some elaboration. The anterior cingulate is found to respond quickly and strongly to committed errors (Rushworth et al. 2004; Van Veen and Carter 2006); its spindle neurons may provide adjacent area 10 with information about changing conditions so as to feed into the planning of behavioral adaptation against the background of past experiences, a view that, incidentally, is in line with the finding that frontopolar cortex is engaged in the detection of (or responds to) strategically relevant sudden contextual change (Pollmann and Manginelli 2009). These views elaborate on the traditional view that the specific
interaction between adjacent anterior cingulate and adjacent frontomedian cortex results in the motivation to act (Devinsky et al. 1995).

Together, comparative neuroanatomy provides some evidence for a particular development of area 10 during evolution of man. However, according to the mosaic hypothesis of evolution (Barton and Harvey 2000), selective, species-specific expansion is to be expected for functional modules rather than for specific cortical regions. Hence, the expansion of area 10, interpreted as an indicator of some specific cognitive development, has to be considered in relation to the areas that it is connected with (Rilling 2006). Area 10 is the only prefrontal area that is predominantly interconnected with supramodal cortices (residing in the lateral PFC, the anterior temporal cortex, and the cingulate cortex), suggesting that the coordination of the information transfer between cognitive operations realized by supramodal cortices is a key to this area’s functional evolution. If so, it is particularly informative to consider the frontal white matter, in particular connections to and from area 10. Indeed, hyperscaling of both neocortex and the frontal lobe relative to the rest of the brain has been proposed to be mainly due to frontal white matter increase, suggesting that frontal white matter is a principal component in explaining variation in brain size (Smaers et al. 2010); however, this is true across mammal evolution, i.e., it is not a unique phenomenon in human brains.² Moreover, whether prefrontal white matter volume is disproportionately enlarged in humans (Schoenemann et al. 2005) remains controversial due to technical issues (Sherwood et al. 2005). Again, it seems that scaling (here: of long fiber connections to and from the frontal lobes) may not be indicative of a unique human mental faculty.
Remarkably, no study has so far addressed a potential disproportional increase of white matter volume of cytoarchitectonically determined areas in the frontal lobes of the human brain. Instead, one can consider developmental properties of the long fiber connections to and from frontopolar cortex. The time course of the myelination of cortico-cortical axons in humans provides a measure that accounts for the white matter increase in the developing brain. While the overall brain volume does not change significantly from age 5 to 30 years, total gray matter decreases and total white matter increases linearly over the same period (Giedd et al. 1999). Regarding the timing of this process, parts of the PFC are among the cortical areas whose myelination is not completed until the end of the third decade, i.e., they mature more slowly and over a longer period than all other connections in the brain (Lebel et al. 2008; Yakovlev and Lecours 1967). Specifically, major fiber bundles supporting frontal connectivity such as the inferior and superior fronto-occipital fasciculi show early maturation compared to other structures, particularly the fronto-temporal ones such as the uncinate fasciculus and the cingulum bundle, which mature later than all other connections (Lebel et al. 2008). In keeping with the notion that ontogeny recapitulates phylogeny (Kaas 2006), it is advisable to consider which brain regions are interconnected via these two latest-maturing fiber bundles. The uncinate fasciculus (UF) connects the most anterior part of the temporal lobes, including rostral superior, inferior, and ventromedial temporal regions, with ventral, medial, and orbital parts of the frontal lobe (Ebeling and von Cramon 1992). The cingulum bundle (CB) follows the cingulate cortex, stretching from orbital and anterior cingulate areas in the frontomedian cortex as well as dorsolateral regions to medial retrosplenial cortex, hippocampus, and parahippocampal gyrus (Schmahmann et al. 2007). What is common to both UF and CB is that they are the only fiber bundles that connect orbital and mesial parts of area 10 with its projection sites. Together, both the relative volume of area 10 and the development of its connecting bundles point to a functional elaboration during evolution that may be particularly advanced in humans as compared to nonhuman primates.

2 This does not rule out that hyperscaling is at the basis of unique human abilities; it just indicates that hyperscaling may be a necessary, but cannot be a sufficient, component in explaining these abilities.

3 Function-Based Evidence for a Frontal Gradation

Regarding functional models, Tanji et al. (2007), for instance, proposed a hierarchical organization where behavioral categories or schemes in lateral PFC have to be specified in higher premotor areas, i.e., SMA and pre-SMA according to the authors, to build action sequences before they can finally result in concrete individual actions implemented by lateral premotor and primary motor cortices. This view is largely congruent with other hierarchical models of the frontal cortex such as the perception–action cycle by Fuster (1997, 2001) and the cascade model of cognitive control put forward by Koechlin et al. (2003). The latter authors suggest three nested levels of control, a term that might be translated into “constraint application” or “modulation”. The highest of these levels is called “episodic” and is exerted particularly by BA 46 in the dorsolateral PFC. Episodic modulation refers to the influence of past events that result in ongoing internal goals. On the next level, contextual modulation refers to the situation in which the stimulus occurs; contextual modulation is proposed to stem from posterior lateral PFC, including BA 9, 45, and 44. Finally, on the last level, sensory modulation is exerted by PMC (BA 6) and refers to the currently attended stimuli. Koechlin’s account may qualify as an elaboration of Fuster’s older proposal (1997) that “progressively higher areas modulate behavioral, linguistic, and cognitive actions requiring the integration of progressively more complex and temporally dispersed information”. A slightly different notion, but similar in terms of a temporal integration hierarchy following the evolutionary gradation from posterior to anterior frontal areas, has been put forward by Jordan Grafman (1995, for review). According to his model, the frontal lobes store “structured event complexes”, which evolve from single units in memory and build progressively complex event
compounds, resulting in a hierarchy on top of which we find “managerial knowledge units” that reside in frontopolar cortex. At first glance, frontal neuroanatomy suggests a posterior-to-anterior gradation in the lateral frontal lobe that largely dovetails with the notion of a neurofunctional gradation of planning and prediction. It is important to consider that a gradation of function does not imply one-way processing, i.e., strictly serial top-down processing. Rather, a reciprocal and parallel flow of information connects areas specialized for different needs in predictive and planning behavior. In the following, the first two nodes in this gradation or stream are briefly reviewed before the special characteristics of the final node, the frontopolar cortex, are considered in detail.

3.1 Premotor Cortex

The function traditionally attributed to the PMC, whether in humans or in monkeys, is the preparation and organization of sequential movements (Wise 1985) and the synthesis of skilled motor sequences (Luria 1966). Accordingly, in contrast to the posteriorly adjacent primary motor cortex, the PMC subserves not so much the final motor execution as the sensorimotor integration and the attentional focus that precede body movement and action (cf. premotor theory of attention, Rizzolatti et al. 1987). Note that this view does not imply that premotor activity necessarily results in movement execution; rather, PMC stores and maintains precompiled subroutines or “action ideas” (Fadiga et al. 2000), which play a role in perception, attention, and memory. For instance, objects are perceived as manipulable entities whose pragmatic properties are conveyed via parietal to premotor sites (Fagg and Arbib 1998). Precompiled subroutines in PMC can be conceived of as relational representations that guide an action, whereby relations are those between a present state and a goal state of the body and/or the environment. They find expression in the particular neural tuning of PMC neurons to different stages of, for instance, a reaching and grasping action, including parameters of goal, target, body part, body reference frame, temporal stage in the course of the movement, and many further factors [for overview, see Schubotz (2004)]. Note that subroutines are not “motor” by nature; rather, investigations in the monkey are more reconcilable with a “sensorimotor”, “supramodal”, or “amodal” representation format (Fadiga et al. 2000; cf. Jacobs et al. 2001). This could mean that premotor neurons code for steps or transformations between several sequential perceptual states (be they visual, auditory, tactile, or combinations of them) that can be achieved either by body movement (re-afferences) or perceptually (afferences).
Hence, instead of speaking of “precompiled subroutines”, “events” (in the seconds range) provides a more suitable and far more generic term for information stored in PMC (for elaboration of this view, cf. Schubotz 2007). While events coded by PMC neurons are limited with regard to their complexity, they have been suggested to be selected and linked by the pre-SMA and SMA to make up more complex sequences and higher-order actions (Shima and Tanji 2000). In view of a functional gradation in the frontal lobes, it is interesting to consider that pre-SMA has been suggested to constitute a transition between premotor cortex and PFC.

3.2 Lateral Prefrontal Cortex

Interestingly, while researchers have focused on the role of the lateral PMC in motor or action planning, and on the role of the PFC in mnemonic, perceptual, and attentional functions, there is abundant evidence that either structure is engaged in both mnemonic and attentional as well as action-related functions. This may not come as a surprise, as action and attention are intimately interrelated (Allport 1987; Gibson 1979; Hommel et al. 2001). So the question seems to be rather one about the relative contribution of prefrontal versus premotor areas to attention and memory that can, but do not have to, result in action. Frontal functions are often subsumed under the term “cognitive control”, meaning the ability to coordinate thoughts, memory, and actions in relation to internal goals (e.g., Koechlin et al. 2003; Badre and Wagner 2007). The somewhat problematic concept of “cognitive control” may be translated as “episodic and contextual modulation of thoughts and action”. While Koechlin et al. (2003) describe three nested levels of control in their cascade model that relate to both dorsolateral and ventrolateral PFC, Badre and Wagner (2007) specify two subfunctions as exerting cognitive control within the ventrolateral PFC: strategic memory retrieval in anterior sites (BA 47) and selection among competing memories in posterior sites (BA 45). The authors apply these functions to both semantic and episodic memory. Beyond that, they generalize this account to the notion of a functional gradation along the rostro-caudal axis reflecting a processing hierarchy en route to action (Badre 2007; Petrides 2005). The essence here is that the more posterior the area, the more it contributes to constraining the immediate action requirements or options, whereas more anterior sites are increasingly content-independent and associated with high-level goals (cf. Buckner 2003). Addressing the action planning function of the lateral PFC in a review, Tanji et al. 
(2007) present evidence for a prefrontal role in representing pursued results or goals of actions rather than movement parameters. They implicate PFC in behavioral planning that requires information processing at the conceptual level. Due to its efferent connections to higher-order motor areas, i.e., SMA and PMC, this area is taken to aid the regulation of volitional behavior. This view is based on a wealth of studies of lateral prefrontal neurons during instructed delay periods. These neurons are found in dorsolateral as well as ventrolateral PFC and are tuned to behavioral task factors such as the currently valid task or rules (Hoshi et al. 1998; White and Wise 1999; Wallis et al. 2001), the memory load (Petrides 2000), the rank order of serial movements (Averbeck et al. 2002; Ninokura et al. 2003), task-relevant categories of form (Freedman et al. 2001) or quantity (Nieder et al. 2002), and subgoaling schedules of multistep actions (Saito et al. 2005). These aspects
build an intermediate step in goal-directed thinking and acting, following motivation and initiation of action or thoughts on the one hand, and preceding their possible behavioral implementation on the other hand.

3.3 Frontopolar Cortex

Both dorsolateral PFC (BA 9/46) and ventrolateral PFC (BA 45) are connected further anteriorly to the frontopolar cortex. On the basis of its anatomical connectivity, Petrides (2005) proposed lateral area 10 to be engaged in “hyper-monitoring”, i.e., to monitor the monitoring in dorsolateral PFC, constituting a yet more “abstract level of cognitive control along the rostral–caudal axis of lateral frontal cortex that would be critical in multitasking and high-level planning. Thus, area 10 may be thought of as being the highest level in the rostral–caudal hierarchy of lateral frontal control processes [...]” (p. 790). As proposed above, the somewhat problematic conceptual connotations of terms such as “monitoring” or “cognitive control” may be mitigated by translating them into modulatory influence based on higher integrated information. But what is the nature of the functional contribution of area 10 to planning and prediction? This question is all the more important as lateral area 10 projects not only to the dorsolateral PFC (for which Petrides discusses a “monitoring” function) but also to the ventrolateral PFC, for which retrieval and selection from semantic and episodic memory have been proposed (Badre and Wagner 2007). Thus, whatever conceptual framework accounts for area 10 function, it has to include both its dorsolateral and its ventrolateral prefrontal interactions. In a seminal review on the functional contribution of area 10 to cognition, Ramnani and Owen (2004) put forward a functional description intended to subsume the range of existing perspectives on area 10 function, each of which was found too narrow to account for findings from other paradigms.
These comprise introspective processing of internal mental states and events, retrieval from episodic memory and prospective memory, cognitive branching (i.e., the ability to hold in mind goals while exploring and processing secondary goals in planning and reasoning), and relational integration (i.e., the simultaneous consideration of multiple relations between objects or thoughts). In this list, the authors strikingly³ do not consider a further highly relevant functional application of area 10 and its projection sites: mentalizing or “theory of mind”, the capacity of having a sense of one’s own and another’s mind. A typical test for this function is the concurrent representation of one’s own mental states and those of another (fictive) person. Together with the so-called temporo-parietal junction, area 10 is the structure that is most frequently observed in this and other social judgment tasks [for recent meta-analysis, see Van Overwalle (2009)]. As often in functional neuroanatomy, the whole trick is to find a functional description that accounts for a limited set of different applications, just as a Swiss penknife is for sawing, cutting, drilling, opening, and so on. Ramnani and Owen propose that area 10 “integrates the outcomes of two or more separate cognitive operations in the pursuit of a higher behavioral goal” (p. 184), a formula that could account for the ability of mentalizing as well.

3 While Ramnani and Owen do not explicitly present their review as addressing only a frontopolar subregion, the functional concepts they consider clearly focus on the lateral area 10 while neglecting its mesial and orbital parts.

Among the functions listed under this formula, the issue of episodic memory (retrieval) has undergone some essential elaboration since Ramnani and Owen’s paper. This issue is of particular interest for the current purpose as it is directly related to prediction and planning. That is, episodic memory is constructive by nature and serves not only memory retrieval but likewise the creation of mental models of future events or “mental time travel” (Schacter and Addis 2007; Schacter et al. 2007; Suddendorf and Corballis 2007). The view that frontal sites are for both past and future thinking is not brand new. For instance, Ingvar (1985) already proposed that the frontal cortex retains “memories of the future” that form the basis for anticipation and planning in both action and cognition. Recent neuroimaging findings corroborate the view that episodic recall and thinking about future events draw on the same network (Okuda et al. 2003; Addis et al. 2007; Szpunar et al. 2007). Importantly, mental time travel and theory of mind abilities have both been considered indications of the more general competence of frame-of-reference awareness, especially the simultaneous representation of contradicting frames of reference (Suddendorf and Corballis 2007; Grant et al. 2004; Guajardo et al. 2009), which in turn comes close to an alternative formula to that of Ramnani and Owen.
The frames-of-reference account is also alluded to by related concepts such as “self-projection” (Buckner and Carroll 2007) or “scene building” (Hassabis and Maguire 2007). Although the notions of “self-projection”, “scene building”, and “frame-of-reference awareness” may be perfect to describe specific behavioral phenomena, they rather hinder a unified viewpoint on frontopolar function. All of these accounts of frontopolar function seem to have in common that several (and, if applied concurrently, conflicting) realities or propositions have to be handled at the same time: it seems to be about “here X, there Y”, “now X, once (or then) Y”, “me thinking X, you thinking Y”, “if A then X, if B then Y”, and so on. This description is generic to the extent that, when applied to the domain of planning and prediction, area 10’s contribution is related to relational representations between different components and levels of the plan or prediction.

4 Phylogenetic Elaboration of Frontopolar Cortex: What for?

Having outlined the functional and structural profile of frontopolar cortex, we can now formulate a perspective on the way in which this region may feed into the lateral frontal cascade when we engage in prediction and planning. Roughly, it seems
that area 10 is a “Janus-faced” area that, on the one hand, is part of a mesial network comprising posterior cingulate, anterior temporal, and hippocampal areas, and, on the other hand, is part of a lateral network that includes the lateral frontal cortices and their parietal and temporal projection sites. Hence, a working hypothesis could hold that area 10 provides long-term memory-based constraints during situation-adapted planning and prediction. Such constraints could entail a generic hierarchical structure of a complex action, but also combined weightings by long-term knowledge about episodic as well as social experiences. This view entails a tight communication between the mesial and the lateral area 10. Considering backprojections, lateral PFC conveys to area 10 which information is currently (task-) relevant and hence constrains retrieval and integration of retrieved information from long-term memory. Strikingly, this hypothesis has not yet been considered in the literature. If it is applied to the issue of evolutionary progress in planning and prediction, it is the functional contribution of area 10 and its phylogenetic and ontogenetic development and elaboration that is moved into the focus of interest. When considering connections to and from frontopolar cortex, one may say that the mesial network of area 10 is fully operative only after its major connecting bundles, UF and CB, have matured. In contrast, for the function of the lateral network of area 10, the earlier maturation of the extreme capsule plays a crucial role (Schmahmann et al. 2007). Thus, it seems that while lateral prefrontal functions are in place at some time during adolescence, their top-down modulation by long-term memory through area 10 is fully established only with a considerable temporal delay.
In line with the late maturation of underlying fiber bundles, functions such as episodic memory or mental time travel emerge relatively late during child development, at age 4–5 (Atance 2008), but there is evidence that these functions are not fully developed until early adulthood (Atance and O’Neill 2005; Suddendorf and Busby 2005; Atance and Meltzoff 2006). A function that evolves at the same time, and is correlated, independently of age, with mental time travel within the individual, is the mentalizing capability (Bischof-Köhler 2000). Like a phylogenetic reflection of this ontogenetic progress, the notion that nonhuman animals are capable of episodic memory is highly controversial (Dere et al. 2006; Hampton and Schwartz 2004; Schwartz and Evans 2001; Raby et al. 2007). The same is true for mentalizing abilities (Penn and Povinelli 2007; Call and Tomasello 2008) and mental time travel (Suddendorf et al. 2009). Focusing here on the latter, the so-called Bischof-Köhler hypothesis holds that nonhuman animals cannot anticipate future need states they do not directly experience (Bischof-Köhler 1985). This is what Suddendorf and Corballis (2007) define as the capacity of “mental time travel”: (projecting the self into the future, as measured by) the ability to act in the present in anticipation of a need or state that is not currently experienced. As a compromise, Bar and colleagues proposed that foresight may be a gradual continuum that is present in animals, though at lower complexity (Bar 2007a). Nonhuman animals make predictions that are based on associations, but they lack other mechanisms required for more elaborated prediction; alternatively, simple associations may suffice, when accumulated to a certain amount, to generate complex prediction (Bar 2007b).

However, others have raised a fundamental problem in this discussion, implying that the issue of human uniqueness with regard to the ability of mental time travel is not empirically addressable, either because it is only introspectively accessible or because it is reportable only by narration, and nonhuman animals lack linguistic capacities. As a speculation that seems justified against the background of the data discussed so far, the relevant step toward human planning and prediction may result from the development and enhancement of the communication between mesial frontopolar cortex, as part of a network underlying long-term memory, and lateral frontopolar cortex, as part of a network underlying rule-driven, goal-directed acting and thinking. Accordingly, human planning and prediction behavior would be endowed with generalized experiences regarding, e.g., oneself and social embeddings or generic event and action structures. Moreover, this view would suggest that it is specific to humans that their decisions and behaviors can adhere to plans and predictions that operate in the long run. Thus, the frontopolar development in humans may optimize the usage of generalized experience to interpret the specific patterns of current events, and surface in an enhanced capacity to seize an outstanding chance or recognize a rare risk, faculties that Steven Wise recently proposed to be conveyed by human PFC in general (Wise 2008). Going back to the cerebral basis of the mental and behavioral capacities that were invoked, it should be stressed that area 10 exists not only in human brains but also in those of nonhuman mammals. However, a major problem here is that to date there are no studies addressing frontopolar cortex function in nonhuman primates, with one recent exception (Tsujimoto et al. 2010).⁴
The simple truth that area 10 is not new in humans implies that there must be first traces of its functional impact in macaques and lesser mammals as well, even though its functional impact in human cognition may have undergone a dramatic change and possibly may have resulted in entirely new capabilities and behaviors. These changes may be simply effected by combinatorial explosion (more and more of the same) or by changes on a level that has not been considered in this chapter, i.e., differences in gene expression, new molecular pathways, and novel cellular interactions that are suggested to drive the enlargement and species-specific elaboration of the cortex and resulting evolutionary advances (cf. Rakic 2009). It is not before these findings are fully appreciated and integrated that one approaches a more appropriate perspective on area 10’s impact on human cognition and its putative uniqueness.

4 Tsujimoto and colleagues provide evidence in favor of a frontal pole role in monitoring or evaluating self-generated decisions. However, one single study cannot provide a solid basis of, but only a first building block for, a functional account for area 10 in macaques.

5 Frontal Engagement in Prediction: Some Experimental Endeavors

Turning from the top of the functional gradation within the frontal lobes to its base, the line of thought unrolled so far may convey the impression that abstractness, as it is reflected in the integration over many events and concrete instances, is a property of representations in the frontopolar cortex, and declines stepwise to the level of concreteness as one goes further back in the frontal lobes and reaches the PMC. However, the picture that emerges from both human and animal research is not that simple. In humans, there is evidence that premotor areas underlie some abstract cognitive functions that clearly exceed motor planning and preparation (Schubotz and von Cramon 2003; Schubotz 2007). These cognitive functions are characterized by the need for prediction in the absence of any real or imagined motor output. Strikingly, it has been shown that in the macaque, too, premotor neurons can code for highly abstract task levels (Wallis and Miller 2003; Muhammad et al. 2006; Gail et al. 2009). Yet it remains an open issue to what extent and which abstract cognitive functions are realized by the PMC of macaques or other nonhuman primates, not least because developing experimental setups that are interesting enough for these animals to engage in highly abstract tasks is a challenging, often impossible endeavor, even when based on reward protocols.

We conducted a series of studies in human subjects in order to further our understanding of cognitive planning and prediction and its neural basis in the frontal lobes. More specifically, the aim of these studies was to specify the characteristic functional profiles exhibited by frontopolar, prefrontal, and premotor areas when humans engage in prediction and planning. In the first study (Golde et al. 2010), we used functional magnetic resonance imaging (fMRI) to address relational reasoning in healthy young subjects.
We sought to test two hypotheses, not mutually exclusive, derived from the literature partially outlined above. The first held that PMC subserves the implementation of simple and single rules, whereas the frontal polar cortex subserves the integration of multiple and complex relations (cf. Christoff et al. 2001; Kroger et al. 2002). According to the second hypothesis, PMC should be sensitive to the relationship between objects and the way they are manipulated, whereas frontal polar cortex should be either preferentially engaged for relations between abstract figurative material or domain-independent. In our study, we employed a modified version of the Raven Progressive Matrices (Raven 1938). These matrices consist of a set of figures arranged in rows and columns. In the upper part of each stimulus display in our version, five individual stimuli (graphical images or photographs) and a wildcard, always in the lower right position, formed a 3 × 2 matrix. Below the matrix, four slightly smaller stimuli were presented as answer alternatives. The subjects’ task was to find the graphical image or photograph that would complete the matrix correctly. Matrices differed with regard to the stimulus material and to the number and type of rules by which they were governed. A balanced 2 × 2 × 2 factorial design was implemented


R.I. Schubotz

Fig. 2 Examples of stimuli presented in Experiment 1. The first row shows matrices governed by simple rules; the second row those governed by complex rules (factor Rule Complexity). The first and the third columns depict matrices governed by one rule; the second and the fourth those governed by two combined rules (factor Rule Number). Finally, the factor Domain was implemented by abstract figures (left) and action photos (right)

including the factors Domain (Action Photos and Abstract Figures), Rule Number (1 and 2), and Rule Complexity (low and high). Both Rule factors were used to test the first hypothesis, and the Domain factor to test the second. Matrices were constructed by combining these three factors (Fig. 2). Simple and complex rules were implemented by using “quantitative pairwise progression” and “distribution of three values”, respectively (Carpenter et al. 1990). The former, simpler rule consists in the repeated application of one single transformation command within matrix rows (e.g., “add one more item”) and entails a strict sequential progression from the right to the left entry of the matrix (or vice versa). In contrast, the latter, more complex rule amounts to the application of three transformation commands that differ within matrix rows (horizontally) but amount to the same set between matrix rows (vertically).

The findings corroborated the first hypothesis but not the second. PMC was found to subserve the implementation of simple and single rules, whereas the frontal polar cortex subserved the integration of multiple and complex relations (cf. Christoff et al. 2001; Kroger et al. 2002). In contrast, we found no effect of the stimulus material in either PMC or frontal polar cortex, but only in the lateral PFC, which was more engaged for action than for abstract matrices (Fig. 3). Note that solving the Raven matrices drew on extended networks in the brain including premotor, prefrontal, and frontopolar areas. That is, we found that subregions in the frontal lobe are preferentially engaged for certain types of matrices, but generally they all make significant contributions to the task.
The findings suggest that the supposed posterior-to-anterior gradation in the frontal lobes is driven not so much by the level of concreteness or abstraction of a task as by the complexity of the underlying computations, more specifically by the requirement to integrate multiple relations in parallel. While the PMC showed enhanced activity when first-order relations were to be generated (here: relations between photographs or figures), frontopolar activity was enhanced when second-order relations were generated (i.e., relations of relations).
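To make the design of Experiment 1 concrete, the following minimal Python sketch enumerates the eight cells of the 2 × 2 × 2 factorial design from the factor levels named in the text. The variable names are illustrative assumptions and are not taken from the original study.

```python
from itertools import product

# Factor levels as named in the text; the variable names are hypothetical.
DOMAINS = ("Action Photos", "Abstract Figures")
RULE_NUMBERS = (1, 2)
RULE_COMPLEXITIES = ("low", "high")

# Crossing all three two-level factors yields 2 x 2 x 2 = 8 matrix types.
conditions = list(product(DOMAINS, RULE_NUMBERS, RULE_COMPLEXITIES))

for domain, rule_number, complexity in conditions:
    print(f"{domain:16s} | rules: {rule_number} | complexity: {complexity}")
```

Each of the eight tuples corresponds to one type of matrix; in a balanced design, every type is presented equally often.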


Fig. 3 Main results of Experiment 1. Complex rules were found to enhance activation at premotor sites (blue zone), whereas action stimuli led to higher lateral prefrontal (orange zone) activation as compared to abstract stimuli. When two rules had to be integrated, activation increased in the entire lateral frontal lobe including lateral area 10 (green zone)

In a second fMRI study (Kühn et al. in preparation), we sought to build on this finding and tested the hypothesis of a posterior-to-anterior gradation for hierarchically more complex plans. Subjects performed a perceptual sequence learning task. Learning was assessed by decreasing error rates in selecting the correct one of two concurrently presented items. Note that the selection by button press did not amount to an ordered sequence, ruling out the confound of motor learning. Sequences comprised 16 digits and were structured hierarchically in a 2 by 2 by 2 by 2 fashion. Thus, two digits formed a first-order chunk, two first-order chunks formed a second-order chunk, and so on. Structure was imposed by repetitions, inversions, and transpositions, resulting in sequences such as 61-61-16-16-47-47-74-74. In order to disentangle the effects of first-order, second-order, and third-order chunking, a scrambling procedure was implemented. Subjects learned a sequence in the first block of the experiment; in the second block, this sequence was presented in scrambled order such that subjects were required to learn the new structure with regard to the second order or to the second and third orders of chunking. By contrasting blocks, neural correlates of first-, second-, and third-order chunking could be isolated. The procedure of learning new and scrambled sequences was iterated several times during the fMRI session. On the basis of earlier findings on artificial grammar learning (Friederici et al. 2006; Bahlmann et al. 2008), we expected the frontal opercular cortex at the level of the most posterior PMC to be engaged for low-level chunking, and, based on macaque research, SMA to be engaged for chunking of chunks (Shima and Tanji 2000).
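The hierarchical structure of such a sequence can be illustrated with a short Python sketch. It rebuilds the same 16 digits as the example sequence in the text from a single first-order chunk. How the operations are defined here is an assumption made for illustration: repetition duplicates a chunk, inversion reverses the digit order, and transposition maps each digit onto another digit. The function names and the digit mapping are hypothetical and do not describe the stimulus-generation procedure actually used in the study.

```python
def invert(chunk: str) -> str:
    """Inversion (assumed): reverse the order of the digits in a chunk."""
    return chunk[::-1]


def transpose(chunk: str, mapping: dict[str, str]) -> str:
    """Transposition (assumed): replace every digit via a fixed mapping."""
    return "".join(mapping[digit] for digit in chunk)


def build_sequence(first_order: str, mapping: dict[str, str]) -> str:
    """Build a 2 x 2 x 2 x 2 sequence of 16 digits from one two-digit chunk."""
    second_order = first_order + first_order            # repetition
    third_order = second_order + invert(second_order)   # inversion
    return third_order + transpose(third_order, mapping)  # transposition


sequence = build_sequence("61", {"6": "4", "1": "7"})
pairs = [sequence[i:i + 2] for i in range(0, len(sequence), 2)]
print("-".join(pairs))  # 61-61-16-16-47-47-74-74
```

Each level of the hierarchy doubles the length of the previous one, so a two-digit chunk yields 4, 8, and finally 16 digits.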


Fig. 4 In Experiment 2, different levels of integrating (chunking) were investigated using fMRI. Learning first-order chunks (e.g., 6-1) was found to activate the left precentral frontal operculum (posterior frontal lobe), second-order chunks (e.g., 6-1-6-1) the dorsolateral prefrontal cortex, and third-order chunks (e.g., 6-1-6-1-1-6-1-6) the mesial Brodmann area 9 (BA 9m), pregenual anterior cingulate cortex (ACC), and supplementary motor area (SMA), i.e., posterior as well as anterior frontal sites

An open question was whether more anterior prefrontal sites would come into play with increasing chunking levels. Results corroborated our hypotheses: the frontal operculum was active for first-order chunks and SMA for third-order chunks. We found no significant correlates for second-order chunks. Finally, both pregenual anterior cingulate cortex and mesial BA 9 were additionally recruited for third-order chunking (Fig. 4).

Together, the first two studies point to a functional differentiation within the frontal lobe, with low-level integration of relations at the premotor level and high-level integration of relations at the frontopolar level. Of course, they cannot prove a functional gradation; they are merely in line with this notion.

In two further studies, we addressed slightly different issues that also derive from the idea of a functional gradation in the frontal lobes. In a patient study (Haarmann et al. in preparation), we sought to pinpoint the functional relevance of PMC for prediction. As mentioned above, premotor areas have been shown to be engaged when subjects perform a prediction task. These predictions refer to perceptions that may occur within a time window of several seconds. Functional MRI has demonstrated premotor activation during such tasks but cannot prove a causal relation between function and activity. In contrast, circumscribed lesions can indicate that a function breaks down when a functional network is disrupted at the lesion site. In the literature, premotor functions are often considered to be confined to the application of plans that are generated elsewhere in the brain, typically invoking prefrontal loops. This view alludes to a gradation in the frontal lobes from the generation or initiation of a plan to its organization and finally its execution. Accordingly, PMC may represent


the sequential structure of a plan but may not be involved in active prediction, i.e., the generation of explicit expectations itself. If so, PMC should be equally important for sequencing tasks directed toward the future and for those directed toward the past. This distinction relates to the dichotomy of prospective and retrospective memories, which are considered two sides of the same coin and not independent constructs (Burgess and Shallice 1997). Patients with a lesion in the ventrolateral PMC were compared to those with a lesion in the frontopolar cortex while performing two tasks, a serial prediction task (SPT, Schubotz 1999) and an n-back task (Gevins and Cutillo 1993). In the SPT, subjects are presented with a structured sequence of stimuli and try to predict upcoming stimuli as announced by the first stimuli in a trial. Subsequently, they have to indicate by button press whether the sequence ended as predicted or not (which is the case in 50% of all trials). In the n-back task, unstructured sequences are presented. Subjects are asked to indicate via button press whether the current stimulus is identical to the stimulus presented one (1-back), two (2-back), or n trials before. In this study, digits from 0 to 9 were used as stimulus material. In order to make the two tasks more comparable, patients had to respond to each stimulus, indicating whether it was the predicted stimulus in the case of the SPT or whether it matched the stimulus presented n trials before in the case of the n-back task. Motor deficits were individually balanced on the basis of a motor control condition. Performance was assessed by accuracy and by two classes of reaction times, those of hits (for stimuli that do not fulfill the task condition) and those of correct rejections (for sequence violations and stimulus matches). Correct rejections and accuracy revealed no interactions between patient group and task.
In contrast, for reaction times of hits, we found interactions for GROUP × LESION (F(1,18) = 5.97; p = 0.025) and GROUP × LESION × TASK (F(1,18) = 10.8; p = 0.004) (Fig. 5). Moreover, there was a significant main effect for the between-group factor LESION (F(1,18) = 5.26; p = 0.034), reflecting longer reaction times in the PMC than in the frontopolar group (405 vs. 295 ms). Importantly, the two-way interaction showed that PMC patients were impaired compared to their healthy controls (534 vs. 276 ms), whereas frontopolar patients were not (322 vs. 269 ms). Moreover, the three-way interaction showed that this impairment was solely caused by the SPT (738 ms in PMC patients vs. 334 ms in PMC controls), whereas performance in the n-back task was normal (331 vs. 281 ms).

These findings indicate that the ventrolateral PMC is causally relevant for prediction in the short term. In contrast, holding a sequence in working memory, as required by the n-back task, was unimpaired. Regarding the gradation architecture of frontal lobe function, these data imply that forward behaviors, i.e., planning and prediction, are not exclusive domains of the prefrontal or frontopolar cortex but also emerge at the most posterior sites of the frontal lobes in PMC; the posterior-to-anterior differentiation may instead be related to the amount of integration required in planning and prediction.

In the fourth experiment (Bubic et al. 2009), fMRI was used to tap the difference between pattern-matching-based and rule-based prediction. This aim was motivated


Fig. 5 Lesions in the lateral premotor cortex (top left), but not in the frontopolar cortex (top right), result in severe and specific deficits in short-term prediction. Experiment 3 revealed that the ability to detect deviants in a predictable digit sequence (serial prediction task, SPT) was impaired in patients with premotor lesions (blue bars), whereas performance was normal in a control task of comparable difficulty in healthy subjects (n-back). Performance was unimpaired in patients with a frontopolar lesion (green bars). Gray bars depict reaction times of the corresponding healthy control subjects

by the notion that abstractness of representation, i.e., level of integration over cases, increases when going from posterior to polar frontal sites. To this end, a token SPT and a type SPT were compared. The terms “token” and “type” are used to distinguish between two kinds of mappings between stimuli, one based on exact appearance (stimulus A => stimulus A) and the other on pattern classification (stimulus A => stimulus A’). In the token SPT, a structured sequence of visual stimuli is presented and repeated several times. As outlined above, subjects have to indicate in a forced-choice response whether the sequence ended as expected or not (probability 1:1). In the type SPT, task and sequence structure remain the same, but subjects cannot rely on one-to-one pattern matching, because the structure of the sequence is defined on the basis of stimulus classes (Fig. 6). Thus, one may expect a “circle in a square” at the second position of a sequence without knowing how big the circle will be, where it will appear in the square, or whether the square will be equilateral or not. Prediction of type sequences thus requires detecting the classification rule and applying it for prediction. Both type and token SPT were found to engage the premotor–parietal network typically found for prediction (e.g., Schubotz 2004). In contrast to the token SPT, however, the type SPT yielded an additional recruitment of the directly adjacent lateral PFC. This activation co-occurred with bilateral activation in the fusiform gyrus, a structure involved in higher-order object perception; this pattern may reflect


Fig. 6 Performance in a serial prediction task (SPT) is normally reflected by activation in a premotor–parietal network. Subjects have to indicate at the end of each trial whether the sequence consisted of three repetitions of a triplet of stimuli (1-2-3, 1-2-3, 1-2-3, as shown in the first and in the third row) or contained a sequential violation (1-2-3, 1-2-3, 2-1-3, as shown in the second and in the fourth row). In Experiment 4, subjects performed either in a standard SPT (token-SPT, blue figures depicted in the upper two rows), or in a modified version, the type-SPT (green figures depicted in the lower two rows) where stimuli had to be classified. For example, the first stimulus in a sequence of three had to be a little circle encompassed by a bigger square, but the appearance could vary between repetitions within a trial. This requirement to integrate over an open class of instances called for additional engagement of lateral prefrontal sites

prefrontal sites exerting a top-down modulation of object perception, a modulation that conveys the currently valid classification rule. The findings demonstrate that rule-based integration over an open class of instances calls for prefrontal support in a task that otherwise gets along with premotor loops.

In a fifth fMRI experiment (Abraham et al. 2008), the view that area 10 subserves relational reasoning was further considered. One of the functions of mesial area 10 and its associated network that has received most attention is mentalizing, or the capacity to deal with the mental states of other persons. How is this capacity related to the notion of relational reasoning? On an abstract level of description one may say that, e.g., understanding that someone believes that cows are mammals means realizing a relation between this person and the proposition cows are mammals. Likewise, understanding that someone wants a cup of tea is to see that he or she has a specific relation to a cup of tea. These mental relations are expressed in weighted probabilities to behave and decide in a specific manner. In our study, we compared reasoning about mental relations between persons with reasoning about spatial relations between persons or objects, both at two levels of


Fig. 7 In Experiment 5, thinking about mental relations (conditions M3 and M2), but not thinking about spatial relations between objects (conditions O2 and O3), engaged mesial areas 10 and 9. Interestingly, thinking about spatial relations between persons also triggered this activity, but only when concerning two, not three persons. The latter finding may indicate that thinking of two persons triggers spontaneous speculations about their social relationship as well, even if not required in the task. The bar chart depicts the signal change (%sc) in the slightly lateralized activation on the left-hand side. Boxes contain examples of stimulus material used in conditions M3, P3, and O3

complexity (second- and third-order relations). Mesial area 10 was found to be engaged in mental relational reasoning, and activation increased with relational complexity (Fig. 7). Interestingly, reasoning about spatial relations between two persons, but not between three persons or between objects, caused similar activations, indicating that considering spatial relations between two persons may trigger spontaneous considerations about their mental relation as well. The results corroborate the role of area 10 in relational reasoning, and more specifically that of mesial area 10 in relations that refer to social knowledge.

In summary, the series of experiments shows that prefrontal as well as premotor areas are engaged in prediction and planning, or more generally, in relational thinking that can be directed toward future decisions and behavior. Abstract material is not exclusively processed in frontopolar or prefrontal sites but also calls for premotor engagement. This may not be too surprising when one considers that even symbols have a physical body, and that the cortex can adapt to almost every stimulus as long as predictable structures can be identified (cf. Doya 1999). Yet the findings are in line with the notion of a functional trend in the frontal lobes: they point to a tendency toward additional prefrontal engagement for integration of rules that drive relational reasoning, integration of multiple steps


in hierarchically complex plans, and integration of token-relations to type-relations in generalized prediction. Finally, thinking about relations that bear a personal impact or social meaning draws on the mesial network that area 10 is embedded in, whereas other relations engage the lateral system of area 10.
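The logic of the two tasks compared in the patient study (Experiment 3) can be sketched in a few lines of Python. The sketch assumes the trial structure given for the SPT in Fig. 6 (three repetitions of a triplet, with a violation implemented as a swap of the first two items in the last repetition) and the standard definition of the n-back task. The function names are hypothetical, and timing, stimulus presentation, and response collection are omitted.

```python
def spt_trial(triplet: list[int], violated: bool) -> list[int]:
    """Return a nine-item SPT sequence: three repetitions of a triplet.

    In a violated trial, the first two items of the last repetition are
    swapped (e.g., 1-2-3, 1-2-3, 2-1-3); this swap rule is an assumption.
    """
    last = [triplet[1], triplet[0], triplet[2]] if violated else list(triplet)
    return list(triplet) * 2 + last


def ends_as_predicted(sequence: list[int]) -> bool:
    """Check whether the final triplet matches the one announced at the start."""
    return sequence[-3:] == sequence[:3]


def n_back_matches(stimuli: list[int], n: int) -> list[bool]:
    """For each stimulus, report whether it equals the one shown n trials before."""
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


print(ends_as_predicted(spt_trial([1, 2, 3], violated=False)))  # True
print(ends_as_predicted(spt_trial([1, 2, 3], violated=True)))   # False
print(n_back_matches([3, 5, 3, 3, 5, 3], n=2))
# [False, False, True, False, False, True]
```

The SPT check looks forward, comparing the end of a sequence with the structure announced at its beginning, whereas the n-back check looks back n positions in an unstructured stream.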

6 Concluding Remarks

This chapter has focused on the frontal role in planning and prediction, but this is not meant to claim that planning and prediction are at the core of a functional description of the frontal lobes. Rather, these are applications that derive from the generic mechanism describing frontal function, which the author supposes to be relational reasoning. Relational reasoning entails dealing with transient or persisting spatial, causal, logical, temporal, social, or other relations between, e.g., objects, persons, places, propositions, or events, the latter two of which provide examples of relations of relations (i.e., higher-order relations); it is applicable to planning and prediction and all other kinds of behavior and thinking that rely on and build up relations. This description is partially congruent with the concept of frontal function put forward by Robin and Holyoak (1995), according to which the PFC is involved in the acquisition and use of explicit relational knowledge in the service of a goal. However, it seems that this notion does not exclusively apply to the PFC, as the PMC can be easily included in a slightly more generalized functional description, as proposed above. Moreover, knowledge or representations may not need to be explicit, especially when orbitofrontal cortex is included in this description (Volz and von Cramon 2009). Finally, the notion of goal-directedness may sometimes be misleading because thinking, too, can have a goal (e.g., to come to a decision, to clarify the coherence of one’s own thoughts and beliefs, to generate a new idea, etc.), but the term “goal” is typically used to refer to concrete overt behavior.

Acknowledgments I want to cordially thank Marc Tittgemeyer for helpful and inspiring input on area 10 structure and function, D. Yves von Cramon for sharing his exceptional knowledge of and fascination with the brain, Katja Kornysheva and Kirsten Volz for prudent comments on ideas and the manuscript, Maria Golde, Anne Kühn, Andreja Bubic, Anna Abraham, and Felix Haarmann for fruitful experimental collaboration, and Andrea Gast-Sandmann for her great spontaneous and elaborate support when creating Fig. 1.

References

Abraham A, Werning M, Rakoczy H, von Cramon DY, Schubotz RI (2008) Minds, persons, and space: an fMRI investigation into the relational complexity of higher-order intentionality. Conscious Cogn 17(2):438–450
Addis DR, Wong AT, Schacter DL (2007) Remembering the past and imagining the future: common and distinct neural substrates during event construction and elaboration. Neuropsychologia 45(7):1363–1377


Allman J, Hakeem A, Watson K (2002) Two phylogenetic specializations in the human brain. Neuroscientist 8:335–346
Allport A (1987) Selection for action: some behavioral and neurophysiological considerations of attention and action. In: Heuer H, Sanders AF (eds) Perspectives on perception and action. Lawrence Erlbaum Associates, Hillsdale, NJ, pp 395–419
Armstrong E, Schleicher A, Omran H, Curtis M, Zilles K (1995) The ontogeny of human gyrification. Cereb Cortex 5:56–63
Atance CM (2008) Future thinking in young children. Curr Dir Psychol Sci 17:295–298
Atance CM, Meltzoff AN (2006) Preschoolers’ current desires warp their choices for the future. Psychol Sci 17(7):583–587
Atance CM, O’Neill DK (2005) The emergence of episodic future thinking in humans. Learn Motiv 36(2):126–144
Averbeck BB, Chafee MV, Crowe DA, Georgopoulos AP (2002) Parallel processing of serial movements in prefrontal cortex. Proc Natl Acad Sci USA 99(20):13172–13177
Badre D (2007) Ventrolateral prefrontal cortex and controlling memory to inform action. In: Bunge SA, Wallis JD (eds) The neuroscience of rule-guided behavior. Oxford University Press, New York, pp 365–389
Badre D, Wagner AD (2007) Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia 45(13):2883–2901
Bahlmann J, Schubotz RI, Friederici AD (2008) Hierarchical artificial grammar processing engages Broca’s area. Neuroimage 42:525–534
Bar M (2007a) The continuum of “looking forward”, and paradoxical requirements from memory. Behav Brain Sci 30(3):315–316
Bar M (2007b) The proactive brain: using analogies and associations to generate predictions. Trends Cogn Sci 11(7):280–289
Barton RA, Harvey PH (2000) Mosaic evolution of brain structure in mammals. Nature 405(6790):1055–1058
Bischof-Köhler D (1985) Zur Phylogenese menschlicher Motivation [On the phylogeny of human motivation]. In: Eckensberger LH, Lantermann ED (eds) Emotion und Reflexivität. Urban & Schwarzenberg, Baltimore, MD, pp 3–47
Bischof-Köhler D (2000) Kinder auf Zeitreise. Theory of Mind, Zeitverständnis und Handlungsorganisation. Verlag Hans Huber, Bern
Bubic A, von Cramon DY, Schubotz RI (2009) Motor foundations of higher cognition: similarities and differences in processing regular and violated perceptual sequences of different specificity. Eur J Neurosci 30(12):2407–2414
Buckner RL (2003) Functional-anatomic correlates of control processes in memory. J Neurosci 23(10):3999–4004
Buckner RL, Carroll DC (2007) Self-projection and the brain. Trends Cogn Sci 11(2):49–57
Burgess P, Shallice T (1997) The relationship between prospective memory and retrospective memory: neuropsychological evidence. In: Gathercole S, Conway M (eds) Cognitive models of memory. Psychology, London, UK
Call J, Tomasello M (2008) Does the chimpanzee have a theory of mind? 30 years later. Trends Cogn Sci 12(5):187–192
Carpenter PA, Just MA, Shell P (1990) What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices Test. Psychol Rev 97:404–431
Christoff K, Prabhakaran V, Dorfman J, Zhao Z, Kroger JK, Holyoak KJ, Gabrieli JD (2001) Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage 14:1136–1149
Dere E, Kart-Teke E, Huston JP, Silva MAD (2006) The case for episodic memory in animals. Neurosci Biobehav Rev 30(8):1206–1224
Devinsky O, Morrell MJ, Vogt BA (1995) Contributions of anterior cingulate cortex to behaviour. Brain 118(1):279–306


Doya K (1999) What are the computations of the cerebellum, the basal ganglia and the cerebral cortex? Neural Netw 12(7–8):961–974
Duvernoy HM (1999) The human brain. Surface, blood supply, and three-dimensional sectional anatomy, 2nd edn. Springer, Wien
Ebeling U, von Cramon D (1992) Topography of the uncinate fascicle and adjacent temporal fiber tracts. Acta Neurochir 115:143–148
Fadiga L, Fogassi L, Gallese V, Rizzolatti G (2000) Visuomotor neurons: ambiguity of the discharge or ‘motor’ perception? Int J Psychophysiol 35(2–3):165–177
Fagg A, Arbib M (1998) Modeling parietal–premotor interactions in primate control of grasping. Neural Netw 11:1277–1303
Freedman DJ, Riesenhuber M, Poggio T, Miller EK (2001) Categorical representation of visual stimuli in the primate prefrontal cortex. Science 291(5502):312–316
Friederici AD, Bahlmann J, Heim S, Schubotz RI, Anwander A (2006) The brain differentiates human and non-human grammars: functional localization and structural connectivity. Proc Natl Acad Sci USA 103:2458–2463
Fuster JM (1997) The prefrontal cortex: anatomy, physiology, and neuropsychology of the frontal lobe. Lippincott-Raven, Philadelphia
Fuster JM (2001) The prefrontal cortex – an update: time is of the essence. Neuron 30(2):319–333
Gail A, Klaes C, Westendorff S (2009) Implementation of spatial transformation rules for goal-directed reaching via gain modulation in monkey parietal and premotor cortex. J Neurosci 29(30):9490–9499
Gevins A, Cutillo B (1993) Spatiotemporal dynamics of component processes in human working memory. Electroencephalogr Clin Neurophysiol 87:128–143
Gibson JJ (1979) The ecological approach to visual perception. Houghton Mifflin Company, Boston, MA
Giedd JN, Blumenthal J, Jeffries NO, Castellanos FX, Liu H, Zijdenbos A, Rapoport JL (1999) Brain development during childhood and adolescence: a longitudinal MRI study. Nat Neurosci 2:861–863
Golde M, von Cramon DY, Schubotz RI (2010) Differential role of anterior prefrontal and premotor cortex in the processing of relational information. Neuroimage 49(3):2890–2900
Grafman J (1995) Similarities and distinctions among current models of prefrontal cortical functions. Ann NY Acad Sci 769:337–368
Grant CM, Riggs KJ, Boucher J (2004) Counterfactual and mental state reasoning in children with autism. J Autism Dev Disord 34(2):177–188
Guajardo NR, Parker J, Turley-Ames K (2009) Associations among false belief understanding, counterfactual reasoning, and executive function. Br J Dev Psychol 27(3):681–702
Haarmann F, von Cramon DY, Seiffert S, Schubotz RI (in preparation) Premotor lesions affect serial prediction but not n-back
Hampton RR, Schwartz BL (2004) Episodic memory in nonhumans: what, and where, is when? Curr Opin Neurobiol 14:192–197
Hassabis D, Maguire EA (2007) Deconstructing episodic memory with construction. Trends Cogn Sci 11(7):299–306
Hommel B, Müsseler J, Aschersleben G, Prinz W (2001) The Theory of Event Coding (TEC): a framework for perception and action planning. Behav Brain Sci 24:849–937
Hoshi E, Shima K, Tanji J (1998) Task-dependent selectivity of movement-related neuronal activity in the primate prefrontal cortex. J Neurophysiol 80(6):3392–3397
Ingvar DH (1985) “Memory of the future”: an essay on the temporal organization of conscious awareness. Hum Neurobiol 4(3):127–136
Jacobs B, Schall M, Prather M, Kapler E, Driscoll L, Baca S, Jacobs J, Ford K, Wainwright M, Treml M (2001) Regional dendritic and spine variation in human cerebral cortex: a quantitative Golgi study. Cereb Cortex 11(6):558–571
Kaas J (2006) Evolution of nervous systems. Academic, London


Kaiser M (2007) Brain architecture: a design for natural computation. Philos Transact A Math Phys Eng Sci 365(1861):3033–3045
Koechlin E, Ody C, Kouneiher F (2003) The architecture of cognitive control in the human prefrontal cortex. Science 302(5648):1181–1185
Kondo H, Saleem KS, Price JL (2003) Differential connections of the temporal pole with the orbital and medial prefrontal networks in macaque monkeys. J Comp Neurol 465(4):499–523
Kroger JK, Sabb FW, Fales CL, Bookheimer SY, Cohen MS, Holyoak KJ (2002) Recruitment of anterior dorsolateral prefrontal cortex in human reasoning: a parametric study of relational complexity. Cereb Cortex 12:477–485
Kühn AB, Koch I, von Cramon DY, Schubotz RI (2010) Learning chunking hierarchies in nonmotor (cognitive) sequences. J Cogn Neurosci 22(Suppl):173
Kühn AB, von Cramon DY, Koch I, Schubotz RI (in preparation) Hierarchical chunking in perceptual sequences – evidence from fMRI
Lebel C, Walker L, Leemans A, Phillips L, Beaulieu C (2008) Microstructural maturation of the human brain from childhood to adulthood. Neuroimage 40(3):1044–1055
Luria AR (1966) Higher cortical functions in man. Basic Books, Oxford, England
Muhammad R, Wallis JD, Miller EK (2006) A comparison of abstract rules in the prefrontal cortex, premotor cortex, inferior temporal cortex, and striatum. J Cogn Neurosci 18(6):974–989
Nieder A, Freedman DJ, Miller EK (2002) Representation of the quantity of visual items in the primate prefrontal cortex. Science 297(5587):1708–1711
Ninokura Y, Mushiake H, Tanji J (2003) Representation of the temporal order of visual objects in the primate lateral prefrontal cortex. J Neurophysiol 89(5):2868–2873
Okuda J, Fujii T, Ohtake H, Tsukiura T, Tanji K, Suzuki K, Kawashima R, Fukuda H, Itoh M, Yamadori A (2003) Thinking of the future and past: the roles of the frontal pole and the medial temporal lobes. Neuroimage 19:1369–1380
Öngür D, Ferry AT, Price JL (2003) Architectonic subdivision of the human orbital and medial prefrontal cortex. J Comp Neurol 460(3):425–449
Passingham RE, Stephan KE, Kötter R (2002) The anatomical basis of functional localization in the cortex. Nat Rev Neurosci 3(8):606–616
Penn DC, Povinelli DJ (2007) On the lack of evidence that non-human animals possess anything remotely resembling a ‘theory of mind’. Philos Trans R Soc Lond B Biol Sci 362(1480):731–744
Petrides M (2000) Dissociable roles of mid-dorsolateral prefrontal and anterior inferotemporal cortex in visual working memory. J Neurosci 20(19):7496–7503
Petrides M (2005) Lateral prefrontal cortex: architectonic and functional organization. Philos Trans R Soc Lond B Biol Sci 360(1456):781–795
Petrides M, Pandya DN (1994) Comparative architectonic analysis of the human and the macaque frontal cortex. In: Boller F, Grafman J (eds) Handbook of neuropsychology, vol 9. Elsevier, Amsterdam, pp 17–58
Petrides M, Pandya DN (2004) The frontal cortex. In: Paxinos G, Mai JK (eds) The human nervous system, 2nd edn. Elsevier Academic Press, San Diego, pp 950–972
Petrides M, Pandya DN (2007) Efferent association pathways from the rostral prefrontal cortex in the macaque monkey. J Neurosci 27(43):11573–11586
Pollmann S, Manginelli AA (2009) Anterior prefrontal involvement in implicit contextual change detection. Front Hum Neurosci 3:28
Raby CR, Alexis DM, Dickinson A, Clayton NS (2007) Planning for the future by western scrub-jays. Nature 445(7130):919–921
Rakic P (2009) Evolution of the neocortex: a perspective from developmental biology. Nat Rev Neurosci 10(10):724–735
Ramnani N, Owen AM (2004) Anterior prefrontal cortex: insights into function from anatomy and neuroimaging. Nat Rev Neurosci 5(3):184–194
Raven JC (1938) Progressive matrices: a perceptual test of intelligence. H.K. Lewis, London

Long-Term Planning and Prediction: Visiting a Construction Site in the Human Brain

103

Rilling J (2006) Human and nonhuman primate brains: are they allometrically scaled versions of the same design? Evol Anthropol 15:66–77 Rizzolatti G, Riggio L, Dascola I, Umilta C (1987) Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia 25(1A):31–40 Robin N, Holyoak KJ (1995) Relational complexity and the functions of prefrontal cortex. In: Gazzaniga MS (ed) The cognitive neurosciences. MIT, Cambridge, MA, pp 987–997 Rushworth MF, Walton ME, Kennerley SW, Bannerman DM (2004) Action sets and decisions in the medial frontal cortex. Trends Cogn Sci 8(9):410–417 Saito N, Mushiake H, Sakamoto K, Itoyama Y, Tanji J (2005) Representation of immediate and final behavioral goals in the monkey prefrontal cortex during an instructed delay period. Cereb Cortex 15(10):1535–1546 Sanides F (1962) Die Architektur des menschlichen Stirnhirns. Monogr Ges Neurol Psychatr 98:1–203 Schacter DL, Addis DR (2007) On the constructive episodic simulation of past and future events. Behav Brain Sci 30(3):331–332 Schacter DL, Addis DR, Buckner RL (2007) Remembering the past to imagine the future: the prospective brain. Nat Rev Neurosci 8(9):657–661 Schmahmann JD, Pandya DN, Wang R, Dai G, D’Arceuil HE, de Crespigny AJ, Wedeen VJ (2007) Association fibre pathways of the brain: parallel observations from diffusion spectrum imaging and autoradiography. Brain 130(3):630–653 Schoenemann PT, Sheehan MJ, Glotzer LD (2005) Prefrontal white matter volume is disproportionately larger in humans than in other primates. Nat Neurosci 8(2):242–252 Schubotz RI (1999) Instruction differentiates the processing of temporal and spatial sequential patterns: Evidence from slow wave activity in humans. Neurosci Lett 265:1–4 Schubotz RI (2004) Human premotor cortex: Beyond motor performance. MPI Series in Human Cognitive and Brain Sciences, vol. 
50, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig Schubotz RI (2007) Prediction of external events with our motor system: towards a new framework. Trends Cogn Sci 11(5):211–218 Schubotz RI, von Cramon DY (2003) Functional-anatomical concepts on human premotor cortex: Evidence from fMRI and PET studies. NeuroImage 20:S120–S131 Schwartz BL, Evans S (2001) Episodic memory in primates. Am J Primatol 55(2):71–85 Semendeferi K, Damasio H (2000) The brain and its main anatomical subdivisions in living hominoids using magnetic resonance imaging. J Hum Evol 38(2):317–332 Semendeferi K, Damasio H, Frank R, Van Hoesen GW (1997) The evolution of the frontal lobes: a volumetric analysis based on three-dimensional reconstructions of magnetic resonance scans of human and ape brains. J Hum Evol 32(4):375–388 Semendeferi K, Armstrong E, Schleicher A, Zilles K, Van Hoesen GW (2001) Prefrontal cortex in humans and apes: a comparative study of area 10. Am J Phys Anthropol 114(3):224–241 Sherwood CC, Holloway RL, Semendeferi K, Hof PR (2005) Is prefrontal white matter enlargement a human evolutionary specialization? Nat Neurosci 8(5):537–538 Shima K, Tanji J (2000) Neuronal activity in the supplementary and presupplementary motor areas for temporal organization of multiple movements. J Neurophysiol 84:2148–2160 Smaers JB, Schleicher A, Zilles K, Vinicius L (2010) Frontal white matter volume is associated with brain enlargement and higher structural connectivity in anthropoid primates. PLoS One 5(2):e9123 Suddendorf T, Busby J (2005) Making decisions with the future in mind: developmental and comparative identification of mental time travel. Learn Motiv 36(2):110–125 Suddendorf T, Corballis MC (2007) The evolution of foresight: what is mental time travel, and is it unique to humans? Behav Brain Sci 30(3):299–313 Suddendorf T, Corballis MC, Collier-Baker E (2009) How great is great ape foresight? Anim Cogn 12(5):751–754

104

R.I. Schubotz

Sujazow O, Sch€utte C, Derrfuss J, Melzer C, Vogeley K, Amunts K, von Cramon D Y., Tittgemeyer M (in press) Connectivity-based cortex parcellation of the anterior prefrontal cortex. NeuroImage (Suppl) Szpunar KK, Watson JM, McDermott KB (2007) Neural substrates of envisioning the future. Proc Natl Acad Sci USA 104(2):642–647 Tanji J, Shima K, Mushiake H (2007) Concept-based behavioral planning and the lateral prefrontal cortex. Trends Cogn Sci 11(12):528–534 Toro R, Perron M, Pike B, Richer L, Veillette S, Pausova Z, Paus T (2008) Brain size and folding of the human cerebral cortex. Cereb Cortex 18:2352–2357 Tsujimoto S, Genovesio A, Wise SP (2010) Evaluating self-generated decisions in frontal pole cortex of monkeys. Nat Neurosci 13(1):120–126 Tulving E, Kim A (2007) The medium and the message of mental time travel. Behav Brain Sci 30:334–335 Van Overwalle F (2009) Social cognition and the brain: a meta-analysis. Hum Brain Mapp 30:829–858 Van Veen V, Carter CS (2006) Error detection, correction, and prevention in the brain: a brief review of data and theories. Clin EEG Neurosci 37(4):330–335 Vogt C, Vogt O (1919) Allgemeinere Ergebnisse unserer Hirnforschung. J Psychol Neurol 25:279–462 Volz KG, von Cramon DY (2009) How the orbitofrontal cortex contributes to decision making – a view from neuroscience. Prog Brain Res 174:61–71 Walker AE (1940) A cytoarchitectural study of the prefrontal area of the macaque monkey. J Comp Neurol 73:59–86 Wallis JD, Miller EK (2003) From rule to response: neuronal processes in the premotor and prefrontal cortex. J Neurophysiol 90:1790–1806 Wallis JD, Anderson KC, Miller EK (2001) Single neurons in prefrontal cortex encode abstract rules. Nature 411(6840):953–956 White IM, Wise SP (1999) Rule-dependent neuronal activity in the prefrontal cortex. Exp Brain Res 126(3):315–335 Wise S (1985) The primate premotor cortex: past, present, and preparatory. 
Annu Rev Neurosci 8:1–19 Wise SP (2008) Forward frontal fields: phylogeny and fundamental function. Trends Neurosci 31(12):599–608 Yakovlev PI, Lecours AR (1967) The myelogenetic cycles of regional maturation of the brain. In: Minkowski A (ed) Regional development of the brain in early life. Blackwell Scientific Publications, Oxford, pp 3–70 Zilles K, Armstrong E, Schleicher A, Kretschmann HJ (1988) The human pattern of gyrification in the cerebral cortex. Anat Embryol (Berl) 179:173–179 Zilles K, Armstrong E, Moser KH, Schleicher A, Stephan H (1989) Gyrification in the cerebral cortex of primates. Brain Behav Evol 34:143–150

Emotion Expression: The Evolutionary Heritage in the Human Voice

Elisabeth Scheiner and Julia Fischer

Abstract

The human voice is not only the carrier of spoken language, but also an important medium to express the emotional state of the speaker. This chapter reviews the structure of vocal emotion expression in humans and other mammal species, the influence of experience on the expression of emotions in preverbal human children, and the effects of culture on the recognition of emotion expression in the voice by adult humans. In general, the emotion expression in humans and other mammals follows largely similar patterns, supporting the view that nonverbal vocal expression of emotions in humans is part of our species’ evolutionary heritage. Further evidence for this assumption comes from acoustic analyses of normally hearing compared to profoundly hearing-impaired infants, which revealed similar patterns of emotion expression within specific utterances, suggesting that auditory experience is not a prerequisite to develop the typical expression patterns. The largest differences between hearing and hearing-impaired infants concerned the sequential composition of calls and the onset of babbling. Despite these general principles, the cultural background possibly contributes to emotion expression and emotion perception in adult humans. Listeners from different cultural backgrounds revealed similar patterns of recognition and confusion of play-acted and authentic emotions. At the same time, however, cultural effects could be found in the differential biases regarding the misattribution of different emotions. In sum, the emotion expression in the human voice can best be conceived as a complex interaction between innate patterns and cultural conventions.

E. Scheiner and J. Fischer (*) Cognitive Ethology Lab, German Primate Center, Kellnerweg 4, 37077 Göttingen, Germany

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_5, © Springer-Verlag Berlin Heidelberg 2011

1 Background

Imagine the phone rings and your best friend calls you. Just from the tone of the voice, you will be able to quickly assess whether your friend is excited or relaxed, or maybe even depressed. Paying attention to the emotion expression of others is an important part of social life. We monitor not only the verbal behavior and overt voluntary actions of others, but also observe their facial and bodily expressions, their gestures, and the quality of their voice. In this realm, the expressions of emotions are of particular importance. Assessing the emotional state of others around us helps us to evaluate the situation we are in, to hypothesize about the future behavior of others, and to plan our own actions.

Nonverbal communication of emotions is an integral part of our everyday life and has attracted considerable scientific interest for quite some time. Charles Darwin (1872), for instance, used the comparison of the emotional expressions of animals and humans to support his theory of evolution. He advanced the view that human facial expressions of emotions are innate, and an evolutionary heritage of our nonhuman ancestors. Subsequent research on the facial expressions of nonhuman primates supported this hypothesis (Preuschoft 1992; Thierry et al. 1989). Research on the nonverbal expression of emotion for a long time focused on facial expressions (Ekman and Oster 1979; Izard 1971). Though Darwin (1872) mentioned the importance of the voice as a carrier of emotional information, research on the nonverbal vocal expression of emotions lagged behind (Scherer 1986). The neglect of the vocal-auditory channel was possibly partly due to methodological problems, since it was difficult to store and analyze acoustic signals. However, due to technical advances, the situation has changed, and an increasing number of scientists began to study nonverbal vocal expressions of emotions in humans (Frick 1985; Hammerschmidt and Jürgens 2007; Murray and Arnott 1993; Scherer 1979, 1986; Van-Bezooijen 1984). At the same time, a large number of studies addressed the information content of nonhuman primate vocalizations (Fischer et al. 1995; Marler and Bourne 1977; Ploog et al. 1992; Seyfarth and Cheney 2003; Winter et al. 1973).
Although considerable interest was devoted to the question of which aspects of nonhuman primate vocalizations fulfilled any of the linguistic criteria developed to describe human speech (Fischer 2010; Hauser et al. 2002), there was a concurrent stream of research assessing variation in nonhuman primate vocalizations in relation to emotion and arousal (Fischer et al. 1995; Jürgens and Hammerschmidt 2006; Marler et al. 1992).

The purpose of this chapter is to discuss the following questions: (1) Can the nonverbal vocal expression of emotions in humans largely be described as an evolutionary heritage, with structural similarities to the emotion expression described for our closest living relatives? (2) Irrespective of the evolutionary background, are the encoding and decoding of vocally expressed emotions innate, and if this is not the case, to what degree do individual and cultural learning play a role?

2 Vocalizations as Expressions of Emotions

Within an evolutionary framework, emotions have been described as evolved, adaptive mechanisms that facilitate an organism’s coping with important events (Darwin 1872; Scherer 1993). Although there is an ongoing debate about the nature of emotions, there is a growing consensus among theorists that emotion needs to be viewed as a multicomponent entity (Frijda 1986; Lazarus 1991; Scherer 1984). The three major components of emotion are neurophysiological response patterns (in the central and autonomic nervous systems), motor expression (in face, voice, and gesture), and subjective feelings. Many theorists also include the evaluation or appraisal of the antecedent event and the action tendencies generated by the emotion as additional components of the emotional process (Frijda 1986; Lazarus 1991; Scherer 1984; Smith and Ellsworth 1985).

The structure of vocal expressions of emotions is supposed to be determined by different factors (Johnstone and Scherer 2000). First, physiological processes, such as respiration and muscle tone, may influence vocalizations. For example, increased muscle tension produced by ergotropic arousal can, as a by-product, affect breathing patterns, the functioning of the vocal folds, or the shape of the vocal tract. Second, vocalizations that initially may have evolved as by-products of specific body movements (e.g., pant vocalizations that are corollaries of physical effort) can be interpreted by the social environment as predictors of upcoming events (e.g., lunge movements) and thus have the potential to become a signal [for a more detailed discussion, see Fischer (in press)]. Expressive motor behaviors that have predictive value are assumed to increase in stereotypy and exaggeration until they eventually only serve purposes of communication. This process is known as phylogenetic ritualization. Another selective pressure that has been related to signal design is the transmission condition of a given environment (e.g., noise) (Johnstone and Scherer 2000; Schneider et al. 2008). However, this is more important for long-distance communication and thus less important for the expressions of emotions.
The subjective component of emotions is only tractable in studies of humans at an age at which they can report their inner life (“self-report data”). However, from an evolutionary standpoint, the conscious experience of feelings is not a prerequisite for the basic functions of emotions. The challenge for studies of subjects who either may not have self-consciousness or lack the ability to express it verbally is to develop assays that help us to independently assess and categorize the emotional state the subjects are in (e.g., LeDoux 1994, 1996). For the time being, we will be agnostic with regard to the interaction between conscious thought processes as a component of the subjective experience of emotion, and will use the term “emotion” regardless of whether we are referring to people or animals. Likewise, we will not distinguish between “emotion” and “affective state”, but will use these terms synonymously.

Todt (1986) proposed distinguishing between three different internal components, namely, emotion, arousal, and motivation. While emotions refer to the evaluative component, arousal refers to an increased wakefulness that can lead to higher sensory acuity as well as a higher responsiveness. The motivational state, finally, refers to tendencies in behavior. These can relate very basically to the maintenance of homeostasis, such as quenching thirst or satiating hunger, but can also refer more generally to the tendency to exhibit some type of social behavior, such as aggression or affiliation. In emotion research, particularly in the analysis of vocal expressions, all three components may affect the structure of vocalizations. Support for this view comes from studies concerning the neurobiology of vocal production. The vocal pathway in terrestrial mammals (and other taxa) involves a number of different subsystems, contributing to different degrees to the initiation of vocalization and the structural properties of the calls.

In a recent review, Jürgens (2009) proposed two separate pathways involved in the control of vocalizations. The first runs from the anterior cingulate cortex via the midbrain periaqueductal gray (PAG) into the reticular formation of pons and medulla oblongata, and from there to the phonatory motoneurons. The anterior cingulate cortex is involved in the volitional control of call onset in nonhuman primates (Sutton et al. 1974) as well as humans (Jürgens and Von Cramon 1982). The midbrain PAG serves as a collector or relay station for the descending vocalization-controlling pathways, integrating incoming information and triggering specific innate vocal patterns. The PAG has therefore been ascribed a gating function (Jürgens 2009). Electrical stimulation of this area elicits vocalizations in several species and PAG lesioning in a number of species – including squirrel monkeys, macaques, cats, rats, and humans – causes muteness [reviewed in Jürgens (1994)]. The second vocalization control pathway runs from the motor cortex via the reticular formation to the phonatory motoneurons. This pathway has been shown to include two feedback loops, one involving the basal ganglia and the other involving the cerebellum (Jürgens 2009). A comparison of vocalization pathways among terrestrial mammal species has revealed that only humans exhibit a direct pathway from the motor cortex to the motoneurons controlling the larynx muscles. In contrast, connections between the limbic cortex and the motoneurons constitute an ancestral trait found in many nonhuman species [for reviews see Jürgens (2002, 2009)].
The most important derived feature in the human lineage appears to be the evolution of the direct pathway from the motor cortex to the motoneurons, enabling volitional control over the oscillations of the vocal folds. Together with the intricate coordination of breathing and articulation, this feature allows for the precise control over speech production.

3 Emotional Vocalizations in Nonhuman Mammals

In order to analyze how emotions are encoded in the voice, one first needs to know which emotion an individual experiences before changes in the voice related to this emotion can be characterized. As noted above, it may be difficult to obtain reliable or independent information about the affective state of an individual, particularly with animals. Therefore, researchers have to find other ways to infer the emotional state of animals. It has been argued that at least certain subjective feelings, such as an animal’s sensation of pain, and different aversive states, can be studied indirectly through features of the animal’s behavior (Darwin 1872; Dubner 1994). Ultimately, the arguments for using indirect measures come from anatomical homology with humans and the assumption that higher vertebrates share similar sensory apparatuses and therefore are likely to experience similar sensations, show similar physiological reactions, and change their behavior in similar ways (Panksepp 1994; Stafleu et al. 1992). On the basis of this argument one assumes, for example, that animals experiencing pain should show reluctance to come in contact with a potentially painful stimulus; or, if animals are stressed, this should have an effect on certain physiological indicators, such as a raised concentration of plasma cortisol or corticosterone and an increased heart rate (Bennett and Perini 2003).

Against the background of these assumptions, one approach in animal studies was to identify contexts that appeared to elicit a certain kind of emotional reaction with a high probability. Yet, it is not possible to ascribe a specific emotion (like joy, anger, fear, and sadness) to a nonhuman individual. Instead one relies on relatively broad categories of aversive versus nonaversive or pleasant internal states, which are often named differently in the literature (i.e., stress, neediness, urgency, distress, and aversion) according to the different research goals.

There are a number of studies exploring differences in vocalizations of animals with respect to painful procedures. Weary and colleagues, for instance, compared vocalizations of piglets (Sus scrofa) that were subjected to castration without anesthetic, or restrained identically but not castrated (i.e., sham-castrated). Piglets that were castrated produced significantly more high-frequency calls than sham castrates, suggesting that the increased rate of high calls is a reliable indicator of pain due to castration (Weary and Fraser 1995). Similarly, vocalizations of cattle (Bos primigenius f.
taurus) that were branded showed a greater frequency range in the fundamental frequency, a higher maximum frequency, and a higher peak sound level than animals that were sham branded using a cold iron (Watts and Stookey 1999).

Other studies use social isolation to induce stress in animals. Domestic pigs subjected to stress through social isolation, for instance, revealed a positive correlation between the plasma level of adrenaline and the rate of “squeal grunts” and a negative correlation between plasma cortisol and normal “grunts”. Squeal grunts differ from normal grunts by shorter duration, higher frequency range, higher fundamental frequency, and higher peak frequency (Schrader and Todt 1993). Vocalizations of piglets that had been removed from their mothers immediately before suckling compared to those from piglets removed after suckling were louder, longer, and had higher peak frequencies. In addition, the more “needy” piglets gave more calls (Weary and Fraser 1995). Additionally, sows showed stronger responses to isolation calls with greater levels of piglet need, indicating that the calls provide reliable information about offspring need (Weary and Fraser 1995).

Isolation calls have also been studied in nonhuman primates. Infant capuchin monkeys (Cebus apella) react with an increase in call rate when isolated from their mother and their group mates (Byrne and Suomi 1999); a similar correlation between call rate and increasing distance between mothers and infants was found in chacma baboons (Papio cynocephalus ursinus) (Rendall et al. 1999, 2000). Squirrel monkey (Saimiri sciureus) infants separated from their mothers gave longer calls at greater separation distances from their natal group members, and in their vocalizations a high-frequency element was prolonged (Masataka and Symmes 1986). In common marmosets (Callithrix j. jacchus) different acoustic parameters of the isolation calls, e.g., call duration, peak frequency, and frequency range, increased with decreasing sensory information about mates (Schrader and Todt 1993).

A third group of studies use alarm calls to explore the influence of different affective states on the vocalizations of animals, because alarm calls can be elicited experimentally and evoke unambiguous responses that are characterized by a comparable state of arousal. Social mongooses (Suricata suricatta), for example, vary the call rate and the acoustic structure of their calls in relation to the perceived urgency in face of a predator (Manser 2001). A nearby predator elicits more calls that are shorter and noisier than a predator that is far away. After a disturbance at their sleeping site, Barbary macaques (Macaca sylvanus) uttered series of shrill barks. Immediately after the disturbance, these calls had a higher peak frequency than calls uttered at the end of the call series (Fischer et al. 1995). A similar increase in pitch was found in red-fronted lemurs (Eulemur fulvus rufus). These animals give special calls, so-called woofs, to terrestrial predators, but they also utter woofs in other situations of arousal, such as intergroup encounters. Red-fronted lemur woof calls given during intergroup encounters were characterized by higher frequencies than those given in response to a playback of a barking dog, suggesting that animals engaged in intergroup encounters experience higher arousal than during the playbacks. Playback experiments with modified woofs revealed that red-fronted lemurs showed stronger responses to the calls with increased frequencies (Fichtel and Hammerschmidt 2002).
Apparently, this change in structure is salient for conspecifics, and information about the senders’ affective state is indeed encoded in call structure and provides meaningful information for the listener.

Similar results were found in a study on squirrel monkeys in which the degree of aversiveness/pleasantness of the emotional state accompanying a specific vocalization was measured more directly (Jürgens 1979). Different calls were elicited by activating specific brain regions responsible for controlling these calls. Activation was carried out by electrical stimulation of stereotactically implanted intracerebral electrodes. The animals were able to switch on and off the vocalization-eliciting stimulation themselves. In this way, a quantitative measure indicating the degree to which the stimulation was avoided or sought could be obtained. With this method, the aversive or pleasant quality of the affective state underlying the production of specific calls could be determined. A subsequent acoustic analysis of the vocalizations elicited in this study revealed that the calls expressing different degrees of aversion differ in their acoustic structure (Fichtel et al. 2001). Here again, more aversive calls showed higher pitch (characterized by peak frequency, distribution of frequency amplitudes, and dominant frequency), higher frequency range, and a higher amount of nonharmonic energy. Out of the numerous pitch-related parameters, peak frequency seemed to be the most important acoustic variable, because this variable was the only one which differed in all call categories in relation to the degree of aversion. Moreover, a playback study on squirrel monkeys indicated that subtle shifts in the peak frequency were perceptually salient to the listeners (Fichtel and Hammerschmidt 2003).
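Because peak frequency recurs as the key variable in these studies, a minimal sketch of how such a measure could be obtained may help readers unfamiliar with acoustic analysis. The chapter does not describe the analysis software or settings the cited studies used, so everything here is an assumption for illustration: peak frequency is taken to be the frequency with the largest amplitude in the magnitude spectrum, the "calls" are synthetic tones rather than recordings, and the function names `magnitude_spectrum`, `peak_frequency`, and `synthetic_call` are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies in Hz, magnitudes) from a naive discrete Fourier transform.

    Only the non-negative frequency bins up to the Nyquist frequency are computed.
    """
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid at frequency bin k.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff))
    return freqs, mags


def peak_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest spectral component (assumed definition)."""
    freqs, mags = magnitude_spectrum(samples, sample_rate)
    # Skip bin 0 so that a constant offset in the recording is not reported as the peak.
    best = max(range(1, len(mags)), key=lambda k: mags[k])
    return freqs[best]


def synthetic_call(dominant_hz, sample_rate=8000, duration_s=0.05):
    """Hypothetical stand-in for a recorded call: a tone plus a weaker second harmonic."""
    n = round(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * dominant_hz * t / sample_rate)
        + 0.3 * math.sin(2 * math.pi * 2 * dominant_hz * t / sample_rate)
        for t in range(n)
    ]


if __name__ == "__main__":
    # Two invented calls; the second has its energy shifted toward higher frequencies.
    less_aversive = peak_frequency(synthetic_call(1000.0), 8000)
    more_aversive = peak_frequency(synthetic_call(1600.0), 8000)
    print(less_aversive, more_aversive)
```

With a measure of this kind, the reported pattern amounts to comparing the value across calls recorded in contexts of differing aversiveness. Real recordings would additionally need windowing and averaging over the call, which this sketch omits.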


In sum, there are several variables that characterize vocalizations of different nonhuman mammals across studies of negative emotional states. The kind and number of the aversion-related variables, of course, depend on the investigated species and the research paradigm. Two characteristics of highly aversive states, however, were found in many studies on different species: increased call rate and an energy shift toward higher frequencies. The similarity in the vocal expression of aversive versus nonaversive states in different mammal species suggests that these expressions might be phylogenetically ancient and possibly so old that related species share common acoustic features in the vocal expression of these states.

4 Vocal Expression of Emotions in Human Infants

According to Keller and Schölmerich (1987), parents interpret the vocalizations of even 2-week-old infants as expressions of emotional states and respond in a differentiated way to different vocalizations. There is no general agreement, however, whether infants during their first months of life are able to express specific emotions in their behavior at all (Lewis 2000; Strongman 1996). Some authors suggest that, in the first months, the expressive behavior is to a large extent random. This assumption is based on the observation that specific vocal or facial patterns often occur without a specific stimulus preceding them. One reason for this low predictability of specific behavior patterns could be a low degree of emotional differentiation in early infanthood. Most authors, however, agree that from the very beginning, there is a differentiation into at least two emotional states: aversive and nonaversive (Giblin 1981; Izard and Malatesta 1987; Lewis 1993; Malatesta-Magai et al. 1991; Sroufe 1979).

According to Lewis (1993, 2000), the child is born with tripolar emotional reactions: distress, interest, and pleasure. By 3 months, joy, sadness, and disgust (in a primitive, spitting-out form) appear. Anger appears somewhere between 2 and 4 months. In Lewis’ opinion, at this age the cognitive capacity to distinguish between means and ends in order to overcome frustration caused by a blocked goal has developed. At about 7–8 months, children begin to show fear, which requires even more cognitive involvement, certainly the discrimination of different stimuli into predators and nonpredators, for instance. Also in the first 6 months, children show signs of surprise. In other words, they are able to compare their expectancy to the observed events and respond to violations of expectation.
At about 2 years, a further cognitive ability becomes important – the capacity for the child to judge its behavior against some standard, either external or internal. Following from this is what Lewis terms the “self-conscious evaluative emotions”, pride, shame, and guilt, for example. For these emotions, a sense of self has to be compared against other standards. Lewis’s model of emotional development assumes completion by the age of three. There may be further elaboration thereafter, but the general structure is established by that age.

Other authors argue that a larger number of emotional states can be distinguished from an early age. Malatesta-Magai et al. (1991), for example, believe that emotions are well differentiated and connected to internal states very early in life. They argue that discrete emotion signals can be found in very early infancy; that in emotional expression there is an internal coherence with respect to emotion elicitors, rather than there being randomness or no specificity; and they do not see the need for sophisticated cognitive processes for emotional reactions to occur. According to their Differential Emotions Theory (Izard and Malatesta 1987), newborns’ innate emotions include interest, disgust, physical distress, and startle. By the age of 4 months, most infants have the capacity for anger, surprise, joy, and sadness. Fear appears between month five and month seven. These emotions are called primary emotions because they are formed in the first year of life.

The reason for the disagreement concerning the emotional system of infants is that, as in the animal studies, one depends on indirect inferences. These inferences can be based on contextual information (e.g., certain situations in which certain emotions are probable), or on comparing the facial expressions of infants with the facial expressions of emotions of adults (e.g., Ekman and Oster 1979; Izard et al. 1980). A further possibility to judge the emotional state of infants is to use their parents’ ratings (Wasz-Höckert et al. 1968).

The early studies that analyzed whether differences in the acoustic structure of infant vocalizations code differences in the underlying emotional state focused mostly on infant cries, since crying is the most prominent vocal behavior of infants (Keller and Schölmerich 1987; Muller et al. 1974; Murry et al. 1977; Porter et al. 1986; Wasz-Höckert et al. 1968). These investigations, however, revealed inconsistent results. Some studies found that acoustically as well as functionally, cries fall into distinct categories (birth, hunger, pain, and pleasure) that can be decoded by listeners (Keller and Schölmerich 1987; Wasz-Höckert et al.
1968). Other studies suggest that infant crying is a graded signal, mirroring a continuum ranging from low to high arousal or urgency (Brennan and Kirkland 1982; Porter et al. 1986; Protopapas and Eimas 1997; Zeskind et al. 1985) and that listeners are not able to identify the eliciting stimulus (Muller et al. 1974; Murry et al. 1975) or the accompanying emotional state Papousˇek (1992). Papousˇek (1994) reports that with increasing negative arousal, but also with increasing positive arousal, infants cry more frequently, and their cries show an increasing duration, fundamental frequency, and intensity. Vocalizations of distress and complacency, on the other hand, can be differentiated by the relative amount of spectral energy above 1,000 Hz, which is higher in distress vocalizations. There are only few studies that investigated the expression of emotion in other infant vocalizations than cries. To fill this gap in knowledge, Scheiner conducted a longitudinal study of the preverbal vocalizations of normally hearing and profoundly hearing-impaired infants (Scheiner et al. 2002, 2004, 2006). The goals of this study were, first, to describe and compare the preverbal vocalizations of normally hearing and hearing-impaired infants, the emergence of their vocal types, and the developmental changes during the first year of life. Second, the study focused on the question how different emotions influence the structure of preverbal vocalizations and whether there are differences between both groups of infants. Scheiner and colleagues used a combination of parent ratings and contextual information to judge

Emotion Expression: The Evolutionary Heritage in the Human Voice


Fig. 1 Spectrogram of infant vocalizations

the emotional state of the infants. The infants were recorded regularly in their familiar surroundings in 11 different situations of normal infant life. For each recording, the assumed emotional state of the infants was noted. The emotional state of the infants was judged by their parents, who could use all available sources of information (voice, facial expression, body gestures, and contextual information) for their decision. The parents could choose between the terms joy, contentment, interest, surprise, unease, anger, and pain. These emotions were assumed to occur regularly enough without intervention to allow systematic recordings. The results of this study have shown that the nonverbal vocalizations of infants in their first year of life can be classified into 12 call types based on their acoustic structure (Fig. 1). All call types were uttered in positive as well as in negative emotional contexts. Normally hearing (NH) and hearing-impaired (HI) infants shared the same vocal repertoire, and there were almost no hearing-related differences in the acoustic structure of the different vocal types. There were also no differences in the time of emergence of the individual preverbal call types between NH and HI infants; the only exception was the emergence of babbling. Babbling emerged later or not at all in the vocal repertoire of the HI infants; this observation is in line with results of other studies (Eilers and Oller 1994; Oller et al. 1985; Stark 1983). Scheiner and colleagues tested the four most common vocal types of normally hearing infants (cry, short cry, coo/wail, and moan) with respect to emotion-related differences in acoustic structure. A multivariate GLM procedure testing the differences between vocalizations uttered in positive (contentment and joy) and negative (unease and anger) attributed emotions found significant differences for all four vocal types (Scheiner et al. 2002). 
In sum, the results indicate that changes from positive to negative emotional states are accompanied by increases in duration, frequency range, and a set of variables describing the pitch of a vocalization, such as


E. Scheiner and J. Fischer

the peak frequency. Thus, important information that reflects whether an infant feels good or bad is encoded in the acoustic structure of individual vocal types. In contrast to the clear differences between positive and negative emotional states, the authors found no differences between single positive emotions (joy and contentment) or single negative emotions (unease and anger). There are several possible reasons for this result. As already mentioned, one reason could be that in young infants the emotional system is not as well differentiated as in older children or adults. A second reason for the failure to separate emotions with the same valence might be that not every vocalization uttered in a given emotion is typical for this state. A third reason could be that infants are able to express various emotions, but the parents’ ratings are not very reliable. Uncertainty might arise, for instance, from interference between the parents’ own mood or expectations and the infant’s behavior, or from incoherence in emotional labeling. To examine the latter possibility, the authors conducted a cross-check analysis. They analyzed the same vocalizations, but instead of testing for differences in acoustic structure related to emotional categories, they tested for differences related to the contexts in which the infants were recorded. This analysis produced the same results as the analysis based on the emotional ratings. They found differences in acoustic structure between vocalizations uttered in positive and negative contexts, but no differences between vocalizations uttered in different positive contexts or in different negative contexts. Therefore, it seems unlikely that a mismatch between the parents’ emotional estimations and the emotion-eliciting context is the main reason for the low discriminability of related emotions in individual vocal types (Scheiner et al. 2002). 
When Scheiner and colleagues investigated emotion-related differences in the vocalizations of hearing-impaired infants, they found the same results as for normally hearing infants. While it was again no problem to differentiate vocalizations uttered in positive and negative emotional states, it was impossible to distinguish specific positive or negative emotional states by the acoustic structure of the vocalizations. Table 1 compares the differences related to hearing ability with those related to emotion. Between normally hearing and hearing-impaired infants, they found significant differences only in the least meaningful factor and only for one vocal type, the “cry”. In contrast, differences between positive and negative emotional states were found in several factors. In sum, these results indicate that one of the most important functions of infants’ vocal signaling, to signal their needs and states to their caregivers (Maestripieri and Call 1996), is not categorically altered by hearing impairment. The analyses described so far only looked at the acoustic structure of single vocal types. However, infants usually do not utter single vocalizations, but streams of vocalizations. Besides the structural differences in single vocalizations, the composition of a vocal sequence might be a source of information about the emotional state and the hearing ability of the infants. An influence of hearing impairment on the temporal organization of vocal sequences is indicated by the finding of Möller and Schönweiler (1999), who reported that rhythmic patterns in cry bouts and babbling differ between normally hearing and hearing-impaired infants. According to Möller and Schönweiler, the


Table 1 Differences in vocal structure related to hearing impairment and emotion. A principal component analysis was performed on the 88 original acoustic variables to reduce the number of, and the correlations between, the different acoustic measurements

Factor                                           Explained      NH/HI    Emotions
                                                 variance (%)   Cry      Coo/wail   Moan     Cry
F1: Frequency range, main energy, frequency
    with the highest amplitude
    (= peak frequency, PF)                       20.3           0.054    –          –        –
F2: Distribution of frequency amplitudes (DFA)    9.8           –        –          –        0.000
F3: Fundamental frequency                         9.2           –        0.001      0.001    0.010
F4: Energy in the high frequencies                5.4           –        0.000      0.005
F5: Trend modulation of PF                        5.3           –        0.033      0.000    0.004
F6: Trend modulation of the first dominant
    frequency band                                3.7           –        0.020      0.000    0.021
F7: Duration, tonality                            3.4           –        0.025      0.012    0.000
F8: Location of maximum of DFA or PF              3.4           0.018    –          –        –

The first column describes those principal component analysis factors with an explained variance above 3%. The second column shows the explained variance. The third column shows the differences between the cries of normally hearing and hearing-impaired infants. The fourth to sixth columns show the differences between the four emotions (joy, contentment, unease, and anger) for the call types coo/wail, moan, and cry. The tests for differences between emotions were calculated for normally hearing and hearing-impaired infants together. The numbers refer to the p-values of the univariate tests (GLM repeated measure, SPSS 10)
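The figures in Table 1 can be checked with a few lines of arithmetic. The following sketch uses only the explained-variance column and the NH/HI column as printed in the table; it sums the variance explained by the eight retained factors and lists the NH/HI tests that fall below a threshold of 0.05. The threshold and all variable names are assumptions made for illustration, since the chapter does not state the significance level that was applied.

```python
# Explained variance (%) and NH/HI p-value for the cry, copied from Table 1.
# None marks a cell that is printed as a dash in the table.
factors = {
    "F1": (20.3, 0.054),
    "F2": (9.8, None),
    "F3": (9.2, None),
    "F4": (5.4, None),
    "F5": (5.3, None),
    "F6": (3.7, None),
    "F7": (3.4, None),
    "F8": (3.4, 0.018),
}

ALPHA = 0.05  # assumed conventional threshold, not stated in the chapter

# Total variance explained by the eight factors, rounded to one decimal
total_variance = round(sum(variance for variance, _ in factors.values()), 1)

# Factors whose NH/HI comparison falls below the assumed threshold
significant = [
    name for name, (_, p) in factors.items() if p is not None and p < ALPHA
]

print(total_variance)  # 60.5
print(significant)     # ['F8']
```

Under these assumptions the eight factors together account for 60.5% of the variance, and only F8, one of the two factors with the smallest share (3.4%), separates the two hearing groups.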

cry bouts of hearing-impaired infants have “lower” rhythmic frequencies than cry bouts of NH infants. The few studies that investigated the influence of hearing impairment on the succession of specific vocal types yielded inconsistent results. Oller et al. (1985) showed that the relative frequency of various vocal types was the same in 1 deaf and 11 normally hearing infants, while Clement and Koopmans-van Beinum (1995) found that hearing-impaired infants produced some vocal types more often than their hearing peers. Our own first investigations of the composition of vocal sequences have shown that the influence of the emotional state on vocal sequence composition is not the same in normally hearing and hearing-impaired infants (Scheiner et al. 2006). Vocal sequences uttered by normally hearing infants in different emotions differed significantly in their composition. Though the infants produced all vocal types in positive and in negative emotional states, some vocal types were uttered more frequently in positive and others in negative emotional states. It was even possible to find differences between vocal sequences uttered in the two positive emotional states, joy and contentment. In hearing-impaired infants, though they were in principle able to produce all vocal types that differentiated the emotions in normally hearing ones, there were hardly any emotion-related changes in vocal sequence composition. A second result of the sequence comparison was that normally hearing and hearing-impaired infants’ vocal sequences differ in their composition, independent of the emotion (Scheiner et al. 2006). The studies by Scheiner and colleagues showed that different emotional states are encoded in nonverbal vocalizations of infants in two ways. Firstly, the composition


of vocal sequences seems to be dependent on auditory input, which indicates that auditory learning takes place. Secondly, emotion-related changes in acoustic structure are the same in both infant groups (Scheiner et al. 2004, 2006). Vocalizations uttered in negative emotional states are mainly characterized by increased pitch manifested in several acoustic parameters, like peak frequency, describing energy distribution in the spectrum. Follow-up studies are required to investigate whether the emotion-related differences found in the acoustic structure and in sequence composition are indeed salient for listeners. Additionally, there are many open questions regarding the mechanisms of auditory learning in preverbal infants and the ontogeny of sequence composition in infants. Since the results of this study indicate that the composition of sequences encodes emotions in a more differentiated way than the acoustic structure of single calls, a detailed exploration of the development of sequence composition could possibly lead to more knowledge about the time course of emotional differentiation. As described in the previous section, there are several studies on nonhuman mammals supporting the view that peak frequency could be an important acoustic variable to communicate a negative emotional state. The fundamental nature of aversive or pleasant states suggests that their behavioral expressions might be an evolutionary heritage and possibly so old that related species share common acoustic features in the vocal expression of these states. Our findings in normally hearing and hearing-impaired infants support the view that the vocal expression of emotion in humans has deep phylogenetic roots (Hammerschmidt and Jürgens 2007; Jürgens 2003).

5 Vocal Expression of Emotions in Human Adults

There are different ways in which emotions could be encoded in nonverbal vocalizations of human adults. For instance, different emotional states of the sender could be expressed by different nonverbal vocal types, such as “cries of pain” and “laughter of joy”. These nonverbal vocalizations are equivalent to the innate nonverbal vocalizations of nonhuman mammals and human infants (Scheiner et al. 2002). Another possibility is that different underlying emotional states influence the acoustic structure of one and the same vocal pattern. For example, one and the same word or sentence can be uttered with different emotional prosody. Since human adults are able to report their experiences and feelings, estimating their emotional state is far easier than estimating the emotions of nonverbal individuals. Additionally, it is possible to ask human adults to express different emotions voluntarily, for example, by uttering cries or laughter and by changing the emotional prosody that underlies their speech. Thus, human adults are the main subject of research on the vocal expression of emotions. While the studies on nonhuman mammals and human infants for practical reasons focus mainly on the question of how different emotions are encoded in vocal utterances,


studies on human adults can additionally investigate the perception of emotions. The primary research questions are thus, firstly, whether listeners identify emotions from vocal cues and, secondly, which specific acoustic structures encode specific emotions (Frick 1985; Juslin and Laukka 2003; Murray and Arnott 1993; Van-Bezooijen 1984). The identification of emotions has been studied mostly with standard utterances (e.g., numbers, nonsense syllables, single words, and standard sentences) uttered by actors who were asked to vocally portray different emotions (Banse and Scherer 1996; Hammerschmidt and Jürgens 2007; Leinonen et al. 1997; Van-Bezooijen 1984). Authentic, i.e., spontaneously uttered, emotional speech was rarely used. In an overview of studies on vocal expression of emotions given by Juslin and Laukka (2003), only 12 out of 104 studies used authentic emotional speech samples. Ratings of the vocalizations by listeners showed that the listeners were able to infer vocally expressed emotions much better than by chance (Banse and Scherer 1996; Juslin and Laukka 2003). Johnstone and Scherer (2000) report an overall recognition accuracy near 60% for standardized voice samples using actor portrayals, which is about five times higher than expected by chance. To assign specific emotions to particular vocalizations is, nevertheless, error prone. First, recognition accuracy differs between emotions. Accuracy, for example, is much higher for hot anger than for disgust (Banse and Scherer 1996). Second, there is confusion between emotions that share certain properties. For example, hot anger and panic fear are often confused by listeners, probably because they are of the same intensity, and anxiety and panic fear are often confused, probably because they are of similar quality (Banse and Scherer 1996). It is also difficult to assign a specific acoustic structure to a specific emotion. 
For example, anger and fear seem to be characterized by an increase in level and variability of the fundamental frequency, increase in high-frequency energy, and range of the fundamental frequency, but joy is characterized by nearly the same acoustic features. Hammerschmidt and Jürgens (2007) showed that an increase in aversion and an increase in intensity of an emotional state are mainly characterized by similar structural changes of the vocalizations. They found that the best indicator of aversiveness is the ratio of peak frequency (frequency with the highest amplitude) to fundamental frequency, followed by the peak frequency, the percentage of time segments with nonharmonic structure (“noise”), and the frequency range within single time segments. This may explain part of the difficulties and misinterpretations listeners show in perception studies, especially in cases of single calls or words. However, like nonhuman mammals and human infants, human adults seem to encode increasing aversion through the amount of energy in the high frequencies. This evolutionarily old code apparently persists in adulthood.
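The ratio of peak frequency to fundamental frequency can be illustrated with a toy calculation. The sketch below is not the measurement procedure of the cited study; it assumes that a vocalization has already been reduced to a simple amplitude spectrum, and the function name, the spectra, and the fundamental frequency of 200 Hz are all hypothetical.

```python
def peak_to_f0_ratio(spectrum, f0):
    """Return (peak frequency, peak frequency / fundamental frequency).

    spectrum maps frequency in Hz to amplitude. The peak frequency is the
    frequency with the highest amplitude, as defined in the text.
    """
    if f0 <= 0:
        raise ValueError("f0 must be positive")
    peak = max(spectrum, key=spectrum.get)
    return peak, peak / f0


# Hypothetical harmonic spectra for one voice with a fundamental of 200 Hz.
# In the second spectrum the energy has shifted toward higher harmonics.
neutral = {200: 1.0, 400: 0.6, 600: 0.3, 800: 0.1}
aversive = {200: 0.5, 400: 0.7, 600: 1.0, 800: 0.8}

print(peak_to_f0_ratio(neutral, 200))   # (200, 1.0)
print(peak_to_f0_ratio(aversive, 200))  # (600, 3.0)
```

In this made-up example the fundamental frequency is identical in both spectra, yet the ratio rises from 1 to 3 because more energy sits in the high frequencies, which is the kind of change the text associates with increasing aversion.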

5.1 Influence of Culture

Psychologists have long debated whether emotions, and their expressions, are universal or products of culture (Ekman and Davidson 1994; Elfenbein and


Ambady 2002; Izard 1971; Scherer and Wallbott 1994). As we have already described in the previous sections, studies on nonhuman mammals and human infants have shown that there is an evolutionarily old and apparently innate component in the vocal expression of emotions. Nevertheless, the question remains whether expression and perception of emotions are not additionally shaped by cultural conventions. Matsumoto (1989), for example, argued that although emotions are biologically programmed, cultural factors have a strong influence on the process of learning to control emotional expression and perception. Scherer and Wallbott (1994) conducted a series of cross-cultural questionnaire studies in 37 countries to investigate the influence of culture on emotion experience and found strong evidence for both universality and cultural specificity in emotional experience, including both psychological and physiological responses to emotions. In another study, Scherer et al. (2001) compared judgments of German actors’ expressions of emotions by Germans as well as by members of eight other cultures. They found that with increasing geographical distance to the speakers, the recognition accuracy for the emotional expressions decreased. Additionally, recognition accuracy was greater for foreign judges whose own language was closer to the Germanic language family. As a meta-analysis on emotion recognition within and across cultures shows, the in-group advantage found by Scherer et al. (2001) for German judges is a typical finding in cross-cultural emotion recognition studies (Elfenbein and Ambady 2002). This meta-analysis included studies that used different types of stimuli (voice, facial photographs, photographs of the body, and video) representing different channels of communication. The authors found that emotions were universally recognized at better-than-chance levels. 
Accuracy was higher when emotions were both expressed and recognized by members of the same national, ethnic, or regional group, suggesting an in-group advantage. This advantage was smaller for cultural groups with greater exposure to one another, measured in terms of living in the same nation, physical proximity, and telephone communication. Majority group members were poorer at judging minority group members than the reverse. Cross-cultural accuracy was lower in studies that used a balanced research design. Cross-cultural accuracy was also lower for studies that used the tone of the voice than it was for other channels. Another factor that influenced cross-cultural recognition accuracy was the “manner of expressing stimuli”. Some studies made use of spontaneous emotions; others asked participants to pose emotions or to imitate emotional expressions that had been chosen on a priori theoretical grounds. The meta-analysis revealed that imitated emotions were recognized more accurately than either posed or spontaneous emotions.

5.2 Acted Versus Authentic Emotional Expressions

This last result points to another element of controversy in emotion research, namely the disagreement concerning the effects of acting on emotional communication (Bänziger and Scherer 2007). This controversy is important insofar as, as


mentioned before, research on the vocal expressions of emotions is mainly based on acted behavior. Though actors spend many years perfecting their portrayal of human behavior and emotions, it may not exactly parallel that produced naturally (Wilting et al. 2006). Before continuing to make use of acted behavior in this line of research in general, and specifically with prosodic stimuli, it is important to determine whether any differences between authentic and play-acted emotional expression fall within the variability of authentic expression or lie outside this variability. Davitz (1964) argued that acted and authentic emotional prosody must correspond as, otherwise, acting would not be convincing. If this is true, one would expect no differences in the accuracy of emotion recognition between spontaneous and acted speech, and also no effect of culture. If, however, acting leads to an exaggeration of emotional expression, and thereby to stronger stereotyping (Scherer 1986), then play-acted emotions should be recognized with a higher accuracy than authentic emotions, and still irrespective of culture. It is, however, also conceivable that acting reflects a socially learned code (Hunt 1941). In this case, one would predict that acted emotions expressed in a foreign culture are recognized less accurately than those from one’s own culture. Note that the two latter scenarios are not mutually exclusive; it is therefore conceivable that one might find complex interactions between these two factors.

6 Cross-cultural Study on Effects of Authenticity on Emotion Recognition

With this in mind, we conducted a study that addressed both the effects of authenticity (authentic versus play-acted emotions) and of cultural background on the perception of emotional prosody (Scheiner et al. under review). We focused on four of the most commonly used emotions in research: anger, fear, joy, and sadness (Bryant and Barrett 2008; Ethofer et al. 2009; Vignemont and Singer 2006). German context-free emotional speech samples were presented to subjects of three separate cultures who were asked to judge the authenticity of the stimuli as well as the emotional content. If emotion depiction is largely shaped by cultural conventions, play-acted emotions in particular should be recognized significantly better within the community than by members of other cultures. Conversely, if prosodic expressions are largely universal, similar patterns in emotion and authenticity recognition should be present between cultures.

6.1 Methodological Issues

The authentic speech recordings were selected from the database of the Norddeutscher Rundfunk (Hamburg, Germany) radio station and were all German expressions of fear, anger, joy, or sadness. The recordings were taken from


interviews made while individuals were experiencing real emotions in a specific situation or describing their emotional state while discussing a past event. Emotions were ascertained through the content of the text spoken by the individuals, as well as the broadcast context. Only recordings of good recording quality and low background noise were selected. Segments selected as stimuli were up to 4.5 s in length and did not contain any keywords that could allow inference of the expressed emotion. Recognition rates were tested by subjects asked to rate the emotional content of the written segments to ensure that no language hints were provided as to that emotional content. Any segment for which the respective emotion was recognized better than expected by chance was shortened or replaced and tested again. The final dataset consisted of 20 text sections per emotion including 10 male and 10 female speakers, resulting in a total of 80 recordings made by 80 different speakers. These wave files represent the authentic stimuli. An information sheet was prepared for each authentic stimulus, which indicated the sex of the speaker, the emotion expressed, the context of the situation described, and a transcription of the desired segment and a few sentences before and after the extracted segment. The production of the play-acted stimuli was done by 21 male and 21 female actors. Each actor was asked to reproduce at most three of the authentic recordings. Using the respective recording information sheet, the actors were told to express the respective text and emotion in their own way, using only the text, identified context, and emotion (the segment to be used as stimulus was not shown and the actors never heard the original recording). Each actor was allowed to practice as long as needed, could repeat the acted reproduction as often as they required, and the recording selected for experimental use was the repetition each actor denoted as their first choice. 
To reduce any category effects between authentic and play-acted stimuli, the environment for the play-acted recordings was varied by recording in different locations (outdoors and indoors). Care was nevertheless taken to avoid background noise. The relevant play-acted recordings (wave format, 44.1 kHz) were then edited so they contained the same segment of spoken text as the authentic recordings. The average amplitude of all stimuli was equalized with Avisoft SASLab Pro Recorder v4.40 (Avisoft Bioacoustics, Berlin, Germany). The 160 stimuli were divided into two sets of 80 stimuli made up of 5 authentic/play-acted pairs per sex, per intended emotion. Each set was judged by 20 subjects per culture, totaling 40 German, 40 Romanian, and 40 Indonesian subjects (20 male and 20 female per culture). For each stimulus, subjects were asked to determine, in a forced-choice design, (1) the emotion expressed (emotion-rating), (2) how certain they felt about their emotion-rating, (3) whether the emotion was authentic or play-acted (authenticity-rating), and (4) whether they felt certain about their authenticity-rating. Generalized linear mixed models (GLMMs) with a logit link function and binomial error distribution were used to analyze the influence of different factors on the emotion and authenticity recognition rates. Possible influences arising from the repeated application of stimuli (with different experimental subjects) and the repeated measurements of experimental subjects were taken into consideration by including two random-intercept effects. Additionally, we analyzed the data in terms


of the choice theory (Luce 1959, 1963) to evaluate biases in the subjects’ authenticity and emotion ratings, as well as their unbiased ability to discern authenticity and emotions (see supplements for more methodological details). All models were implemented in the R statistical computing environment (R Development Core Team 2008). The GLMM was implemented using the glmer function from the lme4 package (Bates 2005). The Akaike information criterion (AIC) was used to select the model that best approximated the data. On the basis of the chosen model, a set of experimental hypotheses was specified through linear combinations of model parameters, and these hypotheses were tested simultaneously using the glht function from the multcomp package (Hothorn et al. 2008), which adjusted the p-values for multiple testing. Choice theory was implemented as a baseline-category logit model (Agresti 2007) using the multinom function from the VR package (Venables and Ripley 2002). The final version of the multinomial “mixed” model (based on the response variable distribution and including random-intercept effects) was programmed under WinBUGS (Lunn et al. 2000) using the R2WinBUGS interface package (Sturtz et al. 2005).
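The two ingredients named in this paragraph, a logit link with two random intercepts and model selection by AIC, can be sketched in a few lines. The analysis itself was carried out in R; the Python fragment below does not fit any model and is not a reimplementation of it. It only illustrates, with hypothetical function names, coefficients, and log-likelihoods, how a linear predictor is turned into a probability of a correct rating and how AIC values would be compared.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def p_correct(fixed_effect, subject_intercept, stimulus_intercept):
    """Probability of a correct rating with one random intercept per subject
    and one per stimulus added to the fixed part of the predictor."""
    return inv_logit(fixed_effect + subject_intercept + stimulus_intercept)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidate models: (maximized log-likelihood, parameter count)
candidates = {
    "emotion only": (-6200.0, 6),
    "emotion + authenticity": (-6150.0, 7),
    "emotion * authenticity": (-6149.5, 10),
}
best = min(candidates, key=lambda name: aic(*candidates[name]))

print(p_correct(0.0, 0.0, 0.0))  # 0.5
print(best)                      # emotion + authenticity
```

With these invented numbers the interaction model fits slightly better than the additive one, but its three extra parameters cost more than the gain in likelihood, so AIC prefers the additive model.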

6.2 Authenticity Recognition

Of the authentic stimuli, 68 ± 13% (mean ± SD) were correctly assigned to their category, as were 50 ± 17% of the play-acted stimuli. Irrespective of whether the subjects recognized the authenticity of the stimuli, they felt certain about 70 ± 16% of their authenticity judgments. A generalized linear mixed model (GLMM) analysis revealed that authenticity recognition probability was significantly increased when subjects felt certain, but only for authentic stimuli and for German subjects (Fig. 2). Authenticity recognition probability was significantly decreased for sad stimuli as compared with the other emotions. Post hoc tests (glht function, multcomp package; R) confirmed the difference in the recognition rates for authentic and play-acted stimuli (p < 0.001), a difference that holds for all countries and for all intended emotions. The post hoc tests did not reveal any significant differences based on gender or country. However, assessing recognition accuracy by a simple count of hit rates, without regard for false alarms or bias, can be misleading (Wagner 1993). We therefore also analyzed the authenticity ratings in terms of the choice theory. By this analysis, we calculated (1) the participants’ relative bias, which is a measure for the subjects’ tendency to preferentially choose one of the possible responses independently of whether they truly recognized the authenticity or not, and (2) the dissimilarity, which describes the subjects’ unbiased ability to discriminate authentic and play-acted stimuli. The analysis revealed that the subjects had a strong bias toward choosing the response “authentic” in the authenticity ratings. The post hoc analysis confirmed only the difference in the bias between the certain and uncertain conditions: when the subjects were certain about their authenticity judgments, they showed

Fig. 2 Probability of correct authenticity recognition by intended emotion (A anger, F fear, J joy, S sadness) and authenticity (authentic or play-acted). The panels are split by cultural affiliation (D Germany, R Romania, I Indonesia) and by rater certainty (sure, unsure); the vertical axis gives the probability and the horizontal axis the intended emotion. Data were analyzed in terms of a GLMM (glmer function, lme4 package, R). Given are the medians and the 95% credible intervals. The probability of correct authenticity recognition by chance is 0.5, as indicated by the dashed horizontal lines

a significantly higher bias toward choosing the response “authentic” than when they were uncertain about their authenticity judgments (p < 0.001). The bias did not differ significantly between the subjects of the different countries, but there was a tendency toward a greater bias for Romanians than for Indonesians (p = 0.062). The (unbiased) dissimilarity values were low in all countries, implying a generally low discriminatory capability between authentic and play-acted vocal expressions of emotions. However, the analysis showed that dissimilarity was significantly influenced by the certainty of subjects concerning their authenticity judgments, by country, by expressed emotion, and by sex of the subjects. Post hoc tests revealed that sensitivity was significantly increased when the subjects were certain about their authenticity judgments versus when they were uncertain (p < 0.001) and that German subjects showed higher sensitivity than Romanian and Indonesian subjects (Romanian–German: p < 0.001; Indonesian–German: p = 0.012). The post hoc tests did not confirm any gender-related differences in discrimination ability for any country.
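How bias and discriminability are separated can be shown with the textbook two-response form of the choice model. This is a simplification and an assumption of this sketch: the study fitted a baseline-category logit model with random effects, whereas the fragment below applies the closed-form two-response expressions to the aggregate recognition rates reported in this section, so the resulting numbers are only indicative. The function name is hypothetical.

```python
import math


def choice_theory(hit_rate, false_alarm_rate):
    """Return (ln alpha, ln b) for a two-response task.

    ln alpha measures discriminability; 0 means no discrimination.
    ln b measures bias; negative values mean a bias toward the "yes" response.
    """
    h, f = hit_rate, false_alarm_rate
    if not (0.0 < h < 1.0 and 0.0 < f < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    logit_h = math.log(h / (1.0 - h))
    logit_f = math.log(f / (1.0 - f))
    ln_alpha = 0.5 * (logit_h - logit_f)
    ln_b = -0.5 * (logit_h + logit_f)
    return ln_alpha, ln_b


# "Yes" stands for the response "authentic".
# Hit rate: 68% of the authentic stimuli were rated as authentic.
# False alarm rate: 50% of the play-acted stimuli were rated correctly,
# so the remaining 50% were rated as authentic.
ln_alpha, ln_b = choice_theory(0.68, 1.0 - 0.50)

print(round(ln_alpha, 3))  # 0.377
print(round(ln_b, 3))      # -0.377
```

On these aggregate figures the discriminability is small and the bias term is negative, that is, tilted toward the response “authentic”, which agrees in direction with the pattern described in the text.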

6.3 Emotion Recognition

In total, the subjects had a hit rate of 40 ± 21% (mean ± SD) in the emotion ratings. Irrespective of whether they recognized the intended emotion, the subjects felt certain about 63 ± 16% of their emotion judgments. The emotion recognition ratings in general showed similar patterns in the three countries (Fig. 2). The GLMM revealed that the rate of correct emotion recognition was influenced by the following factors: intended emotion, listener certainty, stimulus authenticity (authentic/play-acted), and listener country of origin. Post hoc tests revealed that the emotion recognition rates were indeed significantly higher when the subjects were certain than when they were uncertain about their emotion judgments (p < 0.001). German subjects scored higher than Romanian (p < 0.001) or Indonesian (p < 0.001) subjects. Furthermore, the post hoc tests showed that despite subjects’ difficulties in discerning authentic and play-acted emotions, and their bias toward categorizing stimuli as authentic, authenticity affected emotion recognition in such a way that anger was recognized more frequently when play-acted (p < 0.001) and sadness was recognized at higher rates when authentic (p < 0.001). Authenticity did not influence the emotion recognition rates for fear and joy. There were no significant differences between the recognition rates for the different emotions, except that fear was recognized significantly less frequently than anger (p < 0.001), joy (p = 0.035), and sadness (p < 0.001). As with authenticity, we analyzed the emotion ratings in terms of the choice theory. The response bias for emotion judgments was calculated with respect to cultural affiliation and stimulus authenticity. The subjects of all three countries showed a bias toward rating play-acted stimuli as angry. This bias was higher for German than for Romanian or Indonesian participants. 
German subjects also showed a bias toward rating authentic stimuli as angry, while Romanian and Indonesian subjects preferentially chose sadness, and had a bias against choosing “anger” when rating authentic stimuli. There were no remarkable differences between authentic and play-acted stimuli concerning the responses “joy” and “fear”. The bias was either against choosing “joy” and “fear” as responses or, in the case of Indonesian subjects, there was almost no bias.

6.4 Discussion

Though subjects found it relatively difficult to determine authenticity, this factor had a clear impact on the recognition of the vocal expressions of anger and sadness across all three cultures: anger was recognized more frequently when play-acted and sadness was recognized at higher rates when authentic. However, the unbiased dissimilarity values showed that the subjects’ ability to differentiate acted from spontaneous stimuli did not differ significantly between these emotions. The authenticity-related differences in the emotion recognition rates instead appear to be derived from emotion and country-related rating biases in the subjects’ assessments. One possible explanation for these implicit effects of authenticity is that they are derived from an interaction between stimulus-inherent features and cultural expectations about the frequency of specific emotional expressions. Possibly, the actors expressed the intended emotions more intensely, as compared to the original speakers, since their focus was directed specifically at distinct expression of emotion (Laukka et al. 2007). It is therefore likely that the actors’ expressions were generally more intense or aroused than their original counterparts. We know from other decoding studies (Hammerschmidt and Jürgens 2007) that high arousal can often be mistaken for negative valence by subjects. Thus, the more aroused play-acted stimuli might tend to be recognized as “anger” independent of the emotion the actors intended to express. Conversely, the relatively low arousal level of spontaneous emotional expressions might be confused with sadness, which is generally seen as an emotion with low arousal (Russell 1980). However, this explanation, which refers to features of the stimuli themselves, does not thoroughly explain the differences in biases for authentic stimuli between German subjects and others. This could be caused by the cultural variability inherent between the individualistic German and the collectivistic Romanian and Indonesian societies (Hofstede 1980) from which our subjects were selected. Individualistic cultures are expected to reinforce the expression of negative emotions with higher arousal such as anger, contempt, and disgust, as compared to collectivistic cultures, which seem to reinforce emotions such as happiness, surprise, fear, and sadness (Matsumoto 1989).
Thus, based on their everyday experiences, the German participants may have expected a higher likelihood of being confronted with expressions of anger, regardless of the stimulus type presented. Conversely, the collectivistic Romanian and Indonesian participants would have expected to be confronted with sadness. However, in the case of assumed acting, the high arousal of the emotional expressions predominates over cultural expectations, thereby returning to the more basic universal bias toward anger. Of additional import are the low recognition accuracies we found in this study (as compared to other studies on emotional prosody), as well as the finding that fear stimuli were not recognized above chance at all. Scherer et al. (2001) reported a mean accuracy percentage for vocally expressed emotions of 67% for Western and 52% for Indonesian countries. In most studies on vocal expressions of emotions, anger and sadness are recognized best, followed by fear (Johnstone and Scherer 2000). Joy shows mixed accuracy in various studies, possibly due to differences with respect to whether quiet or elated joy is being expressed (Johnstone and Scherer 2000). The generally lower recognition rates in this study were likely caused by the fact that stimuli were not preselected according to their distinctiveness as in other studies (Banse and Scherer 1996), in combination with the real-life context from which our original recordings were acquired. This additionally provides evidence for the influence of preselecting play-acted stimuli in this field of research.

Despite the low recognition accuracies, this study clearly shows that emotion recognition rests on a complex interplay between human universals and cultural specificities, and it helps to clarify what influence this interaction has on human communication. In addition, the implicit effects of authenticity processing show that the design of future studies on vocal emotion recognition, in both psychology and human–machine interfacing, must take these interactions into account (Cowie et al. 2001; Pantic and Rothkrantz 2003).

7 Conclusion

Emotion expression in the human voice can best be conceived as a complex interaction between innate patterns and cultural conventions. The innate patterns appear to be part of our evolutionary history as the principles that govern emotion expression in nonhuman primates and humans are largely equivalent. Similar coding principles of emotion expressions were also found in human infants, irrespective of whether these were hearing or not. This suggests that auditory input does not play a role in shaping emotion expression in these infants, at least not in these early stages of their lives. Finally, the rating study on authentic and play-acted emotions revealed that similar patterns of recognition and confusion could be found across cultures, indicating that general patterns can be found. At the same time, however, cultural effects could be found in the differential biases regarding the misattribution of different emotions. Thus, cultural aspects that are possibly related to the collectivistic or individualistic nature of the society impact on the differential expectations that people in different cultural communities generate. This body of research reveals that the voice is a medium that carries rich information and that the perception and processing of this information rests on both innate dispositions and experience.

References

Agresti A (2007) An introduction to categorical data analysis. Wiley, Hoboken, NJ
Banse R, Scherer KR (1996) Acoustic profiles in vocal emotion expression. J Pers Soc Psychol 70:614–636
Bänziger T, Scherer KR (2007) Using actor portrayals to systematically study multimodal emotion expression: The GEMEP corpus. In: Paiva A, Prada R, Picard, E. Affective computing and intelligent interaction 2007: Lecture notes in computer science, vol 4738. Springer, Berlin, pp 476–487
Bates D (2005) Fitting linear mixed models in R using the lme4 package. R News 5:27–30
Bennett P, Perini E (2003) Tail docking in dogs: a review of the issues. Aust Vet J 81:208–218
Brennan M, Kirkland J (1982) Classification of infant cries using descriptive scales. Infant Behav Dev 5:341–346
Bryant GA, Barrett HC (2008) Vocal emotion recognition across disparate cultures. J Cogn Cult 8:135–148
Byrne G, Suomi SJ (1999) Social separation in infant Cebus apella: patterns of behavioral and cortisol response. Int J Dev Neurosci 17:265–274
Clement CJ, Koopmans-van-Beinum FJ (1995) Influence of lack of auditory feedback: vocalizations of deaf and hearing infants compared. In: Ed. by University-of-Amsterdam, Proceedings of the Institute of Phonetic Sciences, Amsterdam, pp 25–37
Cowie R, Douglas-Cowie E, Tsapatsoulis N, Votsis G, Kollias S, Fellenz W, Taylor JG (2001) Emotion recognition in human–computer interaction. IEEE Signal Process Mag 18:32–80
Darwin C (1872) The expression of the emotions in man and animals. Murray, London
Davitz J (1964) The communication of emotional meaning. McGraw-Hill, New York
Dubner R (1994) Methods of assessing pain in animals. In: Wall PD, Melzack R (eds) Textbook of pain. Churchill Livingstone, Edinburgh, pp 293–302
Eilers RE, Oller DK (1994) Infant vocalizations and the early diagnosis of severe hearing impairment. J Pediatr 124:199–203
Ekman P, Davidson RJ (1994) The nature of emotion. Oxford University Press, New York
Ekman P, Oster H (1979) Facial expressions of emotions. Annu Rev Psychol 30:527–554
Elfenbein HA, Ambady N (2002) On the universality and cultural specificity of emotion recognition: a meta-analysis. Psychol Bull 128:203–235
Ethofer T, De Ville DV, Scherer K, Vuilleumier P (2009) Decoding of emotional information in voice-sensitive cortices. Curr Biol 19:1028–1033
Fichtel C, Hammerschmidt K (2002) Responses of redfronted lemurs to experimentally modified alarm calls: evidence for urgency-based changes in call structure. Ethology 108:763–777
Fichtel C, Hammerschmidt K (2003) Responses of squirrel monkeys to their experimentally modified mobbing calls. J Acoust Soc Am 113:2927–2932
Fichtel C, Hammerschmidt K, Jürgens U (2001) On the vocal expression of emotion. A multiparametric analysis of different states of aversion in the squirrel monkey. Behaviour 138:97–116
Fischer J (2010) Nothing to talk about? On the linguistic abilities of nonhuman primates (and some other animal species). In: Frey U, Störmer C, Willführ K (eds) Homo Novus – a human without illusions. Springer, New York, pp 35–48
Fischer J (in press) Where is the information in animal communication? In: Menzel R, Fischer J. Animal thinking: contemporary issues in comparative cognition. MIT, Cambridge, MA
Fischer J, Hammerschmidt K, Todt D (1995) Factors affecting acoustic variation in Barbary macaque (Macaca sylvanus) disturbance calls. Ethology 101:51–66
Frick RW (1985) Communicating emotion: the role of prosodic features. Psychol Bull 97:412–429
Frijda NH (1986) The emotions. Cambridge University Press, Cambridge
Giblin PT (1981) Affective development in children: an equilibrium model. Gen Psychol Monogr 103:3–30
Hammerschmidt K, Jürgens U (2007) Acoustic correlates of affective prosody. J Voice 21:531–540
Hauser MD, Chomsky N, Fitch WT (2002) The faculty of language: what is it, who has it, and how did it evolve? Science 298:1569–1579
Hofstede G (1980) Culture’s consequences. Sage, Beverly Hills, CA
Hothorn T, Bretz FW, Westfall P (2008) Simultaneous inference in general parametric models. Biom J 50:346–363
Hunt W (1941) Recent development in the field of emotions. Psychol Bull 38:249–276
Izard CE (1971) The face of emotion. Appleton-Century-Crofts, New York
Izard CE, Malatesta C (1987) Perspectives on emotional development: I. Differential emotions theory of early emotional development. In: Osofsky JD (ed) Handbook of infant development. Wiley, New York, pp 494–554
Izard CE, Huebner RR, Risser D, McGinnes GC, Dougherty LM (1980) The young infant’s ability to express discrete emotion expressions. Dev Psychol 16:132–140
Johnstone T, Scherer KR (2000) Vocal communication of emotion. In: Lewis M, Haviland-Jones JM (eds) Handbook of emotions. The Guilford, New York, pp 220–235
Jürgens U (1979) Vocalization as an emotional indicator: a neuroethological study in the squirrel monkey. Behaviour 69:88–117
Jürgens U (1994) The role of the periaqueductal grey in vocal behaviour. Behav Brain Res 62:107–117
Jürgens U (2002) Neural pathways underlying vocal control. Neurosci Biobehav Rev 26:235–258
Jürgens U (2003) Zum stimmlichen Ausdruck emotionaler Zustände. Eine vergleichend verhaltens- und neurobiologische Untersuchung. Sprache Stimme Gehör 27:71–74
Jürgens U (2009) The neural control of vocalization in mammals: a review. J Voice 23:1–10
Jürgens U, Hammerschmidt K (2006) Common acoustic features in the vocal expression of emotions in monkeys and man. Primate Rep 74:3–7
Jürgens U, Von Cramon DYC (1982) On the role of the anterior cingulate cortex in phonation – a case report. Brain Lang 15:234–248
Juslin PN, Laukka P (2003) Communication of emotions in vocal expression and music performance: different channels, same code? Psychol Bull 129:770–814
Keller H, Schölmerich A (1987) Infant vocalizations and parental reactions during the first 4 months of life. Dev Psychol 23:62–67
Laukka P, Audibert N, Aubergé V (2007) Graded structure in vocal expression of emotion: what is meant by “prototypical expression”? In: 1st International Workshop on Paralinguistic and Speech – between models & data. Saarbrücken, Germany, pp 1–4
Lazarus RS (1991) Emotion and adaptation. Oxford University Press, New York
LeDoux JE (1994) Emotional processing, but not emotions can occur unconsciously. In: Ekman P, Davidson RJ (eds) The nature of emotion: fundamental questions. Oxford University Press, New York, pp 291–292
LeDoux JE (1996) The emotional brain. Simon & Schuster, New York
Leinonen L, Hiltunen T, Linnankoski I, Laakso M-L (1997) Expression of emotional-motivational connotations with a one-word utterance. J Acoust Soc Am 102:1853–1863
Lewis M (1993) Self-conscious emotions: embarrassment, pride, shame and guilt. In: Lewis M, Haviland JM (eds) Handbook of emotions. The Guilford, New York, pp 563–574
Lewis M (2000) The emergence of human emotions. In: Lewis M, Haviland-Jones JM (eds) Handbook of emotions. The Guilford, New York, pp 265–280
Luce RD (1959) Individual choice behavior. Wiley, New York
Luce RD (1963) A threshold theory for simple detection experiments. Psychol Rev 70:61–79
Lunn DJ, Thomas A, Best N, Spiegelhalter S (2000) WinBUGS – a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput 10:325–337
Maestripieri D, Call J (1996) Mother–infant communication in primates. Adv Study Behav 25:631–642
Malatesta-Magai C, Izard CE, Camras L (1991) Conceptualizing early infant affect: emotions as fact, fiction or artefact? In: Strongman KT (ed) International review of studies of emotion. Wiley, Chichester, pp 1–36
Manser MB (2001) The acoustic structure of suricates’ alarm calls varies with predator type and the level of response urgency. Proc R Soc Lond B Biol Sci 268:2315–2324
Marler P, Bourne GH (1977) Primate vocalizations: affective or symbolic? In: Progress in ape research. Academic Press, New York. pp 85–96
Marler P, Evans CS, Hauser MD, Papoušek H, Jürgens U, Papoušek M (1992) Animal signals: motivational, referential, or both? In: Nonverbal vocal communication. Cambridge University Press, Cambridge. pp 66–86
Masataka N, Symmes D (1986) Effect of separation distance on isolation call structure in squirrel monkeys (Saimiri sciureus). Am J Primatol 10:271–278
Matsumoto D (1989) Cultural influences on the perception of emotion. J Cross-Cult Psychol 20:92–105
Möller S, Schönweiler R (1999) Analysis of infant cries for the early detection of hearing impairment. Speech Commun 28:175–193
Muller E, Hollien H, Murry T (1974) Perceptual responses to infant crying: identification of cry types. J Child Lang 1:89–95
Murray IR, Arnott JL (1993) Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. J Acoust Soc Am 93:1097–1108
Murry T, Hollien H, Muller E (1975) Perceptual responses to infant crying: maternal recognition and sex judgements. J Child Lang 2:199–204
Murry T, Amundson P, Hollien H (1977) Acoustical characteristics of infant cries: fundamental frequency. J Child Lang 4:321–328
Oller DK, Eilers RE, Bull DH, Carney AE (1985) Prespeech vocalizations of a deaf infant: a comparison with normal metaphonological development. J Speech Hear Res 28:47–63
Panksepp J (1994) A proper distinction between affective and cognitive process is essential for neuroscientific progress. In: Ekman P, Davidson RJ (eds) The nature of emotion: fundamental questions. Oxford University Press, New York, pp 224–226
Pantic M, Rothkrantz LJM (2003) Toward an affect-sensitive multimodal human–computer interaction. Proc IEEE 91:1370–1390
Papoušek M (1992) Early ontogeny of vocal communication in parent–infant interactions. In: Papoušek H, Jürgens U, Papoušek M (eds) Nonverbal vocal communication. Cambridge University Press, Cambridge, pp 230–261
Papoušek M (1994) Vom ersten Schrei zum ersten Wort: Anfänge der Sprachentwicklung in der vorsprachlichen Kommunikation. Verlag Hans Huber, Bern
Ploog DW, Papoušek H, Jürgens U, Papoušek M (1992) The evolution of vocal communication. In: Nonverbal vocal communication. Cambridge University Press, Cambridge. pp 6–30
Porter FL, Miller RH, Marshall RE (1986) Neonatal pain cries: effect of circumcision on acoustic features and perceived urgency. Child Dev 57:790–802
Preuschoft S (1992) “Laughter” and “Smile” in Barbary macaques (Macaca sylvanus). Zeitschrift für Tierpsychologie 91:220–236
Protopapas A, Eimas PD (1997) Perceptual differences in infant cries revealed by modifications of acoustic features. J Acoust Soc Am 102:3723–3734
Rendall D, Seyfarth RM, Cheney DL, Owren MJ (1999) The meaning and function of grunt variants in baboons. Animal Behav 57:583–592
Rendall D, Cheney DL, Seyfarth RM (2000) Proximate factors mediating ‘contact’ calls in adult female baboons and their infants. J Comp Psychol 114:36–46
Russell JA (1980) A circumplex model of affect. J Pers Soc Psychol 39:1161–1178
Scheiner E, Hammerschmidt K, Jürgens U, Zwirner P (2002) Acoustic analyses of developmental changes and emotional expression in the preverbal vocalizations of infants. J Voice 16:509–529
Scheiner E, Hammerschmidt K, Jürgens U, Zwirner P (2004) The influence of hearing impairment on preverbal emotional vocalizations of infants. Folia Phoniatrica et Logopaedica 56:27–40
Scheiner E, Hammerschmidt K, Jürgens U, Zwirner P (2006) Vocal expression of emotions in normally hearing and hearing-impaired infants. J Voice 20:585–604
Scherer KR (1979) Nonlinguistic vocal indicators of emotion and psychopathology. In: Izard CE (ed) Emotions in personality and psychopathology. Plenum, New York, pp 495–527
Scherer KR (1984) On the nature and function of emotion: a component process approach. In: Scherer KR, Ekman P (eds) Approaches to emotion. Erlbaum, Hillsdale, NJ, pp 293–318
Scherer KR (1986) Vocal affect expression: a review and a model for future research. Psychol Bull 99:143–165
Scherer KR (1993) Neuroscience projections to current debates in emotion psychology. Cogn Emot 7:1–41
Scherer KR, Wallbott HG (1994) Evidence for universality and cultural variation of differential emotion response patterning. J Pers Soc Psychol 66:310–328
Scherer KR, Banse R, Wallbott HG (2001) Emotion inferences from vocal expression correlate across languages and cultures. J Cross Cult Psychol 32:76–92
Schneider C, Hodges JK, Fischer J, Hammerschmidt K (2008) Acoustic niches of Siberut primates. Int J Primatol 29:601–613
Schrader L, Todt D (1993) Contact call parameters covary with social context in common marmosets, Callithrix j. jacchus. Anim Behav 46:1026–1028
Seyfarth RM, Cheney DL (2003) Meaning and emotion in animal vocalizations. Ann N Y Acad Sci 1000:32–55
Smith CA, Ellsworth PC (1985) Patterns of cognitive appraisal in emotions. J Pers Soc Psychol 48:813–828
Sroufe LA (1979) Socioemotional development. In: Osofsky JD (ed) Handbook of infant development. Wiley, New York, pp 462–518
Stafleu FR, Rivas E, Rivas T, Vorstenbosch J, Heeger FR, Beynen AC (1992) The use of analogous reasoning for assessing discomfort in laboratory animals. Anim Welf 1:77–84
Stark RE (1983) Phonatory development in young normally hearing and hearing-impaired children. In: Hochberg I (ed) Speech of the hearing impaired: research, training and personnel preparation. Univ. Park Press, Baltimore, pp 251–266
Strongman KT (1996) The psychology of emotion. Theories of emotion in perspective. Wiley, Chichester
Sturtz S, Ligges U, Gelman A (2005) R2WinBUGS: a package for running WinBUGS from R. J Stat Softw 12:1–16
Sutton D, Larson CR, Lindeman RC (1974) Neocortical and limbic lesion effects on primate phonation. Brain Res 71:61–75
Thierry B, Demaria C, Preuschoft S, Desportes C (1989) Structural convergence between silent bared-teeth display and relaxed open-mouth display in the Tonkean Macaque (Macaca tonkeana). Folia Primatol 52:178–184
Todt D (1986) Hinweis-Charakter und Mittler-Funktion von Verhalten. Zeitschrift für Semiotik 8:183–232
Van-Bezooijen RAMG (1984) Characteristics and recognizability of vocal expressions of emotion. Foris Publications, Dordrecht, Holland
Venables WN, Ripley BD (2002) Modern applied statistics with S. Springer, New York
Vignemont FD, Singer T (2006) The empathic brain: how, when and why? Trends Cogn Sci 10:435–441
Wagner HL (1993) On measuring performance in category judgment studies of nonverbal behavior. J Nonverbal Behav 17:3–28
Wasz-Höckert O, Lind J, Vuorenkoski V, Partanen T, Valanne E (1968) The infant cry. A spectrographic and auditory analysis. Spastics International Medical Publications in association with William Heinemann Medical Books Ltd, London
Watts JM, Stookey JM (1999) Effects of restraint and branding on rates and acoustic parameters of vocalization in beef cattle. Appl Anim Behav Sci 62:125–135
Weary DM, Fraser D (1995) Calling by domestic piglets – reliable signals of need. Anim Behav 50:1047–1055
Wilting J, Krahmer E, Swerts M (2006) Real vs. acted-emotional speech. In: Interspeech-2006. Pittsburgh PA, USA.
Winter PP, Handley D, Schott D (1973) Ontogeny of squirrel monkey calls under normal conditions and under acoustic isolation. Behaviour 47:230–239
Zeskind PS, Sale J, Maio ML, Huntington L, Weiseman JR (1985) Adult perceptions of pain and hunger cries: a synchrony of arousal. Child Dev 56:549–554


Social Conventions, Institutions, and Human Uniqueness: Lessons from Children and Chimpanzees

Emily Wyman and Hannes Rakoczy

Abstract Cooperative behavior has become conventionalized and institutionalized over the course of human evolution. When faced with situations in which we desire to coordinate with others, we adopt social conventions such as driving on a particular side of the road, and adhere to these for social reasons: we expect others to, they expect us to, and this is common knowledge in our cultural community. Many of these practices have also become institutionalized via processes of formal codification and symbolic mediation, resulting, for instance, in traffic laws and road signs. Such practices also have a normative quality, such that there may be penalties for non-adherence. Conventional and institutionalized modes of coordinating represent derived evolutionary traits in the human lineage. Here, proximate causes of this uniqueness are grounded in a group of human-specific social-cognitive abilities, known as ‘collective intentionality’. Already apparent in young children, and apparently absent in chimpanzees, these abilities include a capacity to cooperate with joint goals and joint attention; to collectively assign symbolic functions and to grasp the ‘collective imaginings’ that these prescribe; and to act according to social norms. Ultimate causes of this uniqueness are discussed in terms of reduced levels of social competition; group-selection processes promoting hyper-cooperativeness; and the institution of an egalitarian social organization in human evolution.

E. Wyman (*) Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany
H. Rakoczy Institute of Psychology & Courant Research Centre “Evolution of Social Behavior”, University of Göttingen, 37077 Göttingen, Germany

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_6, © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

Social conventions constitute ways of coordinating with others (Lewis 1969). It is by adhering to a convention that people convene at set times, travel without collisions, and communicate what they mean to one another in various spoken languages. However, these conventional modes of coordination are not simply regularities in practice. Many have become institutionalized over the course of human evolution. In some cases, this amounts to formal or legal codification of the practices, as in the cases of terms of employment, marriage contracts, and traffic rules. But human social life is also guided by less formally codified institutions in the forms of symbolically mediated practices. These include, for instance, codes of dress, modes of greeting people, and symbolic communication systems such as spoken languages. Central to both legally codified and uncodified modes of coordination is their normative quality (Gilbert 1989). Social conventions and institutions do not merely specify what “is done”, but rather what “ought to be done”. Thus, if a person breaches the terms of his or her employment contract or, more informally, arrives at a wedding inappropriately dressed, there will be consequences such as legal punishment or loss of social standing. The normative force of social conventions thus becomes especially evident in the sanctions that follow deviance from the rules. Institutionalized forms of cooperation appear to be unique to humans. This is not to say that our phylogenetically closest relatives, the chimpanzees, do not exhibit impressive cultural capacities. Indeed, they coordinate action with one another in a wide range of activities including group hunting (Boesch and Boesch 1989; Gilby et al. 2008; Watts and Mitani 2002), boundary patrol (Mitani and Watts 2005), and mate guarding (Watts 1998). They also communicate with one another intentionally and flexibly in their gesture (Call and Tomasello 2007). And there appear to be local, group-based traditions in tool-use techniques, grooming and courtship behaviors, and modes of gestural communication (Boesch and Boesch 1990; Pika et al. 2005; Whiten et al.
1999, 2005), such that a range of styles are habitually or customarily adopted by different groups. However, while the extent to which these traditions result from social learning processes, or are rather shaped by variations in the local ecology between different groups is unclear [see, for example Huffman and Hirata (2004) and Humle and Matsuzawa (2002)], a striking difference remains between chimpanzee and human culture: In addition to the massive discrepancy in the quantity and complexity of material culture between our two species, in no case does chimpanzee social interaction appear to be mediated symbolically or governed by any type of socially and collectively recognized normative rules (Hill et al. 2009). Thus, while chimpanzees act in socially coordinated ways with one another to great success, human interaction additionally involves predetermined social roles, such as “colleague”, “parent”, or “friend”, that prescribe cooperation according to culturally defined norms. Furthermore, the use of artifacts in chimpanzee traditions appears to be restricted to instrumental tool use [such as nut-cracking, see Boesch and Boesch (1990)]. This in no way compares with the way in which humans assign symbolic status to objects, as well as the human body, in the form of uniforms, tattoos, passports, jewelry, religious artifacts, money, and so on, resulting in the creation and transfer of normative rights and obligations. Thus, while chimpanzee coordination and cultural traditions are impressive, they are not conventionally and institutionally governed. In order to explore the basis of this cultural disparity, we examine the following: some important aspects of young children’s engagement in conventionalized institutional practice; the social-cognitive abilities they recruit in such practice; and some critical points at which the social-cognitive abilities of chimpanzees and children appear to diverge. In particular, children’s engagement in cooperative activities involving collective intentions to act together with others is explored. Relatedly, their use of joint attention in coordinating such activities, their engagement in play with objects assigned with conventional status, and their understanding of social norms are discussed. Cross-species differences between children and chimpanzees in the behavioral and social-cognitive components of conventional institutional practice are considered at each stage. Finally, these proximate social-cognitive differences are placed within a wider evolutionary framework. It is proposed that factors that may have fundamentally contributed to species divergence in conventional and institutionalized modes of cooperation include (1) inter-species variation in more general levels of competitive cognitive constraint; (2) processes of gene–culture coevolution involving social conformity, moralistic punishment, and group-level adaptations for hypercooperativeness (Richerson and Boyd 2005); and (3) the institution of an egalitarian social organization in human evolution (Boehm 1999; Erdal and Whiten 1996; Knauft 1991).

2 The Background of Collective Intentionality

The underlying structure of human institutional reality may be described in terms of its collective intentional basis (Searle 1995). A group of individuals have a collective intention to do something together when their reasons for acting are not reducible to a set of individual intentions. Thus, for instance, when two people take a walk together, it is not simply that they each have individual intentions to walk that happen to coincide. Their individual intentions derive from their collective intention, such that it is because they intend to walk together that either of them wishes to walk at all. These collective intentions involve joint goals of the form “We intend to X”, and are normatively binding, such that abandoning the activity entails a risk of censure (Gilbert 1989). So, if one person unexpectedly departs from the joint walk without warning, the other may reprimand them, or demand explanation, and this reaction will be recognized as legitimate. Importantly, collective intentions underlie the existence of different types of rules in human society: regulative and constitutive rules [see Rawls (1955) and Searle (1995)]. Regulative rules are those that regulate existing social practices, such as traffic rules. Constitutive rules, by contrast, bring new social practices into existence, such as the rules of marriage ceremonies. The difference is that people may have driven cars before traffic rules were in place, but people did not stand before altars and exchange wedding rings before the rules of marriage existed; the marriage rules create the practices associated with official marriage. The collective intentional basis of both types of rule, however, leads to a degree of arbitrariness in form such that people can drive on either the left or the right in order to coordinate, and exchange wedding rings or some other object in order to symbolize their marriage status. What matters is that there is collective agreement on the rules and a community-wide commitment to adhere to them. Constitutive rules have the form “X counts as Y in context C”, and impose nonphysical functions, or what are known as “status functions”, on people, actions, and objects by collective intentionality (Searle 1995). For instance, there is nothing to the physical makeup of a person that enables him to perform the duties of a religious official. It is rather by collective recognition of his status as “priest” within a particular context that he is invested with such powers. Similarly, there is nothing intrinsic to the rings that are exchanged or the words that are spoken at a marriage ceremony that renders the couple married; they count as having married status because we recognize that they do, within the context of our cultural practice. The primary effect of status assignment is the creation of deontic relationships between people, in the form of rights and obligations. For instance, the ordainment of a priest gives that individual the right to conduct marriage ceremonies, but also obliges them to conduct services. When humans coordinate with one another with collective intentions and the imposition of status, normatively governed conventions and institutions emerge. In light of this, it seems notable that children in their second year of life show indications of cooperating with others in collectively intentional ways, and chimpanzees overall do not (Tomasello et al. 2005). Specifically, they appear to cooperate with joint goals, involving rudimentary commitments to the joint activity: On engaging with an adult in a simple task such as retrieving a toy, when the adult ceases to cooperate for no apparent reason, toddlers wait patiently for him to restart, and eventually try to reengage him (Warneken et al. 2006).
Chimpanzees in a similar situation (but involving food), however, do not wait for their partner or make any attempts to direct or reengage them, despite the fact that this is well within their capabilities (Gómez 2007). They rather attempt the task on their own (Warneken et al. 2006). Importantly, human toddlers do not appear simply to want to continue their own selfish enjoyment of the activity: even when aware that they can perform the task alone, they still try to reengage their recalcitrant partner (Gräfenhain et al. 2009). Another species difference appears to be in the way that young children are concerned for the equal sharing of resources at the end of a cooperative activity. After acting together jointly in pairs, once a child has retrieved his or her rewards they continue to cooperate with their partner to ensure the partner likewise retrieves their own reward (Hamann et al. in press). And they do not appear similarly concerned when there has been no previous cooperation between the two. This concern that all receive rewards after joint activity does not arise in chimpanzees on the same task (Greenberg et al. in press). Lastly, young children also appear to understand something of the more explicit commitments that characterize collective intentional activity: After a verbal declaration to engage in joint activity (e.g., “let’s play together”), young children are more likely to engage recalcitrant partners, and also more likely to verbally excuse themselves when a more attractive activity presents itself (Gräfenhain et al. 2009). In all, this suggests that young children form joint goals and commitments in their simple forms of cooperation, but there is no convincing evidence yet that

Social Conventions, Institutions, and Human Uniqueness

chimpanzees do the same. In fact, what appears to critically affect the rates at which chimpanzees cooperate with each other is whether or not the food to be secured can be easily monopolized by social dominants, as well as the specific levels of tolerance between pairs in separate feeding situations (Melis et al. 2006). This issue will be explored in more detail later on (Section 7), but for now it may be taken to suggest that the cooperative activities of chimpanzees are more tightly constrained by competitive motivations than are those of human infants. Thus, it may be that such motivations prohibit the formation of collective intentions in chimpanzees.

3 Coordination and Convention

At the root of conventional and institutional practice lies the notion of coordination. In his seminal work, Lewis (1969) defined a social convention as one of the multiple solutions to a recurrent problem in which several individuals wish to coordinate and each person’s best action depends on what the others do. For example, two friends find their telephone conversation cut off, and they both desire to reestablish connection. The two solutions in which one calls and the other waits, or vice versa, represent alternative solutions to the coordination problem, in other words, alternative conventions. And while neither minds much as to which convention is settled on, both prefer one of these solutions to coordination failure (e.g., both trying to call back). Importantly, in such a situation, each party must reason about what the other person will do. But a potential recursion problem may arise here. In order to figure out what to do, I have to reason about whether you will decide to call back. But you are likely to be reasoning the same about me. Therefore, in order to decide what to do, I must reason about your reasoning about my reasoning, and so on potentially ad infinitum. Central to the adoption of a particular coordination convention is, therefore, some form of joint, mutual, or shared knowledge of what each party understands of the situation. However, the particular cognitive prerequisites for coordinating toward a convention have become a matter of some debate. One possibility is that coordinators require “common knowledge” of a situation, such that they may recursively reason about what each other understands of the situation, at least a few levels up the reasoning hierarchy (“I expect you to expect me to expect you”, etc.). But then questions arise as to when and how appropriate “cut off” points are reached in this hierarchy of inferences, such that an individual can ever be satisfied that common knowledge exists (Gilbert 1989).
This, as well as other concerns about the capacity of adults to reason about recursively embedded states [let alone young children, see Tollefson (2005)], has led to alternative proposals as to how such mutual understanding might be established. These place joint understanding of a situation more squarely in the domain of perception and suggest that children and adults may use psychological heuristics for assessing whether or not mutual knowledge exists between parties. Thus, for example, in situations requiring coordination, two individuals might assess the evidence that their partners are rational and attending to the task-relevant aspects of the environment (including themselves)

and make inferences about whether common knowledge holds on this basis (Clark and Marshall 1981). The more specific phenomenon of “joint attention”, in which each partner monitors the same aspect of their environment as well as the other’s attention (Bruner 1983; Tomasello 1995), has recently been proposed not just as a basis for common knowledge but as a form of common knowledge in itself [see Peacocke (2005) and Tomasello (1995)]. On the one hand, there are structural resemblances in the way in which joint attention and common knowledge may both iterate recursively: just as I may “know that you know that I know, etc.”, I may “see that you see that I see, etc.” But it is also possible that the perceptual basis of joint attention enables individuals to bypass complex inferential processes altogether, since the other person can literally see their partner attend to a target and themselves (Peacocke 2005). In fact, since perception is an intentionally guided process of information acquisition (Brink 2001; Gibson and Rader 1979), this picture may be oversimplified. But behavioral cues such as gaze and head direction may operate as salient cues in assessing whether individuals are in joint attention (that are not obviously available in the case of common knowledge). And within a frame of joint activity, particularly one of potential coordination, children may reason something of the form: “if we’ve both looked towards the target, and to each other, perhaps we can assume enough information is shared between us to launch cooperation”. We, therefore, assessed the role of joint attention in young children’s decisions to coordinate toward a convention in a coordination game (Wyman et al. submitted). In this particular game, known as the “Stag Hunt” (Rousseau 1762; Skyrms 2004), the child and an adult partner continually and individually collected low-value prizes (hares).
Occasionally, the additional option of collecting a high-value prize (a stag) cooperatively with the adult arose, and children had to decide which of the two to opt for. However, the decision entailed a risk: a lone attempt on the high-value prize would certainly fail and would also lead to loss of the child’s low-value prize (see Fig. 1). Half of the children played the game in conditions of individual but parallel attention: the child could see the prizes, could see the adult monitor the prizes, and was potentially aware that the adult could see the same of them. For the other half of the children, by contrast, the adult also looked over and made mutual eye contact with the child, thus creating joint attention to the high-value prize. The result was that children coordinated with the adult to obtain the high-value prize more often in conditions of joint attention to the prizes than in conditions of individual attention.

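Fig. 1 Schematic payoff matrix of the stag hunt game (where x > y). Each cell lists the payoff to Player 1 first and the payoff to Player 2 second.

                    Player 2: Stag    Player 2: Hare
Player 1: Stag      x, x              0, y
Player 1: Hare      y, 0              y, y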
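The structure of the payoff matrix in Fig. 1 can be made concrete with a short sketch. The code below is illustrative only and is not part of the studies reported here: the function names, the example values x = 4 and y = 1, and the additional assumption that y > 0 are all hypothetical choices made for the illustration. It enumerates the action pairs from which neither player gains by deviating alone, and computes how confident a player must be that the partner will go for the stag before the risky choice pays off on average.

```python
from itertools import product

ACTIONS = ("stag", "hare")


def stag_hunt_payoffs(x, y):
    """Payoff table from Fig. 1: (action1, action2) -> (payoff1, payoff2).

    Assumes x > y > 0; the y > 0 part is an added assumption, not from Fig. 1.
    """
    if not x > y > 0:
        raise ValueError("expected x > y > 0")
    return {
        ("stag", "stag"): (x, x),
        ("stag", "hare"): (0, y),  # a lone stag attempt fails
        ("hare", "stag"): (y, 0),
        ("hare", "hare"): (y, y),
    }


def stable_outcomes(payoffs):
    """Action pairs where neither player does better by switching alone."""
    stable = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        p1_content = all(u1 >= payoffs[(alt, a2)][0] for alt in ACTIONS)
        p2_content = all(u2 >= payoffs[(a1, alt)][1] for alt in ACTIONS)
        if p1_content and p2_content:
            stable.append((a1, a2))
    return stable


def stag_belief_threshold(x, y):
    """Probability p of the partner choosing stag at which the expected
    payoff of stag (p * x) equals the certain payoff of hare (y)."""
    return y / x


payoffs = stag_hunt_payoffs(4, 1)  # hypothetical example values
print(stable_outcomes(payoffs))
print(stag_belief_threshold(4, 1))
```

Under these assumptions both joint stag hunting and joint hare collecting are stable, which is what makes the game a coordination problem in the sense of Lewis (1969): which outcome is reached depends on what each player expects of the other. On this reading, cues such as mutual eye contact can be thought of as raising a player's confidence in the partner above the threshold.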


This suggests an important role for joint attention in children’s decisions to coordinate toward joint goals with others. It also points to the possibility that joint attention may act as a developmental precursor to the type of recursive, inference-based common knowledge that adults seem capable of contemplating to some degree. Lastly, it suggests joint attention may act as a psychological heuristic for the assessment of common knowledge in general (Campbell 2005; Peacocke 2005). Interestingly, chimpanzees in a “Stag Hunt” situation are quite capable coordinators: when two conspecifics can either retrieve a low-value food (raisins) alone, or rather coordinate to cooperatively retrieve a high-value food (banana) that is available for a limited period of time, they are highly successful in securing the high-value food (Bullinger et al. in press). However, the strategies by which they achieve coordination may be slightly different from those of young children. In particular, they do not appear to visually monitor their partners or actively seek out mutual eye contact with them. Rather, one partner spontaneously approaches the high-value food, and if the other does not follow after some time, attempts to communicate with him or her. Further studies that investigate the cooperative propensities of child peers in “Stag Hunt” games, and the particular strategies they use to coordinate are currently under way. But these provisional results suggest that coordination in children may be centrally mediated by the mutual expectations or knowledge embodied in joint attention, whereas that in chimpanzees may be based on a behavioral strategy involving the mutual adjustment of actions and, when the risk of failure seems imminent, imperative communication.
In fact, while it appears that chimpanzees have a good grasp of what others see (Call and Tomasello 2008), there is some suggestion that joint attention (in which they understand that they and others attend to an object and each other’s attention) is not within their cognitive repertoire. In particular, there are quite specific developmental differences in the emergence of joint attention-related abilities in human and chimpanzee infants (Tomasello and Carpenter 2005): Human infants first develop skills of “joint engagement” in which they check back and forth between an object and an adult’s face during interaction; they then begin to engage in attention-following behaviors in which they “tune into” the attentional frame of others and direct others’ attention with their own communicative gesturing; lastly, they engage in imitative learning [see also Carpenter et al. (1998)]. Chimpanzee infants, by contrast, first produce some imitative behaviors, and their attention following and communicative gesturing emerge afterward. Importantly, they fail to develop any joint engagement behaviors at all (Tomasello and Carpenter 2005). In line with this, chimpanzee infants conspicuously fail to develop any declarative gestures, that is, gestures produced for the purpose of sharing attention with others or showing objects for that purpose. Human infants, by contrast, from the age of 12 months, spontaneously point for others simply with the singular goal of sharing attention with them (Liszkowski et al. 2004). One possibility, then, is that while chimpanzees engage in relatively sophisticated forms of behavioral coordination and communication, they do not do so on the basis of mutual expectations, or the type of mutual knowledge embodied by joint attention, as young children appear to do. In this sense, their coordination is not by convention.

4 Coordination and Fiction

A special case of coordination arises in human interaction that is mediated by collectively assigned status functions. As mentioned, status is assigned to people, actions, and objects via the constitutive rule “X counts as Y in Context C”. This essentially results in the symbolic mediation of social interaction, and places particularly interesting cognitive demands on interactants. Since there is nothing in the X term that physically denotes the Y term, in order to understand status functions, Searle (2005) notes that we have to “think at two different levels at once”. He elaborates: “we have to be able to see the physical movements, but see them as a touchdown, to see the piece of paper, but see it as a dollar bill, to see the man but to see him as a leader. . .” (pp. 12–13). The cognitive ability to take such a dual perspective is required for an appreciation of symbolic phenomena in general. For example, in order to successfully interpret the symbols on a map, one cannot simply observe that there are markings on a piece of paper. One must additionally recognize that the map maker intends the reader to interpret the blue lines as rivers, the numbers as altitude markers, and so on [see Rakoczy et al. (2005b) on the development of this ability in children]. The way this dual perspective works in another domain, that of symbolic art, offers additional insights into how we understand institutional status. The idea is that the assignment of status functions to props generates a set of prescribed imaginings (Walton 1990). In observing a painting, for instance, one not only observes that there are strokes of paint applied to a flat canvas. To appreciate the painting as a work of art, one is also required to imagine that there is a couple who stroll through the park, the sun is setting, and so on.
Indeed, this is precisely the intention of the artist: In crafting a work of art, he or she invests in shaping some aspect of the environment such that it will result in something more than observations of a literal nature (such as “there is a canvas” or “there is a block of wood”). He or she creates a work with the intention of triggering associations, interpretations, and imaginings. And only to the extent that others adhere to these psychological prescriptions do they engage with or appreciate the work as art. This notion of prescribed imaginings may provide some insight into how institutional structures exert social force in governing our daily coordinations, despite their ontological subjectivity: Ultimately, we subscribe to a set of “collective fictions” in our recognition of institutional status and its associated norms because neither exists independently of our collective acceptance that they exist (Castoriadis 1998; Plotkin 2003; Searle 1995). Thus, in a similar sense to our collectively imagining that a couple strolls through the park in appreciating a painting, we may be said to collectively imagine that a paper is “money” or that a couple is “married” in our institutional affairs. This is precisely the function of symbolic status: to direct our imaginings in collectively recognized, normatively governed ways. But critically, in the case of institutional status, this leads to normatively governed patterns of behavior: We allow those in possession of money to acquire certain goods and we require that those in receipt of money relinquish those goods; we allow married couples

certain rights and oblige them to fulfill certain duties. In this way, the prescribed imaginings associated with the assignment of status functions may be central in mediating the social norms at the basis of institutional practice. From a developmental perspective, it may be important that props invested with status functions via constitutive rules underlie the institution of fiction more generally (Walton 1990). In particular, young children’s games of fictional play appear to contain something of the elementary structure of institutional practice (Rakoczy 2006, 2007). Just as paper may count as “money” in the context of our adult exchange practices, blocks may count as “apples” in young children’s games of joint pretense (Walton 1990). The assignment of status functions is by collective intention (it is only by our intentions that these blocks count as “apples”) and results in normative prescriptions for action: Once children assign the pretend status of “apples” to their blocks, they ought, therefore, to be “eaten” and not “drunk” or used to build with. In addition, the role of performative speech acts in pretense is central to status function creation: Just as a priest may consecrate a marriage with the words “I now pronounce you man and wife”, in pretense, children may ordain objects with conventional status, for example, with the words, “these are now our apples!” However, pretend play is not yet institutional practice, and the differences between the two render pretense “proto-institutional” rather than directly analogous to the adult phenomenon (Rakoczy and Tomasello 2007). For instance, typically in pretense, status is assigned and must be respected by just a few individuals, and so children do not need to consider whether, and how, a whole community understands that status.
The status functions are not part of a wider “web” of functions and practices (as in the case of money, for instance, in which an individual must grasp not only what a dollar bill is, but how it is earned, the relative value of goods, and so on). And the status functions exist temporarily and nonseriously such that they do not have “real-life” consequences in the way that, for instance, acquiring and spending dollar bills do. In fact, it is precisely because of these differences that pretense has been proposed to constitute a developmental “cradle” for children’s understanding of social conventions and institutions (Rakoczy and Tomasello 2007). And this possibility renders pretend play a useful tool for investigating what young children understand of status assigned by constitutive rules, and their associated normativity.

5 Coordinating with Objects and Status

Young children begin pretending during their second year, mostly in social interactions with caregivers (Haight and Millar 1992), and by imitating the pretend actions they see others perform (Rakoczy and Tomasello 2006; Rakoczy et al. 2005a). An interesting question with regard to their understanding of institutional phenomena is what, during such play, they understand of the constitutive rule “X counts as Y in C” such that, for example, a “wooden block” counts as an “apple” in the context of “their game”.

By around age three, children appear to understand something of the dual perspectives involved in pretending with objects. They correctly state, for instance, that although somebody is pretending a piece of string is a snake, it is really only a piece of string (Abelev and Markman 2006; Flavell et al. 1987; Lillard 1993). Children this age also understand that an object may be assigned multiple pretend identities, for instance, observing that while they pretend an empty cup contains chocolate milk, another person may pretend it contains orange juice (Bruell and Woolley 1998; Gopnik and Slaughter 1991; Hickling et al. 1997). More revealing, however, are situations in which children inferentially extend the pretend stipulations that have been set up in a game through their own pretend actions. When a child, for instance, pretends to drink pretend milk that an adult has pretended to pour, they demonstrate a collective or joint intention to assign status together with that person (Rakoczy 2006). This is because, unlike in the case of real pouring (in which the adult’s pouring actually enables the child’s drinking), there is no physical contingency between the two pretend actions that could otherwise motivate or explain the child’s pretend elaboration. It is significant, then, that children as young as 2 years old produce inferential pretense in their object substitution, for instance, pretending to eat what the other has cooked, or clean what the other had spilled (Harris and Kavanaugh 1993; Rakoczy and Tomasello 2006; Rakoczy et al. 2004). This serves as particularly convincing evidence that they engage in status assignment, and thus understand at least the “X counts as Y” part of the constitutive rule. However, whether they also assign this status context-specifically is not yet clear. This is important because it is the essence of status assignment that it exists only relative to context. 
Thus, for instance, a religious dignitary may be allocated substantial authority by one group of people, but be considered powerless by another; a bank note may enable the purchase of valuable goods in one country and be rejected as invalid outside that country. It is only within the context of a joint agreement, practice, or particular community that conventional status holds any force. We, therefore, investigated the understanding that 3-year-old children have of the context-specific nature of jointly assigned status. Specifically, we assessed their ability to pretend with an object whose pretend status changed between two different contexts (Wyman et al. 2009b). Children were initially confronted with an object that had no obvious function (such as a stick). They were then required to pretend that the object had one status (such as “spoon”) in one context and a different status (such as “toothbrush”) in a second context. Crucially, however, they were also required to switch back to the original context, pretending appropriately again (that the object was a “spoon”). In addition, as a particularly convincing measure of their understanding, they were required to pretend inferentially at each stage of the game (in context 1, again in context 2, and then again back at context 1) by not only repeating, but in some way elaborating the pretend acts that had previously been performed there. The result was that 3-year-olds pretended appropriately and inferentially when switching back and forth between contexts. And this was the case regardless of whether the

contexts were set up by one adult who moved between two locations, or rather by two different adults at the same location. Thus, young children appear to understand the rudiments of the constitutive rule “X counts as Y in Context C” in their games of joint pretense. Additionally, they demonstrate not only an understanding of status function assignment but also the consequences this has for what may be deemed appropriate action in each context. Lastly, the fact that children pretended appropriately both with the same person at two different locations and with two different people at the same location suggests that they do not simply associate or “map” different statuses to people or places. It rather indicates an understanding that it is joint activity or practice that underlies status function assignment. In contrast to the relatively sophisticated understanding young children have of symbolic status, the symbolic capacities of chimpanzees appear to be quite limited. Strikingly, chimpanzees are able to both understand and use a wide variety of seemingly symbolic devices in the form of American Sign Language gestures (Fouts 1972; Gardner and Gardner 1969), as well as abstract lexicon symbols (Greenfield and Savage-Rumbaugh 1990; Savage-Rumbaugh et al. 1986). They are also able to match sets of objects presented on a screen to the Arabic numeral representing the sum of the set, and to select the set of objects that correctly matches the numeral (Biro and Matsuzawa 2001). However, while these abilities are unquestionably impressive, they may demonstrate highly advanced associative learning capacities, rather than any real symbolic competence, and they do not indicate that chimpanzees understand anything like constitutive rules. For the most part, these capacities rely on massively extended training programs of conditional reinforcement, containing hundreds of trials in which the animals receive food after successfully connecting a sign with a particular referent. 
Over time, they then develop a wide range of arbitrary sign-referent connections, enabling them to later select referents in response to signs, and signs in response to referents. But this does not demonstrate an understanding that any particular symbol “counts as” or “stands for” something beyond itself, that it does so context-specifically, or that it does so by social agreement. In fact, there is some indication that what chimpanzees understand of these symbolic devices is their instrumental use in interactions, rather than any collectively assigned meaning: Almost all instances of chimpanzee productive communication in gestures and lexicons are restricted to one communicative function: requesting objects or actions from humans (Greenfield and Savage-Rumbaugh 1990; Rivas 2005). This disinclination to use either signs or lexicons for other communicative functions, such as to inform or to share attention with others (as infants as early as 12 months old do with their pointing gestures, see Liszkowski 2005; Liszkowski et al. 2004, 2006), suggests that what chimpanzees understand of particular gestures and lexicons is their functional role in acts of request, rather than the underlying structure of their assigned symbolic status. In effect, what chimpanzees may understand of gesture signs, lexicons, and numerals is that when humans produce them, they themselves should respond in a particular way, and when they produce them, humans will likely act in a particular way.

There is another domain in which it appears possible that chimpanzees and apes in general might symbolically assign status to objects: that of pretend play. For instance, there are suggestions that chimpanzees may pretend to eat from a picture of food, or to feed a cuddly toy with grapes (Lyn et al. 2006). Similarly, there is an observation of a captive gorilla apparently handling a wooden log as though it were a baby (Gómez and Martín-Andrade 2002). However, not only are these apparent pretend behaviors highly infrequent in captivity and rarely observed in the wild, evidence that the apes actually have an intention to pretend [which is definitive of pretend acts in general, see Rakoczy (2006)] is unconvincing: Without anything like inferential measures of pretend action, it is difficult to ascertain from observations whether the chimpanzee intentionally pretends that a picture is food or simply responds to the picture as though it were real [as young infants sometimes do, see DeLoache et al. (2003)]. It is similarly unclear whether the chimpanzee pretends the cuddly toy is eating, or rather responds to a caretaker’s command to “feed the monkey” [as in Lyn et al. (2006)]. And whether a gorilla intentionally substitutes an object for a baby, or simply plays out instinctive motor routines designed to catalyze maternal behavior in the wild, needs to be established before pretend intent is attributed (Gómez and Martín-Andrade 2005). In general, observations of pretend play in apes are rare, lacking any indications of inferential pretense, and often arise even in the absence of models of the serious behaviors to which they might refer. It appears, therefore, that pretense in apes may be most accurately described as the production of action schemas outside their usual behavioral context rather than anything obviously symbolic (Gómez and Martín-Andrade 2005).
The symbolic use of objects in social interaction, and particularly in episodes of pretend play, appears to mark avenues of species divergence between humans and chimpanzees.

6 Coordinating with Norms

Conventional and institutional practice is normatively governed (Gilbert 1989). If one drives on the wrong side of the road, attempts to speak to an English person in French, or to take another person’s property, there will be costs. Indeed, the very hallmark of normativity is the sanctions that apply for nonadherence, for instance, in the form of direct penalties (Richerson and Boyd 2005), social ostracism (Panchanathan and Boyd 2004), or simply the costs inherent to coordination failure (Bicchieri 2006). Conventionalized and institutionalized forms of coordination thus not only specify how people regularly coordinate but how they ought to coordinate. And when coordination is mediated by people and objects assigned with conventional status, there are ways those people and objects ought to be treated. Young children appear to understand something of regulative social norms. They grasp the difference, for example, between conventional norms such as “children cannot go outside without clothes” and natural laws such as “children cannot turn

into fish” (Kalish 1998). They also correctly reason from deontic norms such as “if Anne wants to go outside, she ought to wear her coat”, and understand that such norms may motivate behavior (Kalish and Shiverick 2004). In addition, they capably identify violations in normative agreements both between adults and between peers [such as agreements to swap toys (Harris and Nunez 1996; Harris et al. 2001)]. With regard to status functions, clear signs of normative understanding have been found in the domain of children’s games. Thus, when an object such as a building block is invested with the status function of “dice” in a game (having some red and some blue sides), and a puppet joins the game but then proceeds to build with it, children actively protest, exclaiming “no that’s our dice!” (Rakoczy et al. 2008). In pretense games too, one study suggests that young children see pretend status as having normative consequences for action (Rakoczy 2008): In this study, a collection of objects such as clothes pegs were assigned the status of pretend “carrots”, while one was assigned the status of pretend “knife”. A puppet then entered and pretended to eat the “knife”, leading young children to protest, “no, that’s our knife!” However, further questions remain regarding young children’s understanding that the norms associated with status operate context-specifically. For instance, in adult practice, using a playing card to fan oneself may be perfectly acceptable during a casual conversation. But this would be considered highly inappropriate within the context of a game of bridge. Similarly, a given card may be considered a high-value trump in one game but the lowest value card in another, and so it ought to be treated differently according to the social context. Whether young children understand that social norms operate relative to particular practices and contexts remains unclear.
We, therefore, ran two studies in order to establish whether young children understand the context-specificity of social norms in their joint pretense (Wyman et al. 2009a). Specifically, we investigated whether they might identify certain behaviors as norm violations when they were performed within a particular normative context (a game), but not outside that context. However, we also explored whether they might differentiate between different normative contexts (different games), by identifying actions as violations in one context but not in a different normative context. Lastly, in addition to their ability to identify norm violations, we investigated their motivation to actually enforce norms through their active linguistic protest. In the first study, the child and an experimenter took an object with a conventional function (such as a pencil) and used it together in its conventional way (i.e., used it to draw with). They then assigned it a pretend status (such as “toothbrush”) and proceeded to pretend with it. After this, a puppet entered and in all cases drew with the pencil. However, sometimes he declared an intention beforehand to join the game (saying “I’ll play the toothbrush game too”) and so his drawing ought to have been deemed inappropriate. In other cases he refrained from joining (declaring that he’d prefer to draw), such that his action ought to have been of no particular consequence. The result was that young children protested normatively when the puppet first joined the game, but then failed to play by the rules operative within it (they, for instance, exclaimed “No, you should brush your teeth!”). However, when the puppet performed exactly the
same action, without having first joined the game, children left him in peace, and sometimes actively consented (e.g., commenting “yes, let’s draw”). In the second study, two alternative normative contexts were set up in the form of two different pretend games. This time, the child and adult took an object with no clear function (such as a stick). Then, over at “Bob the builder’s house”, the child and adult decided to place hats “just like Bob’s” on their heads, and to pretend the object was, for example, a “toothbrush”. Afterward they moved to a different location, and there at the “Zoo table” placed their “zoo-keeper hats” on and pretended the object was something different, such as a “spoon”. Lastly, a puppet entered and in all instances performed the same action (such as pretend “tooth brushing”). However, sometimes he first moved to the zoo table and wore a zoo-keeper hat, so his action ought to have been regarded as inappropriate. But at other times he first went to Bob’s house and wore his “Bob hat”, so his actions should have been unproblematic. The result was that children protested when the puppet did pretend tooth brushing while at the zoo table (and wearing the zoo-keeper hat). However, they did not protest when he performed exactly the same action at Bob the builder’s house (and wearing a Bob the builder hat). They, therefore, appear to understand the context-specificity of normative rules in their pretend games. It is quite striking that 3-year-old children identify the actions of a character as a normative violation when he has joined a particular context, but not when he performs exactly the same action outside it (the first study), or in a different context (the second study). And this understanding of context-specificity appears to be fairly flexible: they ably use not only verbal declarations as indications of entry into a particular context, but also movement between spatial locations, and the wearing of appropriate attire. 
Most impressively, young children not only identify normative violations, but actively police them through their verbal protests. Overall, this implies a relatively sophisticated understanding of social norms and their context-specificity, as well as some degree of personal commitment to regulating those norms.

The question of whether chimpanzee behavior is normatively governed, or whether chimpanzees have any normative awareness, is a challenging one. The most convincing signs of normative awareness in children are not simply their following such rules, but their verbal protest at violations of them (e.g., “No! You shouldn’t do that”), and this is obviously not possible in nonhuman primates. However, while more implicit methods of assessment must be relied upon, even these show no indications of normative regulation in chimpanzees (Tomasello 2009). As mentioned, chimpanzees do not wait for or try to reengage partners who cease to coordinate with them during a joint task (Warneken et al. 2006). But in other tasks involving norms of fairness and generosity, divergence in the behavior of children and chimpanzees is also evident. For instance, in “dictator games” (in which children must simply split a resource between themselves and another party), children tend to make fair, that is, roughly equal offers despite the fact that this leads to personal loss (Gummerum et al. 2008; Takezawa et al. 2006). Relatedly, in “ultimatum games” (in which offers may be rejected, such that
neither party receives anything), young children tend to reject low offers, apparently perceiving them as unfair (Sutter and Matthias 2007; Takezawa et al. 2006). In addition, as early as 7 years of age children indicate a general aversion to inequality, preferring an equal split, even to one in which they themselves would receive more (Fehr et al. 2008). In contrast to these apparent concerns for fairness in children, chimpanzees show no preference for distributing equal amounts of food to themselves and a conspecific over retrieving that same amount of food for themselves only (Jensen et al. 2006; Silk et al. 2005). They act as “rational maximizers” in the ultimatum game, making low offers and rationally accepting any nonzero offers (Jensen et al. 2007). And they show no signs of inequality aversion (Bräuer et al. 2006). In sum, there are no indications yet that chimpanzee actions are governed by social norms. Normative actions and instincts appear to be human-specific.
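
As a purely illustrative aside, the payoff logic of the ultimatum game described above can be sketched in a few lines of Python. The function names, the resource size of 10 units, and the 30% rejection threshold are assumptions made for this sketch; they are not parameters taken from the cited studies.

```python
def ultimatum_payoffs(pie, offer, accepts):
    """Return (proposer, responder) payoffs for one ultimatum-game round.

    `accepts` is a responder rule: a function (offer, pie) -> bool.
    If the offer is rejected, neither party receives anything.
    """
    if not 0 <= offer <= pie:
        raise ValueError("offer must lie between 0 and pie")
    if accepts(offer, pie):
        return pie - offer, offer
    return 0, 0


def rational_maximizer(offer, pie):
    # Hypothetical responder rule: accept any nonzero offer,
    # since something is better than nothing.
    return offer > 0


def fairness_minded(offer, pie, threshold=0.3):
    # Hypothetical responder rule: reject offers below an assumed
    # share of the pie, even though rejection is personally costly.
    return offer >= threshold * pie


if __name__ == "__main__":
    for rule in (rational_maximizer, fairness_minded):
        print(rule.__name__, ultimatum_payoffs(10, 1, rule))
```

Under these assumptions, a low offer of 1 out of 10 is accepted by the maximizing rule and rejected by the fairness-minded rule, which is the behavioral contrast the text draws between chimpanzees and children.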

7 Why Are Social Conventions and Institutions Human-Specific?

The question of why evolution has produced a conventional, symbolically mediated system of institutionalized cooperation in humans, but not in our primate relatives, is profound. Indeed, only a proximate explanation has been offered here, to the effect that social-cognitive differences between humans and chimpanzees support qualitatively different types of social interaction. This has resulted in social institutional practices in humans but not in our evolutionary cousins. Therefore, after summarizing the critical social-cognitive differences in human and chimpanzee social interaction, some speculations will be offered as to why these differences emerged in the first place. Proposals regarding the ultimate causes of inter-species divergence will be along three lines: (1) general competitive constraints on chimpanzee social-cognition and behavior, (2) the emergence of high-fidelity social learning mechanisms and group selection processes in humans, and (3) the emergence of a social egalitarian political organization in our evolutionary history. Divergence in human and chimpanzee social-cognitive abilities is already apparent, when human toddlers in their second year of life begin to engage in collective intentional action defined by joint goals and commitments (Tomasello et al. 2005). The goal structure of collective intentional action enables the emergence of joint attention (Tomasello 2009). This acts as a “coordination device”, by which children assess whether they and their partners are sharing attention to critical aspects of their environment in order to cooperate (Wyman et al. submitted). Joint attention thus seems to go some way for children in establishing the mutual expectations required for coordinating towards conventional forms of cooperative action. 
The joint goals and commitments entailed in instrumental cooperation are soon after employed in coordinating joint fictional activities in which children assign conventional and symbolic status to objects with others (Wyman et al. 2009b), and even police the norms that govern these collective fictions (Wyman et al. 2009a). The
structure of collective intentional practice thus provides an ontogenetic foundation for the development of conventional, institutional cooperation in the form of joint goals, status assignment, and normativity (Rakoczy and Tomasello 2007). Chimpanzee coordination, by contrast, seems most accurately described in terms of the accomplishment of individual, parallel goals (Tomasello et al. 2005; Warneken et al. 2006). Without the joint goal structure of collective intentional cooperation, chimpanzees do not appear to use joint attention in their coordinated activity (Bullinger et al. in prep) and, in fact, do not develop joint attention abilities at all (Tomasello and Carpenter 2005). They, consequently, do not coordinate conventionally, engage in pretend play, assign conventional status, or engage in institutionalized forms of social interaction. And there are no indications of normative awareness in chimpanzees. So, a reasonable question at this point is why chimpanzees do not form joint goals and commitments in the first place. One potential reason is that chimpanzee coordinative activity is in general too heavily constrained by competitive motives for joint cooperative goals to emerge. For instance, under certain conditions, chimpanzees apparently fail to understand visual attention in others. Firstly, they do not preferentially beg for food from a human who can see them over one who cannot [e.g., because their eyes are covered, or their back is turned: Povinelli and Eddy 1996]. Secondly, when a person who has witnessed food being hidden under one of two containers subsequently stares at that container, they fail to use this person’s gaze to locate the food for themselves (Call et al. 1998). However, under conditions of social competition, the picture is quite different: when subordinate chimpanzees are paired with dominants in competition over food, they preferentially approach the stash that their competitor has not seen hidden (Hare et al. 2000). 
Similarly, they preferentially approach food that a dominant has seen placed, if he is subsequently switched with another dominant animal (Hare et al. 2001). In competitive situations, therefore, chimpanzees seem more than able to track the different events an individual has seen, as well as which individual has seen what. Likewise, the ability of chimpanzees to understand communicative cues also appears to come under heavy competitive constraint. When food is hidden under one of two containers, despite being highly motivated to find the food, they are unable to use a clear pointing gesture in order to locate it (Tomasello et al. 1997). The reasons for this are somewhat unclear, but it is telling that when the human makes a visually similar but noncommunicative gesture toward the food (such as reaching for it in order to steal it), chimpanzees fare relatively well (Hare and Tomasello 2004). Importantly, it may not be the human’s attempt to communicate per se that the animals are unable to understand. For example, when a person makes a communicative but prohibitive sign toward the food and vocalizes in a prohibitive tone of voice, they easily infer its location and retrieve it for themselves (Herrmann and Tomasello 2006). This suggests that chimpanzees in competitive situations are able to use information about others’ goals in order to infer important information about the location of their food. However, they are unable to grasp cooperative and helpful attempts to direct their attention toward the same reward.
Most tellingly, chimpanzee coordination itself is highly constrained by competition. When faced with the challenge of pulling with a conspecific to retrieve food on a movable tray, the strongest predictors of chimpanzees’ success are the levels of tolerance they show in a separate feeding situation, and whether the food will be easily monopolizable after retrieval (Melis et al. 2006). One key reason, then, that chimpanzees do not appear to form joint goals and commitments may be that their social interactions occur within a framework of competitive motivations in which the danger of aggression is ever present, and the rewards eventually secured will be in dispute [see Hare and Tomasello (2005)]. That is, in environments pervaded by the threat of exploitation, it simply may not pay to have one’s intentions and attention read by others (Tomasello 2009). Without this framework of collective intentional action, it is then perhaps not surprising that chimpanzee cooperation is not normatively governed (Tomasello 2009). When individuals coordinate repeatedly with joint goals, joint attention, and joint commitments, mutual expectations that allow parties to predict the likely course of events in each cooperative scenario emerge. To the extent that these expectations come to be considered as legitimate (see Bicchieri 2006), jointly recognized standards of action emerge. Thus, cooperation takes on a normative dimension. Over time, these patterns of expectation may become generalized, such that new individuals assume the relevant roles and the duties these entail, despite their having been established prior to those individuals’ engagement in the activity. These generalized, agent-neutral, normatively governed roles form the basis of institutionalized forms of cooperative activity. So without collective intentional action – and the mutual expectations and commitments this entails – cooperative norms and institutions apparently fail to emerge. 
Once communities engage in institutionalized cooperation, further norms relating to social conformity may also come into play (Tomasello 2009). Social learning in the form of imitation of local practices allows youngsters in a community to bypass trial-and-error learning and benefit from the established knowledge of a community (Tomasello et al. 1993). And the signaling of group membership through conformist behavior (as well as symbolic marking) may allow individuals to identify in-group members, aiding selective imitation of their conventional wisdom as well as selective interaction with them (Boyd and Richerson 2008). In particular, if the effects of coordination failure are costly, it may pay to identify and interact with those who adhere to the same moral system. But more generally, imitation and conformist learning – in which individuals copy the most commonly observed model – may lead to the coevolution of cultural as well as genetic traits (Richerson and Boyd 2005): The idea is that conformist biases may establish enough cultural uniformity and heritable variation within groups to outweigh the diluting effects of migration between groups. This results in relatively stable group traits, such that when competition for resources or direct conflict emerges, selection may begin to operate at the group level. If cooperative cultural adaptations result in fitness advantages to some groups, those cooperative practices and their related norms will spread, as will their genetic bases. Rapid cultural or “runaway selection” (Fischer 1930) for ever-increasing levels of cooperation may
then occur resulting in the evolution of cooperative “social instincts” (Boyd and Richerson 2006). These include, among other things, expectations that life will be structured by cooperative and moral norms, and learning systems designed to internalize those norms (Erdal and Whiten 1996). Genes and culture coevolve to produce ultra-sociality, hyper-cooperativity, and normatively governed institutional practices. Cross-species differences in imitation capabilities may thus contribute to cultural divergence between chimpanzees and humans in two key ways. Firstly, the tendency of children, in contrast to chimpanzees, to copy actions rather than their results [see, for example, Call et al. (2005)] may represent a high-fidelity social learning mechanism in humans, particularly crucial for the acquisition of complex or conventional actions [that no individual may plausibly invent themselves, Tennie et al. (2009)]. The consequence appears to have been a “cultural ratcheting” process in humans. Particular skills and artifacts have been maintained cross-generationally with new modifications accumulating through time, rather than being lost and reinvented with each generation (Tomasello 1999). This process may go some way in explaining the massive discrepancy that exists in the quantity and complexity of chimpanzee and human material cultures [see Marshall-Pescini and Whiten (2008) for results in line with this]. Secondly, chimpanzee social learning mechanisms may have failed to produce the degree of cultural uniformity within groups necessary for selection processes to begin to favor cooperation at the group level. However, group-level selection for cooperation presents an inherent “free-rider” problem: Once cooperation has become routine, it pays any individual to refrain from contributing but nevertheless to enjoy the reward, thus destabilizing group cooperation altogether. 
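
The free-rider problem just described can be made concrete with a toy public goods calculation. This is a minimal sketch under stated assumptions: the function name, the endowment of 10 units, the multiplier of 2, and the flat fine on non-contributors are all hypothetical illustration values, and the cost of punishing to the punishers themselves is ignored.

```python
def public_goods_payoffs(contributes, endowment=10.0, multiplier=2.0, fine=0.0):
    """Payoffs in a one-shot public goods game.

    contributes: list of booleans, one per group member.
    Contributors put their whole endowment into a common pot; the pot is
    multiplied and shared equally among all members. Non-contributors keep
    their endowment, still receive a share, and pay `fine` (assumed to be
    imposed by the group as punishment).
    """
    n = len(contributes)
    pot = multiplier * endowment * sum(contributes)
    share = pot / n
    payoffs = []
    for contributed in contributes:
        if contributed:
            payoffs.append(share)
        else:
            payoffs.append(endowment + share - fine)
    return payoffs


if __name__ == "__main__":
    print(public_goods_payoffs([True, True, True, True]))
    print(public_goods_payoffs([True, True, True, False]))
    print(public_goods_payoffs([True, True, True, False], fine=12.0))
```

With these assumed values, a lone free-rider earns more than the payoff everyone gets under full cooperation, whereas a sufficiently large fine reverses that incentive. The sketch is only meant to show why some deterrent is needed for cooperation to remain stable.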
So key to the evolution of cooperation appears to be some punishment mechanism that penalizes and deters cheating (Boyd and Richerson 1992). Indeed, moralistic punishment may effectively stabilize group-wide cooperation, and if the form of punishment is severe enough, it may only have to be meted out rarely (Boyd and Richerson 2006). It also seems that, at least in theory, punishment can potentially stabilize any trait or norm (adaptive or otherwise), producing massive variation in the content of human conventional practices (Boyd and Richerson 1992). Despite this, however, there is striking uniformity in the social norms that appear to have stabilized modes of early human social organization. In particular, it seems that moralistic punishment of social dominance may have led to the evolution of egalitarian social structure in human evolution, similar to that seen today in small-scale, mobile foraging groups (Boehm 1999; Erdal and Whiten 1996; Knauft 1991). In these societies, the development of social leveling mechanisms in the form of unfavorable social opinion [see also, Panchanathan and Boyd (2004)], social exclusion, and direct punishment appears to have focused quite specifically on regulating the actions of individuals who try to gain physical or political dominance over others. This shows up most clearly in cross-cultural norms against physical aggression and the monopolization of sexually active females, and in food-sharing norms (Boehm 2008). And these norms seem to have resulted in modes of egalitarian organization that are critically divergent from the hierarchical and dominance-based systems that characterize chimpanzee social life (Knauft 1991). Part of the
puzzle of why chimpanzees’ social-cognitive reasoning is limited in cooperative contexts and does not involve collective intentional cooperation may be that the overarching political structure of chimpanzee social organization simply is not conducive to this. In line with this, modern day egalitarian societies also positively sanction quite specific forms of activity: cooperation, generosity, resource sharing, and aid (Boehm 2008). These behaviors are rewarded with favorable reputation, political alliances (especially in the form of marriage), increased opportunities for cooperation, and resource support in times of scarcity. In searching for the evolutionary home of collective intentionality, therefore, it seems important that the egalitarian political structures that appear to have characterized significant phases of human evolution (Knauft 1991) centrally involve mechanisms that curb social dominance by punishment and positively prescribe cooperation at the individual level. It may be that this kind of political context constituted an evolutionary precondition for the emergence of institutionalized forms of cooperation such as cooperative hunting (Hill 1982), resource sharing (Gurven 2004), and allocare (Hrdy 2009) underpinned by collective intentionality.

8 Summary and Conclusions

A comprehensive account of the character of conventional, institutionalized cooperation and the reasons for its emergence in the hominin lineage will not derive from one particular discipline of research. A full picture will require insights from evolutionary thinking in biology, anthropology, psychology, linguistics, human and primate behavioral ecology, and sociology to name but a few key areas. Broadly, the contribution that developmental psychology can offer to investigations of human-specific forms of cooperation is unique in documenting some of the cognitive prerequisites and contexts in which young children begin to engage in collective intentional activity with a conventional and “proto-institutional” structure. And comparative psychological research can serve to pinpoint cognitive divergences between humans and chimpanzees that have plausibly contributed to cultural divergence in modes of cooperation. But this psychological perspective is especially critical to our understanding of conventional, institutional, and symbolic practice because these activities are governed by rules that have no existence outside our common recognition and acceptance that they exist: their ontological status and normative force are fundamentally dependent on our collective cognitions. Collective intentional cooperation emerges in young children in their second year of life, as they begin to coordinate with others with joint goals and commitments (Tomasello et al. 2005). In these contexts, joint attention emerges in which young children not only monitor but share attention with others to aspects of their environment. Children then use joint attention to mediate these activities, indicating a concern with managing mutual expectations in their joint projects with others
(Wyman et al. submitted). Their coordination thus takes on a conventional character. It is not long before young children begin to incorporate objects into their coordinations and, together with others, to invest these with symbolic status in their fictional play (Wyman et al. 2009b). In these situations, their social interactions begin to resemble adult institutional practice in rudimentary form, involving status functions assigned by constitutive rules and social norms (Wyman et al. 2009a). In contrast to Piaget (1932) who classified young children’s games as either symbolic or rule-governed, Vygotsky (1978) perceptively recognized the rule-governed basis of social pretense: A key observation was that “the development from games with an overt imaginary situation and covert rules, to games with overt rules and a covert imaginary situation outlines the evolution of children’s play from one pole to the other” (p. 96). But this transition within the domain of young children’s play may more broadly describe the general process by which children are enculturated into the social practices of their communities. Children indeed start out engaging in collective imaginings with others in their play, and these activities are governed largely by unarticulated norms that emanate from the imposition of pretend status via constitutive rules. But they must later come to grasp the more serious and widely recognized constitutive rules that define institutional practices such as marriage and exchange. This eventually entails taking part in the prescribed imaginings (Walton 1990), or “collective fictions” of their community, and consequently following normatively governed courses of action. The development from engagement in practices with overt imaginary content and covert rules to those with overt rules but covert – or less obvious – imaginary content describes children’s progressive admission into conventional and institutional life. 
That chimpanzees do not engage in social pretense may be symptomatic of, and simultaneously contribute to, an absence of institutional cooperation in their species. Without the framework of collective intentional action involving joint goals, commitments, and joint attention, there may be no cooperative foundation to support the assignment of conventional, symbolic status and rules of conduct either in play or in their more serious affairs. But without pretend play, there is no “developmental cradle”, no proto-institutional activity in which chimpanzees can get an initial grip on the underlying structures of institutionalized cooperation. However, disparities between children’s and chimpanzees’ propensities to form collective intentions only make sense against a broader background of species divergence in relative levels of competition and cooperation. Across several domains (namely, understanding visual attention, nonverbal communication, and coordination) chimpanzee social-cognition appears to excel in competitive contexts, and to be constrained in analogous but cooperative situations. This implies that chimpanzee social interaction in general may occur in contexts of competitive motivation. Against the potential threat of competitive exploitation, it may not pay chimpanzees to, for example, inform others about valuable resources in the environment, establish shared attention to those resources, or to commit to joint action in order to retrieve them. But since no other ape engages in institutionalized forms of cooperation, this competitive model may represent the phylogenetically primitive state that characterized the common ancestor to humans and chimpanzees. Therefore,
this simply raises further questions as to how it came to be that cooperative or “trusting” motivations ever emerged in the hominin lineage. Both group selection theories (Richerson and Boyd 2005) and antidominance theories (Boehm 1999; Erdal and Whiten 1996) posit the emergence of moralistic punishment as critical to the emergence of cooperation in humans. However, group selection theories emphasize the function of punishment as an evolutionary stabilizing mechanism, rather than the content of what it stabilizes [see Boyd and Richerson (1992)]. Antidominance theories, by contrast, suggest more specifically that the initial evolutionary function of punishment was to police members of early hominin communities who aggressed others in acts of social dominance. By these accounts, the original social norms to emerge in evolution were those effecting sociopolitical egalitarianism, enforced by social subordinates with fitness interests in abolishing hierarchical social order (Knauft 1991). Such a context may have provided some respite from the threat of aggression and competition that appears to constrain chimpanzee social interaction, and a concomitant elaboration and variation of existent forms of cooperative activity. If existing advantages accrued to especially effective cooperators [perhaps initially through mutualistic gain, see Roberts (2005)], selection may have come to favor those who not only coordinated their actions behaviorally with others, but coordinated their expectations through the mutual monitoring of attention. While these may seem like rather basic building blocks, coordinated actions based on mutual expectations and attention monitoring hold the seeds of collective intentionality. As cooperation with these characteristics becomes routine, expectations coordinated via mutual attention monitoring may come to be recognized as legitimate by the parties involved. 
This results in a “bottom-up” form of normativity (in contrast to the “top-down” community norms specifying that individuals cooperate), whereby they not only coordinate toward goals but also recognize mutually binding commitments to those goals. The deontic obligations and rights now inherent to joint activity come to define specific cooperative roles that persist through time. And, also by collective intention, both people and objects may be assigned symbolic status in public representations of these rights and obligations. In this way, the evolutionary emergence of collective intentionality may have given rise to conventional and institutionalized forms of cooperation in the human lineage.

References

Abelev M, Markman E (2006) Young children’s understanding of multiple object identity: appearance, pretense and function. Dev Sci 9:6
Bicchieri C (2006) The grammar of society: the nature and dynamics of social norms. Cambridge University Press, Cambridge
Biro D, Matsuzawa T (2001) Use of numerical symbols by the chimpanzee (Pan troglodytes): cardinals, ordinals, and the introduction of zero. Anim Cogn 4(3–4):193–199
Boehm C (1999) Hierarchy in the forest: the evolution of egalitarian behavior. Harvard University Press, Cambridge, MA
Boehm C (2008) Purposive social selection and the evolution of human altruism. Cross Cult Res: J Comp Soc Sci 42(4):319–352
Boesch C, Boesch H (1989) Hunting behavior of wild chimpanzees in the Taï National Park, Ivory Coast. Am J Phys Anthropol 78(4):547–573
Boesch C, Boesch H (1990) Tool use and tool making in wild chimpanzees. Folia Primatol 54(1–2):86–99
Boyd R, Richerson PJ (1992) Punishment allows the evolution of cooperation (or anything else) in sizable groups. Ethol Sociobiol 13(3):171–195
Boyd R, Richerson PJ (2006) Culture and the evolution of the human social instincts. In: Enfield NJ, Levinson SC (eds) Roots of human sociality: culture, cognition, and interaction. Berg Publishers, Oxford, pp 453–477
Boyd R, Richerson PJ (2008) Gene-culture coevolution and the evolution of human social institutions. In: Engel C, Singer W (eds) Better than consciousness? Decision making, the human mind and implications for institutions. MIT, Cambridge
Bräuer J, Call J, Tomasello M (2006) Are apes really inequity averse? Proc R Soc Lond B Biol Sci 273(1605):3123–3128
Brink I (2001) Attention and the evolution of communication. Pragmat Cogn 9(2):259–277
Bruell MJ, Woolley J (1998) Young children’s understanding of diversity in pretence. Cogn Dev 13:257–277
Bruner J (1983) Child’s talk: learning to use language. Norton, New York
Bullinger A, Wyman E, Melis A, Tomasello M (in press) Chimpanzees, coordination in a ‘stag hunt’ game. International Journal of Primatology
Call J, Carpenter M, Tomasello M (2005) Copying results and copying actions in the process of social learning: chimpanzees (Pan troglodytes) and human children (Homo sapiens). Anim Cogn 8(3):151–163
Call J, Hare BA, Tomasello M (1998) Chimpanzee gaze following in an object-choice task. Anim Cogn 1(2):89–99
Call J, Tomasello M (2007) The gestural communication of apes and monkeys. Lawrence Erlbaum Associates, New York
Call J, Tomasello M (2008) Does the chimpanzee have a theory of mind? 30 years later. Trends Cogn Sci 12(5):187–192
Campbell J (2005) Joint attention and common knowledge. In: Eilan N, Hoerl C, McCormack T, Roessler J (eds) Joint attention, communication and other minds: issues in philosophy and psychology. Clarendon, New York, pp 287–297
Carpenter M, Nagell K, Tomasello M (1998) Social cognition, joint attention, and communicative competence from 9 to 15 months of age. Monogr Soc Res Child Dev 63(4):1–143
Castoriadis C (1998) The imaginary institution of society. MIT, Cambridge
Clark H, Marshall CR (1981) Definite reference and mutual knowledge. In: Joshi AK, Webber B, Sag I (eds) Elements of discourse understanding. Cambridge University Press, Cambridge, pp 10–63
Deloache J, Pierroutsakos S, Uttal D (2003) The origins of pictorial competence. Curr Dir Psychol Sci 19(3):114–118
Erdal D, Whiten A (1996) Egalitarianism and Machiavellian intelligence in human evolution. In: Mellars P, Gibson KR (eds) Modelling the early human mind. McDonald Institute Monographs, Cambridge, pp 139–150
Fehr E, Bernhard H, Rockenbach B (2008) Egalitarianism in young children. Nature 454(7208):1079–1083
Fischer R (1930) The genetical theory of natural selection. Clarendon, Oxford
Flavell J, Flavell E, Green F (1987) Young children’s knowledge about the apparent-real and pretend-real distinctions. Dev Psychol 23(6):816–822
Fouts RS (1972) Use of guidance in teaching sign language to a chimpanzee (Pan troglodytes). Q J Exp Psychol B 80(3):515–522
Gardner RA, Gardner BT (1969) Teaching sign language to a chimpanzee: a standardized system of gestures provides a means of 2 way communication with a chimpanzee. Science 165(3894):664–672
Gibson E, Rader N (1979) Attention: the perceiver as performer. In: Hale G, Lewis M (eds) Attention and cognitive development. Plenum, New York, pp 6–36
Gilbert M (1989) On social facts. Princeton University Press, Oxford
Gilby IC, Eberly LE, Wrangham RW (2008) Economic profitability of social predation among wild chimpanzees: individual variation promotes cooperation. Anim Behav 75(2):351–360
Gómez J-C (2007) Pointing behaviors in apes and human infants: a balanced interpretation. Child Dev 78(3):729–734
Gómez J-C, Martín-Andrade B (2005) Fantasy play in apes. In: Pellegrini AD, Smith PK (eds) The nature of play: great apes and humans. Guilford, New York, pp 139–172
Gómez JC, Martín-Andrade B (2002) Possible precursors of pretend play in nonpretend actions of captive gorillas (Gorilla gorilla). In: Mitchell RW (ed) Pretending and imagination in animals and children. Cambridge University Press, Cambridge, pp 255–268
Gopnik A, Slaughter V (1991) Young children’s understanding of changes in their mental states. Child Dev 62(1):98–110
Gräfenhain M, Behne T, Carpenter M, Tomasello M (2009) Young children’s understanding of joint commitments. Dev Psychol 45(5):1430–1443
Greenberg J, Hamann K, Warneken F, Tomasello M (in press) Chimpanzee helping in collaborative and non-collaborative contexts. Anim Behav
Greenfield PM, Savage-Rumbaugh ES (1990) Grammatical combination in Pan paniscus: process of learning and invention in the evolution and development of language. In: Parker ST, Gibson KR (eds) “Language” and intelligence in monkeys and apes: comparative developmental perspectives. Cambridge University Press, Cambridge, UK, pp 540–578
Gurven M (2004) To give and to give not: the behavioral ecology of human food transfers. Behav Brain Sci 27(4):543–559
Gummerum M, Keller M, Takezawa M, Jutta M (2008) To give or not to give: children’s and adolescents’ sharing and moral negotiations in economic decision situations. Child Dev 79(3):562–576
Haight W, Millar P (1992) The development of everyday pretend: a longitudinal study of mothers’ participation. Merrill Palmer Q 38(3):331–349
Hamann K, Warneken F, Tomasello M (in press) Children’s developing commitments to joint goals. Child Dev
Hare B, Call J, Agnetta B, Tomasello M (2000) Chimpanzees know what conspecifics do and do not see. Anim Behav 59(4):771–785
Hare B, Call J, Tomasello M (2001) Do chimpanzees know what conspecifics know? Anim Behav 61(1):139–151
Hare B, Tomasello M (2004) Chimpanzees are more skilful in competitive than in cooperative cognitive tasks. Anim Behav 68(3):571–581
Hare B, Tomasello M (2005) The emotional reactivity hypothesis and cognitive evolution. Trends Cogn Sci 9(10):464–465
Harris P, Kavanaugh R (1993) Young children’s understanding of pretense. Monogr Soc Res Child Dev 58(1):1–92
Harris P, Nunez M (1996) Understanding of permission rules by preschool children. Child Dev 67(4):1572–1591
Harris P, Nunez M, Brett C (2001) Let’s swap: early understanding of social exchange by British and Nepali children. Mem Cognit 29(5):757–764
Herrmann E, Tomasello M (2006) Apes’ and children’s understanding of cooperative and competitive motives in a communicative situation. Dev Sci 9(5):518–529

154

E. Wyman and H. Rakoczy

Hickling AK, Wellman HM, Gottfried GM (1997) Preschoolers’ understanding of others’ mental attitudes towards pretend happenings. Br J Dev Psychol 15(3):339–354 Hill K (1982) Hunting and human evolution. J Hum Evol 11(6):521–544 Hill K, Barton M, Hurtado AM (2009) The emergence of human uniqueness: characters underlying behavioral modernity. Evol Anthropol Issues News Rev 18(5):187–200 Hrdy S (2009) Mothers and others: the evolutionary origins of mutual understanding. Belknap, Cambridge, MA Huffman MA, Hirata S (2004) An experimental study of leaf swallowing in captive chimpanzees: insights into the origin of a self-medicative behavior and the role of social learning. Primates 45(2):113–118 Humle T, Matsuzawa T (2002) Ant-dipping among the chimpanzees of bossou, guinea, and some comparisons with other sites. Am J Primatol 58(3):133–148 Jensen K, Call J, Tomasello M (2007) Chimpanzees are rational maximizers in an ultimatum game. Science 318(5847):107–109 Jensen K, Hare B, Call J, Tomasello M (2006) What’s in it for me? Self regard precludes altruism and spite in chimpanzees. Proc R Soc B 273:1013–1021 Kalish C (1998) Reasons and causes: children’s understanding of conformity to social and physical laws. Child Dev 69(3):706–720 Kalish C, Shiverick SM (2004) Children’s reasoning about norms and traits as motives for behaviour. Cogn Dev 19:410–416 Knauft BM (1991) Violence and sociality in human evolution. Curr Anthropol 32(4):391–428 Lewis D (1969) Convention: a philosophical study. Harvard University Press, Cambridge Lillard AS (1993) Young children’s conceptualization of pretense: action or mental representational state? Child Dev 64(2):372–386 Liszkowski U (2005) Human twelve-month-olds point cooperatively to share interest with and provide information for a communicative partner. Gesture 5(1/2):135–154 Liszkowski U, Carpenter M, Henning A, Striano T, Tomasello M (2004) Twelve-month-olds point to share attention and interest. 
Dev Sci 7(3):297–307 Liszkowski U, Carpenter M, Striano T, Tomasello M (2006) 12- and 18-month-olds point to provide information for others. J Cogn Dev 7(2):173–187 Lyn H, Greenfield P, Savage-Rumbaugh S (2006) The development of representational play in chimpanzees and bonobos: evolutionary implications, pretense, and the role of interspecies communication. Cogn Dev 21(3):199–213 Marshall-Pescini S, Whiten A (2008) Chimpanzees (Pan troglodytes) and the question of cumulative culture: an experimental approach. Anim Cogn 11:449–456 Melis AP, Hare B, Tomasello M (2006) Engineering cooperation in chimpanzees: tolerance constraints on cooperation. Anim Behav 72(2):275–286 Mitani JCC, Watts DP (2005) Correlates of territorial boundary patrol behaviour in wild chimpanzees. Anim Behav 70(5):1079–1086 Panchanathan K, Boyd R (2004) Indirect reciprocity can stabilize cooperation without the secondorder free rider problem. Nature 432(7016):499–502 Peacocke C (2005) Joint attention: its nature, reflexivity, and relation to common knowledge. In: Eilan N, Hoerl C, McCormack T, Roessler J (eds) Joint attention, communication and other minds: issues in philosophy and psychology. Clarendon/Oxford University Press, New York, NY, pp 298–324 Piaget J (1932) The moral judgment of the child. Keegan Paul, London Pika S, Liebal K, Call J, Tomasello M (2005) The gestural communication of apes. Gesture 5(1–2):41–56 Plotkin H (2003) The imagined world made real: towards a natural science of culture. Rutgers University Press, New Jersey Povinelli DJ, Eddy TJ (1996) What young chimpanzees know about seeing. Monogr Soc Res Child Dev 61(3):1–152

Social Conventions, Institutions, and Human Uniqueness

155

Rakoczy H (2006) Pretend play and the development of collective intentionality. Cogn Syst Res 7:113–127 Rakoczy H (2007) Play, games and the development of collective intentionality. New Directions in Child and Adolescent Development (Special issue on “Conventionality”) 115:53–68 Rakoczy H (2008) Taking fiction seriously: young children understand the normative structure of joint pretend games. Dev Psychol 44(4):1195–1201 Rakoczy H, Tomasello M (2006) Two-year-olds grasp the intentional structure of pretense acts. Dev Sci 9(6):558–565 Rakoczy H, Tomasello M (2007) The ontogeny of social ontology: steps to shared intentionality and status functions. In: Tsohatzidis SL (ed) Intentional acts and institutional facts: essays on john searle’s social ontology. Springer, Berlin Rakoczy H, Tomasello M, Striano T (2004) Young children know that trying is not pretending: a test of the “Behaving-as-if” construal of children’s early concept of pretense. Dev Psychol 40 (3):388–399 Rakoczy H, Tomasello M, Striano T (2005a) On tools and toys: how children learn to act on and pretend with ‘virgin objects’. Dev Sci 8(1):57–73 Rakoczy H, Tomasello M, Striano T (eds) (2005b) How children turn objects into symbols: a cultural learning account. Erlbaum, New York Rakoczy H, Warneken F, Tomasello M (2008) The sources of normativity: young children’s awareness of the normative structure of games. Dev Psychol 44(3):875–881 Rawls J (1955) Two concepts of rules. Philos Rev 64(1):3–32 Richerson PJ, Boyd R (2005) Not by genes alone: how culture transformed human evolution. University of Chicago Press, Chicago Rivas E (2005) Recent use of signs by chimpanzees (pan troglodytes) in interactions with humans. [Original]. J Comp Psychol 119(4):404–417 Roberts G (2005) Cooperation through interdependence. Anim Behav 70:901–908 Rousseau J (1968/1762) The social contract. 
Penguin, London Savage-Rumbaugh ES, McDonald K, Sevcik RA, Hopkins WD, Rubert E (1986) Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (pan paniscus). J Exp Psychol Gen 115(3):211–235 Searle J (2005) What is an institution? J Inst Econ 1(1):1–22 Searle JR (1995) The construction of social reality. Free, New York Silk JB, Brosnan SF, Vonk J, Henrich J, Povinelli DJ, Richardson AS et al (2005) Chimpanzees are indifferent to the welfare of unrelated group members. Nature 437(7063):1357–1359 Skyrms B (2004) The stag hunt and the evolution of social structure. Cambridge University Press, Cambridge Sutter Z, Matthias Z (2007) Outcomes versus intentions: on the nature of fair behavior and its development with age. Ergebnisse versus absichten: Zur natur fairen verhaltens und seine entwicklung mit zunehmendem alter. J Econ Psychol 28(1):69–78 Takezawa M, Gummerum M, Keller M (2006) A stage for the rational tail of the emotional dog: roles of moral reasoning in group decision making. Eine buehne fuer den rationalen schwanz des emotionalen hundes: Rollen der moralischen argumentation bei der entscheidungsfindung in gruppen. J Econ Psychol 27(1):117–139 Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: on the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364(1528):2405–2415 Tollefson D (2005) Let’s pretend! Children and joint action. Philos Soc Sci 35(1):75–97 Tomasello M (1995) Joint attention as social cognition. In: Moore C, Dunham P (eds) Joint attention: its origin and role in development. Erlbaum, Hillsdale, NJ, pp 103–130 Tomasello M (1999) The cultural origins of human cognition. Harvard University Press, Cambridge, MA Tomasello M (2009) Why we cooperate. MIT, Cambridge, MA Tomasello M, Call J, Gluckman A (1997) Comprehension of novel communicative signs by apes and human children. Child Dev 68(6):1067–1080

156

E. Wyman and H. Rakoczy

Tomasello M, Carpenter M (2005) The emergence of social cognition in three young chimpanzees. Monogr Soc Res Child Dev 70(1):1–132 Tomasello M, Carpenter M, Call J, Behne T, Moll H (2005) Understanding and sharing intentions: the origins of cultural cognition. [Original]. Behav Brain Sci 28(5):675–735 Tomasello M, Kruger AC, Ratner HH (1993) Cultural learning. Behav Brain Sci 16(3):495–511 Vygotsky LS (1978) Mind in society: the development of higher psychological processes. Harvard University Press, Cambridge, MA Walton K (1990) Mimesis as make-believe: on the foundation of the representational arts. Harvard University Press, Harvard Warneken F, Chen F, Tomasello M (2006) Cooperative activities in young children and chimpanzees. Child Dev 77(3):640–663 Watts DP (1998) Coalitionary mate guarding by male chimpanzees at ngogo, kibale national park, uganda. Behav Ecol Sociobiol 44(1):43–55 Watts DP, Mitani JCC (2002) Hunting behavior of chimpanzees at ngogo, kibale national park, uganda. Int J Primatol 23(1):1–28 Whiten A, Goodall J, McGrew WC, Nishida T, Reynolds V, Sugiyama Y et al (1999) Cultures in chimpanzees. Nature 399(6737):682–685 Whiten A, Horner V, de Waal FBM (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437(7059):737–740 Wyman E, Rakoczy H, Tomasello M (2009a) Normativity and context in young children’s pretend play. Cogn Dev 24:149–155 Wyman E, Rakoczy H, Tomasello M (2009b) Young children understand multiple pretend identities in their object play. Br J Dev Psychol 27(2):385–404 Wyman E, Rakoczy H, Tomasello M (submitted). Joint attention enables children’s coordination in a ‘stag hunt’ game

The Continuity of Evolution and the Special Character of Humans: Concluding Overview

Michael Forster and Wolfgang Welsch

M. Forster (*) Department of Philosophy, University of Chicago, Stuart Hall 203, Chicago, IL, USA e-mail: [email protected]
W. Welsch Institute of Philosophy, Friedrich-Schiller-University Jena, Zwätzengasse 9, 07743 Jena, Germany e-mail: [email protected]

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1_7, © Springer-Verlag Berlin Heidelberg 2011

The general approach to understanding human beings that is exemplified in these chapters goes back to Charles Darwin, and it may therefore be helpful to remind readers of the main lines of Darwin’s position. In his seminal work On the Origin of Species (1859), Darwin developed a thoroughly nonreligious, naturalistic theory of organic nature, which incorporated key general principles: that species evolve from one another, that they do so as parts of a unified tree of life, and that the central mechanism of evolution is natural selection. However, he left the most controversial aspect of his account implicit rather than explicit: its inclusion of human beings. In contrast, in The Descent of Man (1871), he explicitly brought human beings within the compass of the same general principles. Accordingly, the opening chapters of the book make a powerful systematic case for human beings’ continuity with the rest of animal nature. The case comprises several parts. First, Darwin devotes a chapter to arguing that human beings are physiologically continuous with other animals: many features of the human body are homologous with corresponding features in other species (e.g., the human hand with the horse’s foot and the seal’s flipper). This saliently includes the human brain, which Darwin indeed argues is extremely similar in structure to that of the apes. Other features of the human body are rudiments of features presumably once possessed by our animal ancestors and still possessed by other animals (e.g., the modicum of hair on our bodies, and our os coccyx). In addition, Darwin emphasizes that human embryos are very similar in their features to the embryos of other mammals. Second, Darwin argues that human beings are also continuous with other animals in their mental features. These differ only in degree rather than in kind; in particular, there is no fundamental difference in mental features between human
beings and higher mammals. This part of Darwin’s case has more vitality for us than the physiological part. For it is both more detailed and more controversial. In particular, it contradicts not only the mechanistic view of animals that was held by the seventeenth-century Cartesians, but also the (methodological) behaviorism that dominated biology during the first half of the twentieth century (until animal mentality began to be taken seriously again by cognitive ethologists such as Konrad Lorenz and Donald Griffin). This part of Darwin’s case has two distinguishable sides. He begins by considering a series of mental features whose commonness to human beings and other animals he considers to be relatively obvious and uncontroversial, showing with specific and often striking examples that the features in question are indeed held in common between human beings and other animals. The main features he considers here are instincts, emotions, curiosity, imitation, attention, memory, imagination (including dreaming), and reasoning (including planning). In connection with the last of these features, he notes, for example, that hunting dogs faced with practical dilemmas while retrieving birds sometimes invent intelligent solutions that transcend and indeed contradict their training. Next, he goes on to consider a series of mental features whose ascription to nonhuman animals seems to him more controversial, features that have indeed been alleged by some people to be uniquely human. He again shows that in fact the features in question are present in some other animals, or at least very closely approximated by them. 
These features include progressive improvement (he holds that this occurs in some animals, while also conceding that it is much enhanced in human beings because of their preservation of information through language); the use of tools (here he adduces several positive examples, among them chimpanzees’ employment of rocks to crack nuts); abstraction and the use of general concepts (he notes ingeniously that a dog perceiving another dog or a human at a distance will often initially react with hostility but then change attitude upon recognizing it/him as a friend, thereby showing that it was initially entertaining a general concept); self-consciousness (he is at least open to the idea that some animals have a form of self-consciousness); language (here he notes several examples of language use by animals, including the alarm cries of certain species of monkeys and the intelligent use of language by certain birds, especially parrots); a sense of beauty (here he emphasizes that a sense for visible and audible beauty plays a central role in the sex lives of animals); and even a belief in God and engagement in religion (in this connection, he argues that if one conceives a belief in God in a minimal enough way to cover all human beings, and so merely as a conviction in unseen spiritual agencies, then some animals seem to share such a conviction, e.g., dogs; and that the attitude of religious devotion is likewise at least closely approximated by some animals, e.g., by dogs in their reverence for their masters). Finally, Darwin considers another feature of human mental life, which he takes to be the most impressive example of a feature that seems to be uniquely human, namely, the moral sense. He does not quite deny that this feature is uniquely human. But he argues that it is merely a special development of other features that are not unique to humans, in particular our character as social animals and our intelligence.

To be more exact, he argues that any creature endowed with a social character, if also endowed with the level of intelligence that we humans have, would develop a moral sense. He also notes that there are at least close approximations to our moral sense in some animals – for example, that dogs seem to have something like a conscience, and that animals sometimes seem to have a rudimentary conception of property. In making this whole case concerning mental life, Darwin strikingly anticipated, and would receive strong confirmation from, more recent work in cognitive ethology. For example, his illustrations of animal reasoning have been richly amplified in the work of Donald Griffin. His comments on progressive improvement in some animals find confirmation in Christophe Boesch’s recent work on chimpanzee culture. So too do his remarks on animals’ tool use, in particular his example of chimpanzees using rocks to crack nuts. Likewise, his openness to the idea that some animals possess a form of self-consciousness has recently been validated by Gordon Gallup’s ingenious mirror tests with animals. His acceptance of the idea of animal language has received confirmation as well – for example, his focus on monkeys’ alarm cries has been continued and enriched by Dorothy Cheney and Robert Seyfarth in their research on the differentiated alarm cries of vervet monkeys, and his insistence that some bird language, especially that of parrots, is intelligent has been confirmed by Irene Pepperberg in her work on the African Grey parrot. Finally, his suggestion that our moral sense is approximated by some animals has been extended and confirmed by Frans de Waal’s work on chimpanzees and other primates. The chapters in this volume are all written broadly within the framework of Darwin’s approach. 
But each of them explores some specific feature of human mental life in detail in the hope of throwing new light on its nature and contributing toward an answer to the question whether or not it is uniquely human.

The chapter by Peter Uhlhaas and Wolf Singer, “Psychosis as Evolutionary Costs for Complexity and Cognitive Abilities in Humans”, argues that the cortical circuits responsible for complex cognition and behavior in humans have remained largely unchanged over the course of evolution; hence humans’ complex cognition is the product of the same basic cortical algorithms as were operative in earlier phases of evolution. The cognitive distinctiveness of humans derives mainly from relatively subtle physiological differences, in particular the size and architecture of the neocortex. Evolution has thus worked in a conservative, gradual way – which helps to explain why some nonhuman animals already possess rudimentary versions of such sophisticated human cognitive equipment as theory of mind and language. One possible explanation of the relatively elaborate development of the neocortex in humans lies in human social life and its cognitive demands, for there is a robust correlation across species between size of social group and brain size. Human brains emphasize information processing and exchange between cortical areas more than processing sensory inputs (roughly 90% of the connections in human brains have the former function). This corresponds to the fact that in human beings many steps intervene between the body’s registering of sensory inputs and resulting cognition/behavior. This complex mediation makes possible phenomena
such as flexibility, different interpretations, multiperspectivity, and pretend play in human cognition/behavior. But the same distinctive feature of human brains also has certain maladaptive consequences, including schizophrenia, for the human brain’s emphasis on internal computations rather than sensory inputs easily leads to pathologies such as psychosis (sensory hallucinations and delusions) and excessive multiperspectivity.

Rainer Mausfeld’s chapter, “Intrinsic Multiperspectivity: Conceptual Forms, and the Functional Architecture of the Perceptual System”, is concerned with multiperspectivity, especially as it relates to perception. This phenomenon is pervasive in humans, for example, in our pretend play and in our double way of relating to a theater performance, a film, or a painting. Mausfeld begins by characterizing and criticizing certain common-sense intuitions about perception and a closely related standard model of perception that is widespread in psychology. The common-sense intuitions are (1) naïve realism, or a conviction in the basic veracity of our perceptions (this is refuted by physics); (2) an assumption that veridical perception is biologically adaptive (on the contrary, adaptiveness does not require veracity, since mere structural faithfulness would suffice for adaptiveness; moreover, the regularities to which organisms might be structurally faithful in their perceptions are infinitely various); and (3) an experience of our minds as unities and as directly accessing perceived reality (on the contrary, the physiology of our perceptual mechanisms shows our minds to be multiplicities and to function in indirect ways). The standard model of perception that derives from these common-sense intuitions, especially from naïve realism, imputes to us sensory inputs that are already endowed with a thin conceptual structure (e.g., as “surfaces”), from which we then build up more complex concepts by processes such as induction or association. 
This model is mistaken both because it projects conceptual structure from the output onto the sensory inputs, and because it cannot in fact explain the leap from such inputs to more complex conceptualization. An alternative model is therefore needed. The model that is required to perform the task is similar to Noam Chomsky’s model for explaining human linguistic competence. In order to explain human perception, we must attribute to the human perceptual system a rich biologically given conceptual structure – including such concepts as “surface”, “physical object”, “intentional object”, and “self” – which is merely activated by sensory inputs (and which then forms the basis for higher order cognitive systems). The biologically given conceptual forms in question constitute a hierarchical system, built up in a modular fashion over the course of evolution. They are flexible in character, permitting closer specification in light of feedback from sensory inputs. This distinctive character of the human perceptual system brings with it an inevitable multiperspectivity that is intrinsic to the human perceptual system itself. Examples of this intrinsic multiperspectivity are the double way in which we can perceive paintings either as two-dimensional surfaces or as representations of three-dimensional phenomena; and the Heider–Simmel film, whose shapes in motion we can perceive either simply as such or as agents acting.

The Heider–Simmel example also illustrates how the different conceptual frameworks between which we can shift in perception are largely cued by different features of the sensory input: in this case, geometric form (which mainly cues the interpretation of the figures simply as shapes in motion) versus motion (which mainly cues the interpretation of them as agents acting). The example also illustrates how perceptual salience may have nothing to do with “reality”, since a conceptualization of the film’s contents as agents acting is most salient but least “real”. It is perhaps worth distinguishing between two aspects of Mausfeld’s chapter, the first of which could in principle be detached from the second, and might be retained even if the second were rejected. First, the chapter makes a case that human perception intrinsically involves active conceptualizations of various sorts and hence multiperspectivity. Second, the chapter also claims that the conceptual forms involved are biologically given, or innate (not derived either from sensory inputs or from culture), and moreover that this sort of perception is uniquely human (not shared with other primates).

Christian Spahn’s chapter, “Prospects of Objective Knowledge”, argues that behind the tendency to deny realism that has dominated analytic philosophy since the advent of logical positivism, there are dubious Cartesian dualist and antievolutionary assumptions at work. Carnap was an early representative of the analytic tradition in question, with his insistence, in the light of the verification principle, that questions of existence only make sense relative to a conceptual scheme (only as “internal” questions, not as “external” ones). Quine’s position was then similar, his main innovation lying in a multiplication of the conceptual schemes in question. Both positions assumed a dualism of conceptual scheme versus sense data, or world. 
Davidson then made a salutary break with this tradition, rejecting not only Quine’s notion of multiple conceptual schemes but also scheme-content dualism. However, neither Davidson’s argument nor his positive position was sufficiently clear. Putnam then resumed this tradition’s predominant trend toward antirealism, or idealism (which he called “internal realism”). His central argument was that our knowledge of reality is unavoidably our knowledge of reality. But this was at bottom merely a tautology, and one that could in fact better be used to support realism. For the insight that we cannot transcend our concepts only leads to antirealism if one implicitly makes a dualist assumption that our concepts are not reality-containing, that reality is extraconceptual. And there are two good arguments against such a dualist assumption: first, that it is incoherent (since, for one thing, it itself makes a claim about how things really are), and second, that it is evolutionarily implausible. Spahn argues that a better alternative to this tradition lies in taking an evolutionary view of human cognition that situates it within nature (rather than over against nature). Such an evolutionary approach can seem to lead to radically divergent epistemological conclusions: a realism (e.g., Konrad Lorenz), or a constructivism that emphasizes the species-relativity of cognition (e.g., the early Humberto Maturana). However, it is not clear that either sort of conclusion really follows. Undermining the validity of the inference to realism are the facts, first, that orthodox Darwinism allows for the occurrence of nonadaptive traits; and second, that even if cognition
were adaptive that would not entail that it was true (on this point Spahn’s chapter agrees with Mausfeld’s). On the other hand, it is also a mistake to assume that the sensory inputs on which human cognition is based must be closer to reality than the cognition to which they give rise and that the internal constraints involved in the processing of those inputs inevitably remove the resulting cognition further from reality. Rather, the very opposite might be the case. In the end, any attempt to use evolutionary theory either as an argument for realism or as an argument against it comes to grief: the former will be viciously circular and the latter self-defeating. Although evolutionary theory cannot answer the question whether realism is true, it can answer the question: if realism is true, then how so? Indeed, the evolutionary epistemologies of Konrad Lorenz, Ruth Millikan, Evan Thompson, Merlin Donald, Francisco Varela, Humberto Maturana, Joëlle Proust, and others already furnish the main lines of an answer to that question: Sensory representations are fundamental to cognition. But one can speak of cognition proper only when there is in addition some sort of process of adjusting those representations in the interest of overall coherence (Proust). A further evolutionary step is the involvement of such processes as habit and reinforcement in cognition. At a still higher evolutionary stage voluntary control of cognitive efforts emerges. That leads to internal distinctions between sense and understanding, and between appearance and reality. Then at an even higher evolutionary stage internal simulation, language, and multiperspectivity (as well as cultural institutions that exploit it) supervene. One especially important idea common to such evolutionary epistemologies is that in order for real knowledge to arise some sort of break with mere sensory input must occur.

Ricarda I. 
Schubotz’s chapter, “Long-term Planning and Prediction: Visiting a Construction Site in the Human Brain”, approaches the question whether long-term planning and prediction are distinctive of human beings. If so, and if in addition functions of planning and prediction are organized in the frontal lobes of the brain in the manner of a gradation, then one would expect the parts of this gradation that are essential to long-term planning and prediction to have evolved latest phylogenetically and to be the latest to mature in individual humans ontogenetically. The frontopolar cortex (area 10) is anatomically distinctive in humans, both in terms of its volume and in terms of the degree of its connectivity. Functional analysis and neuroanatomy together suggest roughly the following picture of a gradation in the frontal lobes of the human brain: The premotor cortex performs cognitive functions that are preparatory to behavior. The lateral prefrontal cortex represents goals and results pursued, doing so in a conceptual fashion. Finally, the frontopolar cortex is responsible for more abstract forms of control such as multitasking and higher-level planning; is heavily involved in processes such as episodic and prospective memory, together with theory of mind; and more generally, integrates multiple cognitive operations (or propositions). Two hypotheses suggest themselves: (1) The frontopolar cortex furnishes the constraints based on long-term memory that go into planning and prediction. Just as this area of the brain develops late ontogenetically, so do episodic memory and mental time travel. Phylogenetically too it is not clear if nonhuman animals are
capable of episodic memory or mental time travel (though this is a controversial question). (2) The enhanced connectivity in humans between parts of the frontopolar cortex which are responsible for long-term memory and goal-directed thought and action ensures that only humans can engage in long-term plans and predictions (though here again, caution is in order, since area 10 does exist in nonhuman animals as well, and has hardly been investigated in such cases). The chapter concludes with some experimental results concerning the functions of the three parts of the frontal lobes already discussed. Experiment 1 showed the premotor cortex to be active in the implementation of simple rules, but the frontopolar cortex to be essential for integrating multiple or complex relations. Experiment 2 reached a somewhat similar result. Experiment 3 showed that the premotor cortex is relevant to short-term prediction (which is therefore not the exclusive preserve of the lateral prefrontal cortex and the frontopolar cortex). Experiment 4 investigated subjects making comparisons between similar tokens versus comparisons between similar types, and showed that the latter comparisons distinctively require the involvement not only of the premotor cortex but also of the frontopolar cortex. Experiment 5 showed that involvement of the frontopolar cortex is also required for reasoning about relations (a special case of this being social knowledge). Overall, the experiments supported the hypothesis of a functional gradation in the frontal lobes – specifically, a functional trend proceeding from the premotor cortex to the frontopolar cortex toward an integration of rules, relations, multiple steps, and types that is relevant to complex planning and predicting (though the premotor cortex remains involved throughout as well). This is not to say that the frontal lobes are exclusively devoted to planning and prediction, however. 
Rather, their primary function seems to be one of relating, which includes planning and prediction as special cases. The chapter by Elisabeth Scheiner and Julia Fischer, “Emotional Expression – the Evolutionary Heritage in the Human Voice”, opens by noting Darwin’s seminal work on the expression of emotions in human beings and nonhuman animals in The Expression of the Emotions in Man and Animals (1872), and his position there that facial expressions of emotions in humans are largely innate. The chapter then sets out to explore another aspect of the expression of emotions, namely their expression in nonverbal vocalization. Are human forms of such expression unique to our species? Are they innate or culturally determined? The chapter observes that in nonhuman species aversive emotional states tend to find expression in both increased call rate and elevated pitch – suggesting that this may be a phylogenetically ancient trait. According to the authors, studies on both human infants and human adults tend to show a similar pattern, thus further supporting the thesis that this trait is phylogenetically ancient. Human infants’ cries and other vocalizations encode information about both positive and negative affective states. Where single vocalizations are concerned, the relevant features turn out to be common to the vocalizations of both hearing and hearing-impaired infants, but where vocal sequences are concerned, there are significant differences between the two groups – suggesting that in the latter case but not the former auditory learning plays a role.

M. Forster and W. Welsch

Concerning human adults, the authors note that past studies of the recognition of vocal expressions of emotion in human adults have tended to rely heavily on actors, and that this poses methodological problems due to the possibility that there are differences between the recognition of feigned as compared to authentic vocal expressions of emotion. Investigating this is therefore one purpose of their study. The authors are also interested in addressing the question whether the recognition of vocal expressions of emotion in adults is universal or culturally specific. In order to explore both of these questions, the authors designed an experiment in which both authentic and acted vocal expressions of emotion by native speakers of German were presented both to German subjects and to subjects of several other nationalities, and a series of relevant questions were then posed to the subjects concerning the authenticity versus inauthenticity of the expressions, the nature of the emotions expressed, and so on. Among the more general results that emerged from the experiment were the following: overall, about 68% of the authentic expressions were identified correctly, whereas only 50% of the acted expressions were identified correctly (though these results varied depending on the specific emotion involved); subjects had difficulty distinguishing between authentic and acted expressions; and subjects had a bias toward choosing the assessment “authentic”. The ability to identify emotions turned out to be affected both by the type of emotion involved (e.g., there was a relatively strong tendency to confuse expressions of hot anger and expressions of panic fear) and by the culture of the listener (e.g., there was a bias in German subjects toward selecting “anger”, as contrasted with a bias in Romanian and Indonesian subjects toward selecting “sadness”). 
The latter suggests that the recognition of emotions in vocalization involves an interplay between universal and culturally specific aspects. The chapter by Emily Wyman and Hannes Rakoczy, “Social Conventions, Institutions and Human Uniqueness: Lessons from Children and Chimpanzees”, argues that conventions play a central role in human life, whereas in contrast the social interactions of chimpanzees seem neither to be governed by normative rules nor mediated symbolically. The chapter approaches this issue via the development of collective intentionality and joint attention, considered as preconditions for conventionalized institutional practice, in young children. Collective intentions play a central role both in regulating and in constituting human social activities. Human infants begin to develop collective intentions in their second year, whereas chimpanzees seem never to do so. Similarly, young children develop joint attention, which plays a key role in making coordination and convention possible for humans, whereas chimpanzees may not be capable of joint attention at all, their forms of coordination instead seeming to have a different basis. Another striking feature of human social life is the central role played by collectively assigned status functions, involving a two-level conception of things – not only a minimal physical conception of them, but also a more imaginative or

fictional one. Examples of this phenomenon include maps, paintings, money, and marriage. Children’s pretend play seems to be a precursor of this practice. Previous experiments have shown that by the age of two or three human infants master not only the two-level aspect of pretend play, but also the ability to inferentially extend the pretend play beyond what has been modeled for them. In a new experiment, the authors show that by the age of three human infants also master the special context-sensitivity that is another aspect of pretend play. In contrast, despite the fact that chimpanzees can develop quite impressive symbol use, it seems that they do not achieve the sort of mastery of symbolic status that young children already exhibit in their pretend play. Indeed, despite the fact that chimpanzees have sometimes been thought to engage in pretend play, it is not clear that they ever really do. Yet another aspect of human convention is its normative character, its association with penalties and a sense of “ought”. Previous experiments have shown that young children already possess some understanding of normativity in their games, e.g., that they will protest when the role assigned to an object in a game is violated by a participant. In a new experiment, the authors tested whether young children also have a grasp of the context-relativity that is another important aspect of understanding normativity, and they found that young children did indeed have such a grasp by age 3. In contrast, there is no compelling evidence that chimpanzees ever achieve a grasp of norms. The authors conclude by considering the question why human beings and chimpanzees differ in these fundamental respects, so that whereas human beings have convention-based social institutions and assign symbolic statuses chimpanzees do not. They suggest three possible lines of explanation.
First, it may be that chimpanzee societies are too competitive in nature to allow for collective intentions or joint attention. Second, it may be that the sort of copying of actions that such processes support in human infants (as contrasted with the mere copying of outcomes of actions in which chimpanzees engage) enables a “ratchet effect” of cultural transmission of cognition that leads to a group selection that prefers groups and individuals in which such dispositions are most pronounced. Third, it may be that in contrast to the dominance-based societies of chimpanzees, human evolutionary history has developed more egalitarian forms of socio-political organization, which have facilitated an emphasis on collective intentions.

We now turn to some tentative concluding thoughts about these chapters and the issues they address. All of the authors of these chapters share Darwin’s naturalistic vision of human beings as the product of evolution. However, where the question of the degree of continuity between human beings and other existing animals in respect of mentalistic features is concerned, the contributors seem more divided, some of them implying continuities while the others imply discontinuities. Before considering this dispute further, it is worth drawing a conceptual distinction between two issues that could easily be confused here. This is not quite, as it might seem to be, a dispute concerning Darwin’s fundamental position that evolution works by gradual transitions rather than by sudden jumps (though that position could be, and sometimes has been, disputed as well). For even if human beings were sharply

discontinuous from existing nonhuman animal species in their features, Darwin might still be right in that fundamental position (namely, if the gradual links involved had since become extinct). However, as can be seen from The Descent of Man, he not only believed in such a diachronic continuity between man and the rest of nature but also in their synchronic continuity in the present, and moreover saw the latter as the main evidence for the former. So synchronic continuity in the present was important for him both in itself and for its evidential bearing on diachronic continuity. Now, the three chapters by Uhlhaas/Singer, Spahn, and Scheiner et al. tend to imply present continuities. In contrast, the chapters by Mausfeld, Wyman/Rakoczy, and (in a more tentative way) Schubotz tend to assert sharp discontinuities. Clearly, this is an issue that requires further research. However, our own intuitions tend to sympathize more with the former side of the dispute (continuity) than with the latter (discontinuity). The following are some tentative critical observations concerning the latter side designed to provide some justification for such a reaction. Concerning Mausfeld’s chapter, as we mentioned earlier, his general thesis that human perception involves conceptualization and hence multiperspectivity could be detached from his theses that the conceptualization in question is innate (rather than derived from experience or culture) and that such perception is unique to humans (rather than shared by some other animals). The latter theses seem questionable to us for several reasons: (1) Mausfeld’s own suggestion that the conceptual forms involved are vague and become more closely specified in light of feedback from sensory input already constitutes a concession on the issue of innateness toward an alternative position. 
(2) Some of Mausfeld’s own examples, when examined more closely, suggest a cultural rather than an innate source for the conceptualizations involved (e.g., the double-reading of paintings, which is in fact a skill that cultures unfamiliar with paintings have to learn). (3) The Chomsky/Fodor idea of innate, universal, human-specific modules for language-learning and other cognitive processes seems highly questionable. Where are these found in the human brain? How would their existence square with the striking continuity in brain architecture between apes and humans (as already noted by Darwin and more recently discussed by Merlin Donald among others)? How could the minuscule period of evolutionary time that separates us from our animal-like ancestors have sufficed for their development? Why postulate them in the first place given that the cognitive achievements they are called on to explain can be accounted for in terms of simpler, more familiar cognitive processes (as L. Jonathan Cohen and Michael Tomasello have argued convincingly in the linguistic case, for example)? (4) Concerning the alleged uniqueness to humans of conceptualization in perception and a resulting multiperspectivity, one might reasonably suspect that both of these features are in fact shared by at least some other animals, such as chimpanzees (who seem capable of treating an X as a Y in play contexts, for example).

Concerning the chapter by Wyman and Rakoczy, while their positive evidence for the early development in humans of collective intentions, joint attention, a grasp of symbolic statuses, pretend play, and an understanding of normativity is very impressive, one may reasonably be more skeptical about their denial of similar achievements to other animals, in particular chimpanzees. The massive genetic

overlap that exists between humans and chimpanzees (roughly 99%) justifies a default presumption that where similar-looking behavior occurs similar cognitive processes are at work. And there is indeed much evidence of similar-looking behavior relating to each of the processes in question here. For example, chimpanzees sometimes cooperate in hunting and in conspiring against conspecifics in ways that strongly suggest collective intentionality; they sometimes seem to exhibit joint attention when engaged in cooperative activities; they sometimes seem to grasp symbolic statuses and to indulge in pretend play (e.g., treating one object as though it were another, indeed in one case discussed by Frans de Waal treating themselves as another individual by mimicking his disability); and they sometimes seem to have an understanding of norms (e.g., disapproving of socially disruptive behavior by members of their group; reacting negatively to receiving unequal rewards for similar achievements; and in the case of a human-raised, language-taught chimpanzee discussed by Jane Goodall, first attempting to lie to a human handler about the source of a pile of dung that had been deposited in a room, and then verbally self-chastising once its real source became clear). Of course, these strong appearances of continuity could in principle be overturned in favor of alternative explanations if sufficiently compelling empirical evidence in support of doing so were forthcoming. But the burden of proof to be borne by those who wish to overturn them is a heavy one, and it is not clear to us that it has yet been borne.

Concerning Schubotz’s chapter, her tentative suggestion that long-term planning and prediction may be unique to humans seems to us underevidenced, given that, as she herself notes, the most relevant area of the human brain, the frontopolar cortex (area 10), is also present in nonhuman animals, and hardly any empirical studies of its operation in such cases have yet been done.
Moreover, there is at least some prima facie evidence of impressively complex and long-term planning in chimpanzees – for example, their collective hunting, with its differentiation of roles; de Waal’s observations concerning the long-term political conspiring that male chimpanzees sometimes seem to engage in; and Doehl’s experiments showing that chimpanzees are able to plan ahead in quite complicated ways in order to accomplish tasks for the procurement of rewards. The same evidence also supports the thesis that chimpanzees are capable of prediction. And there is some additional evidence for this as well; for example, human-trained chimpanzees saying “goodbye” in anticipation of a separation. A further interesting question that is touched on by several of the chapters in this volume, and on which they again seem somewhat divided, concerns the bearing of evolution on the veracity or otherwise of human cognition. Spahn’s chapter argues in the light of evolution for a form of epistemological realism and sketches an evolutionary explanation of how we are able to achieve true cognition. In contrast, several of the other chapters in the volume suggest that evolution may have caused us to become creatures whose cognition is discrepant with reality in various ways, perhaps even radically so. Thus, the chapter by Uhlhaas and Singer argues that the human brain’s relatively heavy emphasis on internal processing, rather than on the processing of sensory inputs, makes it unusually vulnerable to psychoses. More radically, Mausfeld’s chapter argues that there is a severe discrepancy between the

world as we perceive it and the world as we know it to be in light of natural science, especially physics (one should think here, for example, of the perceptual illusions that physical bodies are solid rather than constituted by spatially separated particles, that they possess not only primary qualities but also secondary qualities, and that the sun revolves around the earth). And his chapter points out that biological adaptiveness does not require that our cognition be true, but only that it includes a sort of structural realism. Even the chapters by Scheiner et al. and by Wyman/Rakoczy contribute to the picture that human cognition has an evolutionarily determined or enabled proneness to illusions. Thus, the chapter by Scheiner et al. notes that we have a general misleading bias toward attributing authenticity to expressions of emotions, as well as more culturally modulated misleading biases toward identifying the emotions expressed as being of certain kinds more often than they really are (e.g., Germans toward identifying them as “anger”). Similarly, the chapter by Wyman and Rakoczy notes that we have a strong propensity to assign entities statuses that are in some sense merely imaginary or fictional (e.g., treating bits of paper as money). Further examples of human susceptibilities to illusions that may be evolutionarily determined or enabled come to mind as well. For example, experiments by Benjamin Libet that were discussed by Christopher Frith in an interesting talk at the same conference where the articles in this volume were presented seem to show that agents are the victims of systematic illusions concerning their own intentions; David Hume already in the eighteenth century noted our tendency to project not only secondary qualities but also moral and aesthetic values onto the world itself; and the illusions of mythology, religion, and superstition form another large part of human cognitive history.
It is not clear that Spahn’s chapter really does conflict with this emphasis on evolution’s production of human propensities to cognitive illusions in the end. For one thing, Spahn would of course himself concede that human cognition is often mistaken; he notes that the biological adaptiveness of cognition is compatible with its erroneousness; he points out that evolutionary epistemology can encourage constructivist positions as well as realist ones; and he forgoes any aspiration to justify realism in evolutionary terms (instead only aiming to explain how the cognitive achievements posited by realism could have come about evolutionarily). For another thing, even the most radical forms of the skeptical line of thought in question would have to concede that some of our cognition is veridical in order to be able to make their skeptical assessments about other aspects of our cognition in a coherent way (e.g., Mausfeld at least implies that physics is veridical, so that perception which conflicts with it is not). Still, the more skeptical line of thought does at least suggest that Spahn’s task of providing an evolutionary explanation of human beings’ veridical cognition needs to be complemented with an equally pressing task of providing an evolutionary explanation of human beings’ illusory cognition. Indeed, recognizing just how widespread cognitive illusions are among human beings might even lead one toward a radical and disturbing suspicion that Nietzsche already entertained in the nineteenth century: that illusion is actually more biologically adaptive for creatures like us than truth.

This point brings us back to our central topic of human distinctiveness. Is the quantity of illusion greater in human beings than in other animals? This strikes us as an interesting and difficult question. The following are some tentative thoughts about it: First, there are a number of specific obstacles that make answering it difficult (though not necessarily impossible). One is that the very ascription of beliefs to nonhumans is often problematic (although it also seems unavoidable in certain cases, e.g., chimpanzees). Another is that accurately interpreting the content of the beliefs ascribed to nonhumans is often problematic as well (e.g., while ascribing to a vervet monkey a belief that there is an eagle overhead may suffice as a first approximation, it surely does not capture the content of the belief precisely). Yet another is that while identifying local errors in nonhumans may not always be too difficult (e.g., when a dog literally “barks up the wrong tree”), identifying more global errors like the ones discussed above in connection with human beings (e.g., errors about secondary qualities) seems more problematic. In particular, while general types of belief, in order to be veridical, presumably need to achieve some sort of structural or pragmatic compatibility with reality, it is unclear what more they need to achieve. Moreover, discrepancy with the human perspective may not be a good criterion of failure here, since there may be a relational aspect to truth. Second, even if the quantity of illusion is absolutely greater in the case of human beings, this may only be because human beings have a richer stock of beliefs than other animals. So the more interesting question might be whether the quantity of illusion is also relatively greater.
Third, even if the quantity of illusion is both absolutely and relatively greater in the case of human beings, this may be partly a function of the more ambitious nature of the questions that human beings address (and at least in some cases moreover answer correctly). Questions about the nature of the world as a whole would be one good example (that such questions are distinctively human has been widely recognized for millennia). Another example would be the sorts of questions concerning man’s place in the world that are addressed by the theory of evolution and the contributions to it contained in this volume.

Index

A Abductive aspects of perception, 43 Aboutness, 66, 67, 72 Action, 84–86, 88–93 Action planning, 86 Action potentials (APs), 3 Amodal completion, 22, 46, 48 Analytic philosophy, 161 Animal pain, 108, 109 stress, 109 vocalization, 106, 108–111, 116, 163, 164 Animal mental life, 157–158, 163, 165 Antirealism, 56, 61, 161 Arousal, 106, 107, 110, 112, 124 Associations, 31, 32, 89, 160 B Behaviorist tradition, 33 Bi-stability, perceptual, 46 Brain, 157, 159, 160, 162, 166, 167 lesion, 94, 95 size, 4 Brodmann Area 10, 81 C Causal analyses, internal, 20, 39, 48 Chimpanzees, 81, 131–151, 158, 159, 164–167, 169 Chunking, 93, 94 Cognition cognition and adaptation, 68 constraints of cognition, 68 evolutionary account of cognition, 65–68, 71–75, 161, 162 nonconceptual cognition, 59, 62 pushmipullyu cognition, 73 Cognitive control, 84, 86, 87

Cognitive ethology, 158, 159 Collective, 133–135, 164–167 Common sense conception of perception, 23–31, 160 Communication nonverbal, 106, 132 Computational semantics, 45 Computational theory, 43 Conceptual forms, 22, 35–45, 49, 50, 160 biological plausibility, 40–42 underspecified, 43–45 Conceptual-intentional systems, 37 Constitutive rules, 133, 134, 139, 141, 150 Constructivism, 62, 67, 161 Conventions, 118, 119, 125, 131–151, 164, 165 joint attention, 136, 164–167 joint engagement, 137 reasoning, 135 stag hunt game, 136 Coordination convention, 135–137 fiction, 138–139 norms, 142–145 objects and status, 139–142 Core knowledge systems, 37 Creative forces of perception, 43 Creativity, 8, 13 Cross-species, 6, 133, 148 Cue integration, 40 Cultural ratcheting process, 148 Culture conventions, 118, 119, 125 D Darwin, C., 2, 68, 106, 108, 157–159, 163, 165, 166 Descartes’ conception of perception, 42 Dictator games, 144

W. Welsch et al. (eds.), Interdisciplinary Anthropology, DOI 10.1007/978-3-642-11668-1, © Springer-Verlag Berlin Heidelberg 2011

Dispositional properties, perception of, 39, 48 Dogs, 158, 159, 169 Dreaming, 12 Dualism, 2, 55, 57, 63–65, 70, 161 Cartesian dualism, 57 dualism of “input” and “construction”, 70 dualism of scheme and content, 63, 64 dualism of “the conceptual” and “the real”, 65 Dual nature of picture perception, 40, 46, 48 E Emotion(s), 158, 163, 164, 168 acted, 118–123, 125 authentic, 117–125 aversion, 109–111, 117, 118 components, 107, 115, 118 recognition, 117–125, 164 Empiricist conceptions of the mind, 32, 33 Encephalization quotient (EQ), 5 Episodic memory, 86–89 Epistemology, 162, 167, 168 evolutionary epistemology, 56–58, 67–69, 74 modern epistemology, 56 Error, perceptual, 28, 29 Ethology, 33, 35, 41 Event(s), 84, 85, 87, 88, 90, 99 Evolution, 2, 26, 27, 40–42, 72, 83, 148, 157, 159, 162, 167 phylogenetic ritualization, 107 Evolvability, 40 Explanatory depth and width, 35 Expression emotional, 105–125 facial, 106, 111, 112, 118 nonverbal, 106, 113, 115, 116 F Figure-ground segmentation, 46 fMRI. See Functional magnetic resonance imaging Frames of reference, mental, 46 Frontal lobes/frontal cortex, 80–85, 87, 88, 90–94, 96, 98, 99 Frontopolar cortex, 81–85, 87–90, 92, 94–96 Functional magnetic resonance imaging (fMRI), 91–94, 96 Fundamental Problem of Perception Theory, 32, 34, 42 G Gradation, 80–86, 90, 92, 94, 96, 162, 163 Grammar of vision, 39

H Hallucinations, 8, 10–12, 160 Heider–Simmel demonstration, 47–51, 160, 161 Hierarchical model, hierarchical organization, 40, 80, 84 Homo sapiens, 2, 5, 6, 8, 13 I Idealisation, as methodological principle of the natural sciences, 22, 23, 28 Idealism, 56, 57, 59, 61–63, 71, 161 Illusions, 168, 169 argument from illusions, 70 optical illusion, 57, 69 perceptual, 24, 29 Imagination, systems for, 23, 38 Inductive procedures for concept acquisition, limits of, 32, 35 Integration, 41, 49, 50, 72, 81, 84, 85, 87, 89, 90, 92–94, 96, 98 Intentionality, 66, 67 collective, 133–135, 149, 151, 167 shared intentionality, 74 Intentional object, 35 Intentions, 48, 147, 150, 164–166, 168 cooperative activities, 135 status functions, 134 types of rules, 133 Interfaces between internal systems, 41 Internalist approaches to perception, 25, 42, 61 Internal semantics of perception, 37 Inverse optics approach, 32 J Joint attention, 136, 164–167 L Language, 6, 8, 37, 39, 60, 61, 74, 132, 141, 158, 159, 162, 166, 167 Logical positivism, 56, 161 Long-term memory, 88–90, 162 Long-term planning, 79–99, 162, 167 M Maturation, 81, 83, 89 Meaning, in perception, 32, 33, 40, 43, 45, 46 Mental attitude, 47 Mentalizing, 87–89, 97 Mental perspectives, 20–22, 52 Mental time travel, 88, 89, 162, 163 Metarepresentations, 8, 14 Modularity in evolution, 40, 41, 49

Monitoring, 87, 90, 151 Multiperspectivity, 7, 19–23, 45–52, 160–162, 166 Mykind, as type of perceptual objects, 39, 47–49 N Natural selection, 67, 70, 157 Neocortex, 5–7, 159 Neural oscillations, 4 Neuron, 4–7, 82, 83, 108 Neuroreductionist explanations, 23 Normativity, 139, 142, 146, 151, 164–166 O Ontogenetic, 80, 81, 83, 89 P Parrots, 158, 159 Patient, 94–96 Perception, 8, 22–51, 69–71, 97, 160, 161, 166, 168 Perceptual objects, 20, 22, 32, 33, 37, 39, 43, 47, 48 Perceptual System, as system of conceptual forms, 36–39 Phenomenal experience, 9, 21, 25 Phenomenal percept, 38, 44, 45, 49, 50 Phylogenetic, 80, 83, 88–90, 111, 116 Planning, 79–99, 150, 158, 162, 163, 167 Prediction, 79–99, 162, 163, 167 Prefrontal cortex, 6, 81, 86, 94, 96, 162, 163 Premotor cortex, 81, 84, 85, 95, 163 Pretence, 7 Pretend play, 8, 21, 139, 142, 146, 150, 160, 165–167 Psychology evolutionary psychology, 74 Psychosis, 8, 11–13 Q Qualia, 30, 57 R Ratchet effect, 148, 165 Rating bias, 121, 123, 124 choice theory, 121, 123 dissimilarity, 121 Rational maximizers, 145 Raven progressive matrices, 91 Realism, 160–162, 167, 168

external realism/externalism, 59, 63, 64, 72 internal realism/internalism, 56, 57, 59, 61–63, 71–75 naive, 24, 25, 30, 31 structural, 30 Realness, as a perceptual attribute, 20, 50, 51 Reference, notion of, 28–31, 42, 64 Regulative rules, 133 Relational reasoning, 91, 96–99 Representation inner representations, 72 misrepresentation, 72 Representation, notion of, 25 S Schizophrenia, 9 Semantic relation in perception, 43 Semantic underspecification in perception, 43–45 Sequence learning, 92 Sign theory of perception, 26, 30, 42 Social conventions, 131–151 collective intentionality, 133–135 coordination, 135–145 human-specific, 145–149 Societies collectivistic, 124, 125 individualistic, 124, 125 Speech, 106, 108, 116, 117, 119 prosody, 116, 124 Stag hunt game, 136 Standard Model of Perception, 31–33, 48 Symbols, 8, 138 Synapses, 3 T Theory of Mind, 7, 87, 88, 159, 162 Transcendental illusion, Kant’s, 24 Translation problem of radical translation, 60 Tree of life, 157 Triggering function, 42 Triggering of conceptual forms, 38, 41, 44, 49–51 Truth correspondence theory of truth, 61, 63, 69 criteria for truth, 69 U Underspecification, global vs. local, 44 V Vagueness, perceptual, 43–46, 51

Veridicality, 28 Vocalization alarm calls, 110 animal, 108–110 human infant, 111–116 neurobiology, 108 nonverbal, 113, 115, 116 pathway, 108

sequence, 114 Voice animal, 106–110, 112 human, 105–125 infant, 109, 111–118, 125 Z Zoo table, 144