Body, Language and Mind Volume 1: Embodiment
Cognitive Linguistics Research 35.1
Editors: Dirk Geeraerts, René Dirven, John R. Taylor. Honorary editor: Ronald W. Langacker
Mouton de Gruyter Berlin · New York
Body, Language and Mind. Volume 1: Embodiment. Edited by Tom Ziemke, Jordan Zlatev, Roslyn M. Frank
Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data. Body, language, and mind. Volume 1, Embodiment / edited by Tom Ziemke, Jordan Zlatev, Roslyn M. Frank. p. cm. - (Cognitive linguistics research ; 35.1) Includes bibliographical references and index. ISBN 978-3-11-019327-5 (hardcover : alk. paper) 1. Language and languages - Philosophy. 2. Mind and body. 3. Semiotics. I. Ziemke, T. (Tom), 1969- II. Zlatev, Jordan. III. Frank, Roslyn M. P107.B63 2007 401-dc22 2007028708
Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
ISBN 978-3-11-019327-5 ISSN 1861-4132 © Copyright 2007 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Printed in Germany
Table of contents
List of contributors
Introduction: The body eclectic. Tom Ziemke and Roslyn M. Frank
Section A: Historical roots
We are live creatures: Embodiment, American Pragmatism and the cognitive organism. Mark Johnson and Tim Rohrer
Bringing the body back to life: James Gibson's ecology of embodied agency. Alan Costall
From the meaning of embodiment to the embodiment of meaning: A study in phenomenological semiotics. Göran Sonesson
Embodiment and social interaction: A cognitive science perspective. Jessica Lindblom and Tom Ziemke
Section B: Body and mind
Representing actions and functional properties in conceptual spaces. Peter Gärdenfors
From pre-representational cognition to language. Takashi Ikegami and Jordan Zlatev
Making sense of embodied cognition: Simulation theories of shared neural mechanisms for sensorimotor and cognitive processes. Henrik Svensson, Jessica Lindblom and Tom Ziemke
Phenomenological and experimental contributions to understanding embodied experience. Shaun Gallagher
Section C: Body, language and culture
Embodiment, language, and mimesis. Jordan Zlatev
The body in space: Dimensions of embodiment. Tim Rohrer
On the biosemiotics of embodiment and our human cyborg nature. Claus Emmeche
Embodiment and self-organization of human categories: A case study of speech. Luc Steels and Bart de Boer
Communication as situated, embodied practice. Wolff-Michael Roth
Index
List of contributors
Bart de Boer did a Master's degree in computer science at the Rijksuniversiteit Leiden (1994) and a PhD in artificial intelligence at the AI lab of the Vrije Universiteit Brussel (1999) under professor Luc Steels. He has worked as a postdoc in Brussels and at the University of Washington under professor Patricia Kuhl. He has also performed linguistic fieldwork in Nepal. His main research interest is in computer modeling the evolution of speech. He is currently working as an assistant professor in cognitive robotics at the Rijksuniversiteit Leiden.
Alan Costall is Professor of Theoretical Psychology at the University of Portsmouth, England. His research interests are wide, but are held together by a commitment to interdisciplinarity, and a broadly ecological or mutualist perspective. He has been increasingly involved in work on the history of modern psychology and its relations (or lack of them) to other disciplines. Recent publications include: A. Costall and O. Dreier (eds.), Doing Things with Things (London: Ashgate, 2006); A. Costall, I. Leudar, and V. Reddy. (2006) "Failing to see the irony in 'mind-reading.'" Theory & Psychology 16(2): 163-167; Bard, K. A., M. Myowa-Yamakoshi, M. Tomonaga, M. Tanaka, A. Costall, and T. Matsuzawa. (2005) "Group differences in the mutual gaze of chimpanzees (Pan troglodytes)." Developmental Psychology 41: 616-624; Rogers, S. D., E. E. Kadar and A. Costall. (2005) "Gaze patterns in the visual control of straight road driving and braking as a function of speed and expertise." Ecological Psychology 17: 19-38; Costall, A., and I. Leudar. (2004) "Where is the 'theory' in theory of mind?" Theory and Psychology 14: 625-648; Costall, A., M. Sinico and G. Parovel. (2003) "The concept of 'invariants' and the problem of perceptual constancy." Rivista di Estetica, n.s. 24(3), 49-53; Ost, J., and A. Costall. (2002) "Misremembering Bartlett: A study in serial reproduction." British Journal of Psychology 93: 243-255.
Claus Emmeche is a theoretical biologist, Ph.D., Associate Professor and Head of the Center for the Philosophy of Nature and Science Studies, located at the Niels Bohr Institute, University of Copenhagen. The Faculty of Science founded the center in 1994 to explore a new and more science-related way to do philosophy of nature, yet keeping a notion of science as more than natural science. Emmeche has taught courses in philosophy of biology and philosophy of science and his current research interests include biosemiotics, artificial life, ontology, organism/body/cyborg relations, and philosophy of nature. He is active in the Copenhagen biosemiotics school (cf. Reading Hoffmeyer: Rethinking Biology, with Kalevi Kull and Frederik Stjernfelt) and in developing a cluster of mandatory science studies courses for the bachelor programmes in Denmark.
Roslyn M. Frank is Professor Emeritus in the Department of Spanish and Portuguese at the University of Iowa. She is co-editor of Cognitive Models in Language and Thought: Ideology, Metaphors and Meaning (2003); Language and Ideology, Vol. 2. Cognitive Description Approaches (2001) and has published extensively in the field of cognitive linguistics as well as in ethnoscience, most particularly in ethnomathematics and ethnoastronomy. Her research on the Basque language has taken her to Euskal Herria, the Basque Country, where she has done extensive fieldwork and given numerous seminars. In addition she has given presentations on these research topics throughout Europe.
Shaun Gallagher is Professor and Chair of Philosophy and Cognitive Sciences at the University of Central Florida; he has been occasional Visiting Professor at the University of Copenhagen (2004-2006) and Visiting Scientist at the Medical Research Council's Cognition and Brain Sciences Unit at Cambridge University (1994). He is co-editor of the interdisciplinary journal Phenomenology and the Cognitive Sciences. His research interests include phenomenology and philosophy of mind, cognitive sciences, hermeneutics, theories of the self and personal identity. His most recent book, How the Body Shapes the Mind, is published by Oxford University Press (2005). He is co-editor of the forthcoming Does Consciousness Cause Behavior? An Investigation of the Nature of Volition (MIT Press, 2006). He is currently working on several projects, including a co-authored book, The Phenomenological Mind: Contemporary Issues in Philosophy of Mind and the Cognitive Sciences (Routledge, 2007). His previous books include: Hermeneutics and Education (1992) and The Inordinance of Time (1998). He has edited or co-edited volumes including: Ipseity and Alterity: Interdisciplinary Approaches to Intersubjectivity (2004); Models of the Self (1999); Hegel, History, and Interpretation (1997). Home page: http://pegasus.cc.ucf.edu/~gallaghr
Peter Gärdenfors is professor of cognitive science at Lund University (Sweden). He leads the Ph.D. program in Cognitive Science there (LUCS). He has published numerous books and articles on decision theory, epistemology, belief revision, concept formation and the evolution of cognition (see http://www.lucs.lu.se/People/Peter.Gardenfors/bibli02000.html). The most important books are Knowledge in Flux: Modeling the Dynamics of Epistemic States (Bradford Books, MIT Press, 1988); Conceptual Spaces (Bradford Books, MIT Press, 2000); How Homo Became Sapiens: On the Evolution of Thinking (Oxford University Press, 2003); and The Dynamics of Thought (Springer, 2005).
Takashi Ikegami earned his Ph.D. in Physics (1989). He works on Artificial Life and Complex Systems by simulating computational models. His publications range from self-reproduction, ecological systems, embodied cognition to cognitive linguistics. Some of his recent articles are: Ikegami, T. (2005) "Neutral phenotypes as network keystone species." Population Ecology 47: 21-29; Iizuka, H. and T. Ikegami. (2004) "Adaptability and diversity in simulated turn-taking behavior." Artificial Life 10: 361-378; Ikegami, T., and G. Morimoto. (2003) "Chaotic itinerancy in coupled dynamical recognizers." CHAOS 13: 1133-1147; Ikegami, T. (1999) "Evolvability of machines and tapes." J. Artificial Life and Robotics 3(4): 242-245; Ikegami, T. and M. Taiji. (1998) "Structures of possible worlds in a game of players with internal models." Acta Polytechnica Scandinavica 91: 283-292.
Mark Johnson is Professor of Philosophy and Knight Professor of Liberal Arts and Sciences at the University of Oregon. His research has focused on the philosophical implications of the role of human embodiment in meaning, conceptualization, and reasoning. He is co-author, with George Lakoff, of Metaphors We Live By (1980) and Philosophy in the Flesh (1999) and author of The Body in the Mind (1987) and Moral Imagination (1993). He is currently completing a book on the aesthetic dimensions of meaning, drawing on evidence from cognitive science, phenomenology, neuroscience, and the arts that reveals the origins of meaning in felt qualities, sensorimotor patterns, and emotions.
Jessica Lindblom is a cognitive science Ph.D. candidate working at the School of Humanities and Informatics, University of Skövde, Sweden. She previously received a master's degree in computer science (2001) and a bachelor's degree in cognitive science (2000). Her main research interests are social aspects of embodied and situated cognition, and their implications for interactive technology. Some of her publications, together with her supervisor professor Tom Ziemke, are Lindblom and Ziemke (2003) "Social situatedness of natural and artificial intelligence: Vygotsky and beyond." Adaptive Behavior 11(2): 79-96, and Lindblom and Ziemke (2005) "Body-in-motion: broadening the social mind." In: Bruno G. Bara, Lawrence Barsalou and Monica Bucciarelli (eds.), Proceedings of the XXVII Annual Conference of the Cognitive Science Society, 1284-1289. Mahwah, NJ: Lawrence Erlbaum.
Tim Rohrer has published extensively on metaphor and embodiment in diverse disciplines for over fifteen years. His work has ranged from experimental cognitive neuroscience to information technology policy and from the politics of conflict resolution to the philosophy of language. In addition to the lines of investigation reflected in this volume, he is also researching how metaphors shape attitudes toward wildfire mitigation in the western United States. He is perhaps best known as the founder and maintainer of the Center for the Cognitive Science of Metaphor Online, a collection of formative articles in metaphor theory and cognitive semantics (http://zakros.ucsd.edu/~trohrer/metaphor/metaphor.htm). He holds a PhD in philosophy from the University of Oregon and has been a Thomas J. Watson scholar, a Fulbright researcher at the Center for Semiotic Research in Aarhus, Denmark, and an NIH fellow at the Institute for Neural Computation at the University of California at San Diego. At present he directs the Colorado Advanced Research Institute in Boulder, Colorado.
Wolff-Michael Roth is Lansdowne Professor of Applied Cognitive Science at the University of Victoria, Canada. His research focuses on cultural-historical, linguistic, and embodied aspects of scientific and mathematical cognition and communication from elementary school to professional practice, including, among others, studies of scientists, technicians, and environmentalists at their work sites. The work is published in leading journals of linguistics, social studies of science, sociology, learning sciences, and education and various subfields of education (curriculum, mathematics education, science education). His recent books include Toward an Anthropology of Science (Kluwer, 2003), Rethinking Scientific Literacy (Routledge, 2004, with A. C. Barton), Talking Science (Rowman and Littlefield, 2005), and Doing Qualitative Research: Praxis of Method (SensePublishers, 2005).
Göran Sonesson is Professor of semiotics and Director of the Department of Semiotics at Lund University. He holds a doctorate in general linguistics from Lund University, as well as a doctorate in semiotics from the École des Hautes Études en Sciences Sociales, Paris. Between 1978 and 1983, he was involved in Paris with the semiotics of gesture, and then worked on Mayan language and culture in Mexico, after which he has occupied different research positions in semiotics in Lund. Sonesson's main work is the monograph Pictorial Concepts (Lund: Lund University Press 1989), which is a critical survey of different contributions to pictorial semiotics, reviewed in the context of findings in perceptual psychology and cognitive science. An important part of the book is devoted to a critical assessment of the theories of iconicity presented by, among others, Goodman and Eco. This work has subsequently been extended in numerous articles, published in Semiotica, RSSI, Zeitschrift für Semiotik, VISIO, Degrés, Sign System Studies, Current Anthropology, etc. His most recent publications are concerned with bringing a semiotic perspective to the study of evolution.
Luc Steels is a professor of Computer Science at the Vrije Universiteit Brussel (VUB). He graduated in linguistics at the University of Antwerp and in computer science at the Massachusetts Institute of Technology, working in the MIT AI Laboratory. After that he worked in the domain of geophysical measurement interpretation as a project leader for geological expert systems at Schlumberger. In 1983 he founded the VUB Artificial Intelligence Laboratory, which he still directs to this day. He was cofounder and chairman (from 1990 until 1995) of the VUB Computer Science Department (Faculty of Sciences) and also founder and director of the Sony Computer Science Laboratories in Paris. His scientific research interests cover the whole field of artificial intelligence, including natural language, vision, robot behavior, learning, cognitive architecture, and knowledge representation. His publications can be found in major AI/cogsci journals such as Behavioral and Brain Sciences, Trends in Cognitive Science, Artificial Intelligence Journal, etc. He has also edited a dozen books. At the moment his research focus is on fundamental research into the origins of language and meaning.
Henrik Svensson is currently pursuing a Ph.D. in cognitive science at the School of Humanities and Informatics, University of Skövde, supervised by professor Tom Ziemke. He received his B.S. degree (2001) and M.Sc. degree (2002) from the University of Skövde. His main research interest concerns the relation between agent-environment interaction and higher-level cognition. Some of his recent publications are: Svensson, H. and T. Ziemke (2005) "Embodied representation: What are the issues", in: B. Bara, L. Barsalou, and M. Bucciarelli (eds.), Proceedings of the 27th Annual Meeting of the Cognitive Science Society, 2116-2121. Mahwah, NJ: Lawrence Erlbaum; and Svensson and Ziemke (2004) "Making sense of embodiment", in: K. Forbus, D. Gentner and T. Regier (eds.), Proceedings of the 26th Annual Conference of the Cognitive Science Society, 1309-1314. Mahwah, NJ: Lawrence Erlbaum.
Tom Ziemke is Professor of Cognitive Science in the School of Humanities and Informatics at the University of Skövde, Sweden. His research is mainly concerned with embodied and distributed cognition, i.e. theories and models of how cognition is shaped by the living body and its interaction with the material and social environment. He is coordinator of a large-scale European project on robotic models of embodied cognition, called "Integrating Cognition, Emotion and Autonomy" (www.his.se/icea), and member of the executive committee of euCognition - The European Network for the Advancement of Artificial Cognitive Systems. He is also associate editor of the journals New Ideas in Psychology and Connection Science.
Jordan Zlatev is Associate Professor at the Centre for Languages and Literature, Lund University, Sweden. His PhD thesis (Stockholm University, 1997) is the monograph Situated Embodiment: Studies in the Emergence of Spatial Meaning, in which he formulates a synthetic social-cognitive framework for the study of language, and applies this to the typology and acquisition of spatial semantics. He is the co-founder of the annual international workshop series Epigenetic Robotics: Modelling Cognitive Development on Robotic Systems, first held in Lund, 2001 and the bi-annual conference Language, Culture and Mind, first held in Portsmouth 2004. Zlatev collaborates extensively with semioticians, cognitive scientists and philosophers within the project Language, Gestures and Pictures in Semiotic Development and more recently within the highly interdisciplinary EU-project Stages in the Evolution and Development of Sign Use (SEDSU) (www.sol.lu.se/sedsu). His own work concentrates on differences in primary intersubjectivity between great apes and humans, and on a cross-cultural study of the ontogeny of gestural communication. The key theoretical concept for his current work is that of bodily mimesis, understood (following Donald 1991) as the conscious use of the body for representational means. Zlatev has published on these topics extensively over the past years in refereed journals and books, and is currently working on the monograph Bodily Mimesis and the Grounding of Language and co-editing the book The Shared Mind: Perspectives on Intersubjectivity.
Introduction: The body eclectic
Tom Ziemke and Roslyn M. Frank
1. Background
This is the first volume of a two-volume set with the title Body, Language and Mind. While this volume focuses on the concept of embodiment, i.e. the bodily and sensorimotor basis of phenomena such as meaning, mind, cognition and language, the second volume addresses social situatedness, i.e. the ways in which individual minds and cognitive processes are shaped by their interaction with sociocultural structures and practices. Naturally, the volumes overlap significantly, and in fact they have both to some degree emerged out of a one-day theme session on embodiment held at the 8th International Cognitive Linguistics Conference in Logroño, Spain, in the summer of 2003. Some of the contributors to these volumes also participated in the original theme session, whereas others have been invited later to complement the range of perspectives represented.
The concept of embodiment has received a great deal of attention in the cognitive sciences during the last twenty years, and as a result terms like embodied mind, embodied action, embodied cognition are now commonly used, often in juxtaposition to concepts like situated action (Suchman 1987), situated cognition (e.g. Clancey 1997), distributed cognition (Hutchins 1995) or the extended mind hypothesis (Clark and Chalmers 1998). In fact, by the 1990s several authors had already declared embodied cognitive science, which is often taken to more or less include all of these concepts, to be a new paradigm in cognitive science (e.g. Varela, Thompson and Rosch 1991; Clark 1997, 1999; Pfeifer and Scheier 1999).
All of this might give you the impression: (a) that there is a clear notion of what embodiment is, and subsequently, a consensus concerning in what sense cognitive processes (or perhaps certain types of cognition) are embodied, and (b) that embodied cognitive science in fact is a theoretical framework that is more or less established and agreed upon by researchers working in the field.
Somewhat surprisingly, however, neither (a) nor (b) is actually true.
Although it is by now widely agreed that cognition is embodied, in the sense that it is shaped by the body and sensorimotor interaction with the environment, it is less clear exactly what this means (cf. e.g. Chrisley and Ziemke 2003; Clark 1999; Wilson 2002; Ziemke 2001a, 2003, 2007). Is it the physical, the biological, the animate, the phenomenal (experienced), or the social body that shapes cognition, or perhaps all of these? And exactly how does the body shape cognition: is it, for example, only involved in actual sensorimotor interaction with the environment, i.e. in the grounding of mental representations in the traditional sense (cf. e.g. Harnad 1990; Ziemke 1999), or does its influence go further, i.e. is the body also crucially involved in thought, language, and other supposedly abstract activities, such as mathematics (cf. e.g. Lakoff and Núñez 1999)? The following brief overview provides several useful distinctions in conceptions of embodiment that might help to clarify differences in theoretical frameworks and commitments in the field that sometimes remain hidden under a superficial agreement on 'embodiment'.
First, Núñez (1999) distinguished between trivial, material, and full embodiment. Trivial embodiment simply is the view that "cognition and the mind are directly related to the biological structures and processes that sustain them". Obviously, this is not a particularly radical claim, and consequently few cognitive scientists would reject it (dualist philosophers of consciousness, on the other hand, might). According to Núñez, this view further "holds not only that in order to think, speak, perceive, and feel, we need a brain - a properly functioning brain in a body - but also that in order to genuinely understand cognition and the mind, one can't ignore how the nervous system works". Material embodiment makes a stronger claim, but it only concerns the interaction of internal cognitive processes with the environment, i.e.
the issue of grounding, and thus considers reference to the body to be required solely for accounts of low-level sensorimotor processes. In Núñez's terms: "First, it sees cognition as a decentralized phenomenon, and second it takes into account the constraints imposed by the complexity of real-time bodily interactions performed by an agent in a real environment". Full embodiment, finally, is the view that the body is involved in all forms of human cognition, including seemingly abstract activities, such as language or mathematical cognition (e.g. Lakoff and Núñez 1999). In Núñez's own words:
Full embodiment explicitly develops a paradigm to explain the objects created by the human mind themselves (i.e., concepts, ideas, explanations, forms of logic, theories) in terms of the non-arbitrary bodily-experiences sustained by the peculiarities of brains and bodies. An important feature of this view is that the very objects created by human conceptual structures and understanding (including scientific understanding) are not seen as existing in an transcendental realm, but as being brought forth through specific human bodily grounded processes.
In a similar vein, Clark (1999) distinguished between the positions of weak embodiment and radical embodiment. According to the former, traditional cognitive science can roughly remain the same; i.e. theories are merely constrained, but not essentially changed by embodiment. This is similar to Núñez's view of material embodiment. The position of radical embodiment, on the other hand, largely compatible with Núñez's notion of full embodiment, is, as Clark formulated it, "radically altering the subject matter and theoretical framework of cognitive science".
More recently, Wilson (2002) distinguished between six views of embodied cognition, of which only the last one requires full or radical embodiment whereas the first five might be considered variations or aspects of material embodiment: (1) cognition is situated, i.e. it occurs "in the context of task-relevant inputs and outputs", (2) cognition is time-pressured, (3) cognition is for the control of action, (4) we off-load cognitive work onto the environment, e.g. through epistemic actions (Kirsh and Maglio 1994), i.e. manipulation of the environment 'in the world', rather than 'in the head', (5) the environment is actually part of the cognitive system, e.g. according to Clark and Chalmers' (1998) notion of the 'extended mind', and (6) 'off-line' cognition is body-based, which according to Wilson is the "most powerful claim" (cf. Svensson, Lindblom and Ziemke this volume).
Complementary distinctions and classifications are proposed in several of the contributions to this volume. Emmeche (this volume), for example, points out that different disciplines have different perspectives on the body. He therefore distinguishes between physical embodiment (physics), organismic embodiment (biology), animate embodiment (zoology), and anthropic embodiment (anthropology, sociology).
However, Emmeche also points out: "The point is not the exact number of levels (these are contingent upon a historically relative state of science) but the fact that irreducible levels do exist". Rohrer (this volume) takes a different approach and catalogues what he considers twelve important senses or dimensions of embodiment in current research on the topic: philosophy, socio-cultural situation, phenomenology, perspective, development, evolution, the cognitive unconscious, neurophysiology, neurocomputational modelling, morphology, directionality of metaphor, and grounding.
It should be noted that, naturally, not all of these views, types or dimensions of embodiment are equally relevant for all disciplines and perspectives. For example, the question of whether or not an embodied mind actually needs to be physical or biological might not be a meaningful question to a neuroscientist, given that real brains and bodies obviously are both physical and biological. However, for the philosopher wrestling with dualist or functionalist conceptions of mind, or for the artificial intelligence researcher trying to build, or at least model, minds, these are still burning questions (cf. e.g. Ziemke 2001a, 2007; Zlatev 2001, 2003; Johnson and Rohrer this volume; Lindblom and Ziemke this volume). Similarly, the phenomenal experience of the lived body is perhaps not the primary interest for the cognitive linguist interested in syntax or metaphor, but it certainly is relevant to many other aspects of language and mind (cf. e.g. Sonesson this volume; Gallagher this volume; Zlatev this volume; Roth this volume).
To some degree the diversity of perspectives on embodied cognition can of course be attributed to the inter- or multidisciplinary nature of cognitive science as such. After all, it is not surprising that, for example, linguists, neuroscientists and artificial intelligence (AI) researchers would be interested in different aspects of embodiment, for the same reasons that they offer different, but hopefully complementary perspectives on the study of mind and language. True complementarity, however, rather than mere coexistence of different disciplines and perspectives, would seem to require some kind of common framework and terminology.
For traditional cognitive science, the common ground was provided by theories of functionalism, computationalism and representationalism, the basic assumptions of which were summarized by Gardner (1987: 6) as follows: First of all, there is the belief that, in talking about human cognitive activities, it is necessary to speak about mental representations and to posit a level of analysis wholly separate from the biological or neurological, on the one hand, and the sociological or cultural, on the other.
Consequently, the complementarity and division of labor between different disciplines was, perhaps, relatively clear in the old, pre-embodiment days when everybody agreed that thought was the operation of computational processes on mental, presumably symbolic representations. Linguists were mostly concerned with language as a symbol system, without much interest in its biological implementation, while neuroscientists tended to be interested in how and where in the brain the relevant computations and representations were implemented, and AI researchers were striving to synthesize the very same computations and representations in artificial implementations.
Apart from the fact that the above picture of traditional, computationalist cognitive science is of course a bit too rosy (the exact nature of those mental representations, for example, caused, and still causes, considerable debate), we have to admit that there is no equivalent consensus yet in embodied cognitive science. We could say that is because, as a discipline or paradigm, embodied cognitive science is still relatively young, but then again it is 20-30 years old by now, depending on exactly how you situate its origins in time. Moreover, one could rightly point out that this diversity of perspectives is due to the expanding nature of the community of researchers concerned with this approach. In short, since the embodied cognition perspective is still growing and gaining ground, it might be too early to hope for a convergence of perspectives; indeed, at this stage such a convergence of opinions might be premature. However, the current situation might also provoke comparisons with soap bubbles that are destined to burst sooner or later if there is nothing under the surface to hold them together.
One could also say that things were simpler for traditional cognitive science, because that approach was able to define itself from the start both with respect to what it rejected, behaviorism, and with respect to a common vision, the computer metaphor for mind. But this, in turn, raises the question of exactly what the common vision might be in the case of embodied cognitive science (as well as its common unifying metaphor). Some people would say that simulation theories (cf.
Svensson, Lindblom and Ziemke this volume) are what constitute the common vision or framework in embodied cognitive science - very roughly speaking, the idea that the same neural mechanisms are used for both sensorimotor interaction and abstract thought. Nonetheless these ideas certainly still need to be worked out in much more detail. Whichever way you look at it, although embodied cognitive science undeniably is an exciting and vibrant area of research, it is hard not to agree with Wilson's (2002: 625) assessment of what might be perceived as a conceptual muddle:
While this general approach is enjoying increasingly broad support, there is in fact a great deal of diversity in the claims involved and the degree of controversy they attract. If the term "embodied cognition" is to retain meaningful use, we need to disentangle and evaluate these diverse claims.
In fact, that is exactly what we hope to offer readers with the collection of chapters making up this volume, namely, a means to disentangle and evaluate the different perspectives, and not least to understand which of them are actually competing and which are complementary. The papers selected for inclusion in this first volume have been written by some of the leading cognitive scientists, cognitive linguists, psychologists, philosophers, AI researchers, semioticians, and phenomenologists working on embodied cognition today. That means that the perspective of cognitive linguistics is here put into the context of a number of related disciplines. The abovementioned second volume, on the other hand, focuses on social situatedness and contains more directly linguistically oriented contributions. Admittedly, we cannot promise that after reading all of the contributions to this volume you will feel much wiser in terms of exactly which notion of embodiment is the "correct" one. But you will have learned a great deal about the most important current perspectives on embodiment and their historical backgrounds as well as the overlaps and controversies existing between them. That, we believe, is the best deal anybody can offer at the moment when it comes to understanding the multidisciplinarity of modern embodiment research. In that sense, this book can be read as an update to some of the "classical", highly influential books on embodiment that appeared in the 1990s, such as Varela, Thompson and Rosch's (1991) The Embodied Mind, Clark's (1997) Being There, or Lakoff and Johnson's (1999) Philosophy in the Flesh, as well as other multi-author collections on the topic (e.g. Núñez and Freeman 1999; Ziemke 2002; Robbins and Aydede in press).
Other books provide more detailed accounts of specific topics, such as Damasio's (1994, 1999, 2003) books on the role of the body in emotion and consciousness, Goldin-Meadow's (2003) book on gesture as embodied thought, Pfeifer's books on embodied AI (Pfeifer and Scheier 1999; Pfeifer and Bongard 2006), or Maturana and Varela's (1980, 1987) classical books on the biology of cognition. There are also several recent books that do a very good job of integrating some of the perspectives, such as Gallagher's (2005) How the Body Shapes the Mind, Gibbs' (2006) Embodiment and Cognitive Science, and Thompson's (2007) Mind in Life. However, we believe that these works need to be complemented with a broad spectrum of perspectives, from different disciplinary and historical backgrounds, as collected in this multi-author volume.
2. Overview of the contributions to this volume
This volume is divided into three parts or sections, each of which consists of four or five chapters. The first section consists of four chapters that explore in detail the historical roots underlying current discussions of embodiment in cognitive science and linguistics. First, Johnson and Rohrer contrast the dualistic view of mind and body as two ontologically different entities, connected through internal representations of external reality, with what they call embodied realism, according to which cognition and language must be understood as arising from organic processes. They trace the rejection of mind-body dualism from early American Pragmatists, such as James and Dewey, forward to recent embodied cognitive science, which they argue needs to be pragmatically-centered. Costall's chapter points out that the move from mechanistic behaviorism to cognitive psychology and the computer metaphor for mind, through its focus on internal information processing, in fact simply maintained the view of the mechanical body as a mere shell or container for cognitive processes. As an alternative, he elaborates on the contributions that Gibson's "ecology of embodied agency" and his concept of affordances can make to our understanding of embodiment and "being in the world". Sonesson's chapter addresses the contributions that phenomenology and semiotics, with their focus on consciousness and meaning, can provide to current discussions of embodiment in cognitive linguistics, biosemiotics, and other disciplines, discussions which, in his view, tend to ignore relevant distinctions between different types and forms of meaning. Furthermore, building on Piaget's concept of differentiation, he suggests a developmental sequence going from schemas to signs and external representations. 
Lindblom and Ziemke, in the last chapter of the "historical roots" section, point out that, despite much recent emphasis on the bodily and social basis of cognition, the role of embodiment in social interaction is still relatively poorly understood. They therefore trace the role of biological and sociocultural factors in explanations of cognition from Darwin to modern cognitive science, and discuss further steps and conceptual clarifications that will be required in the development of a science of embodied cognition and social interaction.
The second section of this volume consists of four chapters that address from different perspectives the relation between body and mind as well as the actual mechanisms underlying embodied cognition. Gärdenfors elaborates his theory of conceptual spaces, i.e. representational spaces built up from "quality dimensions". While earlier work on conceptual spaces focused on perceptual concepts, here it is shown how this analysis can be extended to actions as spatiotemporal patterns of forces and functional concepts as regions of affordances in an action space. The discussion extends work in cognitive linguistics/semantics by Johnson and Talmy on "force dynamics" and discusses the transition from an embodied, first-person notion of force to a third-person perspective. Ikegami and Zlatev argue that there is a tendency in embodiment theories to "resolve" oppositions between, for example, interaction and representation, or procedural and declarative knowledge, by simply ignoring the differences, and thus, in effect, typically eliminating the more "disembodied" side of such oppositions. They therefore explore how mechanisms of pre-representational cognition such as dynamical categories, internal meaning spaces, and synaesthesia can play a role in the grounding of concepts, representations, and language. Drawing on experimental evidence from neuroscience, psychology and other disciplines, Svensson, Lindblom and Ziemke propose that a key to understanding the embodiment of cognition is the "sharing" of neural mechanisms between sensorimotor processes and higher-level cognitive processes. The latter are argued to be embodied in the sense that they make use of partial simulations or emulations of sensorimotor processes through the re-activation of neural circuitry also active in bodily perception and action.
Gallagher, in the last chapter of the "body and mind" section, clarifies the distinction between body image and body schema from a phenomenological perspective, and then discusses how this distinction can contribute to our understanding of embodied experience and intersubjectivity. Relating these concepts to empirical research concerning issues such as intentional action, imitation and mirror neurons, he connects the phenomenological insights of Husserl and Merleau-Ponty to the most recent neuroscience research on social cognition. The third and final section contains five chapters that address the relation between body, language and culture from a range of different perspectives.
Zlatev argues that much recent work on embodiment tends to undervalue concepts such as representation, consciousness and convention/normativity, which he claims makes it difficult for current "embodiment theories" to account for human language and cognition. To illustrate these difficulties, he critically examines the approach presented in Lakoff and Johnson's (1999) highly influential book Philosophy in the Flesh, and develops his own alternative notion of mimetic schema as a mediator between the individual human body and collective language. Rohrer reviews the variety of standard usages that the term "embodiment" currently has in cognitive science and contrasts notions of embodiment and experientialism at different levels of investigation. With the goal of developing a broad-based theoretical framework for embodiment he examines examples of these usages and in the process brings into view related research issues such as mental imagery, mental rotation, spatial language and conceptual metaphor across several levels of investigation, with a focus on whether and how different conceptualizations can form a cohesive research program. Emmeche presents biosemiotics as a new perspective on living systems inspired by von Uexküll and Peirce. He suggests necessary distinctions between physical, biological, animate, phenomenal and social body, and develops a Neo-Aristotelian evolutionary emergentist perspective which he argues to be necessary in order to coherently account for these dimensions of embodiment. Furthermore, he characterizes humans as techno-culturally embedded beings within a space of meanings that are not only symbolic, but also socially empowered by different kinds of sociocultural systems. From the perspective of artificial intelligence, Steels and de Boer consider the kinds of categories that have been found to be involved in human behavior, and how they are shaped by embodiment as well as the collective dynamics generated by social interactions.
Using a computational case study developed for speech sounds, they show how a group of autonomous agents equipped with a sufficiently realistic perceptual and auditory apparatus can arrive at a shared repertoire of vowels that exhibits the same universal trends as found in human vowel systems, although not biased by innate a priori categories. Roth, finally, in the last chapter of the volume, argues that many researchers are pre-occupied with written language as the main paradigm of communication. A large part of human communication, however, involves participants in face-to-face conversation, attending to semiotic resources such as gestures, body positions, and material structures in the environment. He provides a dialectical account of human activity and analyses an episode from a science classroom to exemplify the central role of the body in communication and to illustrate his view of communication as materially and socially situated and embodied practice.
3. Conclusion
Embodied cognitive science is still a relatively young approach to the study of mind and language. Critics of the approach might argue that it has so far failed to produce a coherent theoretical framework that integrates and unifies different approaches and disciplines. However, one could also argue that it was in fact the premature convergence of traditional cognitive science on functionalist/computationalist and representationalist/symbolic theories of mind that for a long time disconnected cognitive science from many of its historical roots which are now being rediscovered (cf. e.g. Lindblom and Ziemke this volume; Johnson and Rohrer this volume). Hence, we are convinced that a fuller understanding of embodied cognition will emerge from the type of broad multidisciplinary interaction that has led to the production of this multi-author volume, rather than from the more narrow insights that any single discipline or approach could produce on its own. At the same time we are also convinced that the road towards a unified theoretical framework in embodied cognitive science is neither short nor straightforward. Indeed, there still is much work ahead for theorists and modelers of embodied cognition. But, as they say, even the longest journey starts with the first steps. And one of the most important steps in the development of a truly interdisciplinary science of embodied cognition certainly is to acquire a clear comprehensive understanding of the different perspectives that characterize this field today.
References

Chrisley, Ron and Tom Ziemke
2003 Embodiment. In: Encyclopedia of Cognitive Science, 1102-1108. London: Macmillan.
Clancey, William J.
1997 Situated Cognition: On Human Knowledge and Computer Representations. New York: Cambridge University Press.
Clark, Andy
1997 Being There - Putting Brain, Body and World Together Again. Cambridge, MA: MIT Press.
1999 An embodied cognitive science? Trends in Cognitive Sciences 3 (9): 345-351.
Clark, Andy and David Chalmers
1998 The extended mind. Analysis 58 (1): 7-19.
Damasio, Antonio R.
1994 Descartes' Error: Emotion, Reason, and the Human Brain. New York: Avon Books.
1999 The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt Inc.
2003 Looking for Spinoza: Joy, Sorrow and the Feeling Brain. New York: Harcourt Inc.
Gallagher, Shaun
2005 How the Body Shapes the Mind. Oxford: Oxford University Press.
Gardner, Howard
1987 The Mind's New Science. New York: Basic Books.
Gibbs, Ray
2006 Embodiment and Cognitive Science. Cambridge: Cambridge University Press.
Goldin-Meadow, Susan
2003 Hearing Gesture: How Our Hands Help Us Think. Cambridge, MA: Harvard University Press.
Harnad, Stevan
1990 The Symbol Grounding Problem. Physica D, 42: 335-346.
Hutchins, Edwin
1995 Cognition in the Wild. Cambridge, MA: MIT Press.
Kirsh, David and Paul Maglio
1994 On Distinguishing Epistemic from Pragmatic Action. Cognitive Science, 18 (4): 513-549.
Lakoff, George and Mark Johnson
1999 Philosophy in the Flesh: The Embodied Mind and its Challenges to Western Thought. New York: Basic Books.
Lakoff, George and Rafael Núñez
1999 Where Mathematics Comes From. New York: Basic Books.
Maturana, Humberto and Francisco J. Varela
1980 Autopoiesis and Cognition: The Realization of the Living. Dordrecht, The Netherlands: D. Reidel Publishing.
1987 The Tree of Knowledge - The Biological Roots of Human Understanding. Boston: Shambhala.
Núñez, Rafael
1999 Could the Future Taste Purple? Reclaiming Mind, Body and Cognition. Journal of Consciousness Studies, 6 (11-12): 41-60.
Núñez, Rafael and Walter J. Freeman (eds.)
1999 Reclaiming Cognition - the Primacy of Action, Intention and Emotion. Exeter: Imprint Academic.
Pfeifer, Rolf and Christian Scheier
1999 Understanding Intelligence. Cambridge, MA: MIT Press.
Pfeifer, Rolf and Josh Bongard
2006 How the Body Shapes the Way We Think: A New View of Intelligence. Cambridge, MA: MIT Press.
Robbins, Philip and Murat Aydede
in press Cambridge Handbook on Situated Cognition. Cambridge, U.K.: University of Cambridge Press.
Suchman, Lucy
1987 Situated Action. Cambridge, MA: MIT Press.
Svensson, Henrik, Jessica Lindblom and Tom Ziemke
this vol. Making sense of embodied cognition: simulation theories of shared neural mechanisms for sensorimotor and cognitive processes.
Thompson, Evan
2007 Mind in Life: Biology, Phenomenology, and the Sciences of Mind. Cambridge, MA: Harvard University Press.
Varela, Francisco J., Evan Thompson and Eleanor Rosch
1991 The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.
Wilson, Margaret
2002 Six views of embodied cognition. Psychonomic Bulletin and Review 9 (4): 625-636.
Ziemke, Tom
1999 Rethinking Grounding. In: Alexander Riegler, Markus Peschl and Astrid von Stein (eds.), Understanding Representation in the Cognitive Sciences: 177-190. Plenum Press: New York.
2001a Are robots embodied? In: Christian Balkenius et al. (eds.), Proceedings of the First International Workshop on Epigenetic Robotics, 75-83. Lund, Sweden: Lund University Cognitive Studies, vol. 85.
2001b The construction of 'reality' in the robot. Foundations of Science 6 (1): 163-233.
2003 What's that thing called embodiment? In: Richard Alterman and David Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum.
2007 What's life got to do with it? In: Antonio Chella and Riccardo Manzotti (eds.), Artificial Consciousness: 48-66. Exeter: Imprint Academic.
Ziemke, Tom (ed.)
2002 Special issue on situated and embodied cognition. Journal of Cognitive Systems Research 3 (3).
Zlatev, Jordan
1997 Situated Embodiment. Studies in the Emergence of Spatial Meaning. Stockholm: Gotab Press.
2001 The epigenesis of meaning in human beings, and possibly in robots. Minds and Machines, 11 (2): 155-195.
2003 Meaning = Life (+ Culture). An outline of a unified biocultural theory of meaning. Evolution of Communication 4 (2): 253-296.
Section A Historical roots
We are live creatures: Embodiment, American Pragmatism and the cognitive organism Mark Johnson and Tim Rohrer
Abstract The philosophical tradition mistakenly asks how the inside (i.e., thoughts, ideas, concepts) can represent the outside (i.e., the world). This trap is a consequence of the view that mind and body must be two ontologically different entities. On this view the problem of meaning is to explain how disembodied "internal" ideas can represent "external" physical objects and events. Several centuries have shown that given a radical mind-body dichotomy, there is no way to bridge the gap between the inner and the outer. When "mind" and "body" are regarded as two fundamentally different kinds, no third mediating thing can exist that possesses both the metaphysical character of inner, mental things and simultaneously possesses the character of the outer, physical things. Embodied Realism, in contrast to Representationalist theories, rejects the notion that mind and body are two ontologically distinct kinds, and it therefore rejects the attendant view that cognition and language are based on symbolic representations inside the mind of an organism that refer to some physical thing in an outside world. Instead, the terms "body" and "mind" are simply convenient shorthand ways of identifying aspects of ongoing organism-environment interactions - and so cognition and language must be understood as arising from organic processes. We trace the rejection of this mind-body dualism from the philosopher-psychologists known as the early American Pragmatists (James and Dewey) forward through recent cognitive science (such as Varela, Maturana, Edelman, Hutchins, Lakoff, Johnson, Brooks). We argue that embodied realism requires a radical reevaluation of the classical dualistic metaphysics and epistemology - especially the classical Representationalist theory of mind - and we conclude by investigating the implications for future investigations for a new, pragmatically-centered cognitive science. 
Keywords: cognitive linguistics, cognitive science, embodiment, image schema, metaphor, neurobiology, pragmatism, Representationalism, semantics.
1. Introduction: What difference does embodied realism make?
When a young child crawls toward the fire in the hearth and a mother snatches up the child before the child gets burned, is that cognition? When a team of British mathematicians decodes enemy ciphers during wartime, is that cognition? When ants carrying food back to their nest lay down chemical signals and thereby mark trails to a food source, is that cognition? Note the commonalities among these situations. In each case the body (both individual and social) is in peril. First, the well-being and continued successful functioning of the organism is at risk. Action must be undertaken to ensure the continued flourishing of the living, physical, individual body of the organism. To survive and flourish, the organism must make adjustments in its way of acting, both within its current environment and in its relations with other creatures. The child must be snatched from the imminent danger of the flames, the mathematicians desperately work to prevent their country from being overrun by the enemy, and the ants must find food and bring it back to the queen in order for the colony to survive. Second, note that in each case the cognition is social, composed of multiple organisms co-operating in response to current or anticipated problems posed by the environment. That environment is not merely physical but also includes the social "body" - whether the family, the nation or the ant colony - whose survival and flourishing is at risk. And finally, note that each of these situations has been taken by theorists as emblematic of cognition par excellence (Dewey 1925; Hodges 1983: 160-241; Deneubourg et al. 1983; Brooks and Flynn 1989). The importance of embodiment in cognition is now widely appreciated in the cognitive sciences, yet there remains considerable debate as to what the term "embodiment" actually means (Rohrer 2001a; in press; Ziemke 2003; Anderson 2003). Is "the body" merely a physical, causally determined entity? Is it a set of organic processes?
Is it a felt experience of sensations and movement? Is it the individual physical body, or does it include the social networks such as families without which it would cease to exist? Or is the body a socially and culturally constructed artifact? In this chapter, we argue that each of these views contributes something important to an adequate theory of embodied cognition, and that a proper understanding of embodiment can be found within the philosophical context first elaborated in early American Pragmatism in the works of thinkers such as William James and John Dewey. As we see it, embodiment theory inherits several key tenets of how these Pragmatist philosophers viewed cognition:
(1) Embodied cognition is the result of the evolutionary processes of variation, change and selection.
(2) Embodied cognition is situated within a dynamic ongoing organism-environment relationship.
(3) Embodied cognition is problem-centered, and it operates relative to the needs, interests, and values of organisms.
(4) Embodied cognition is not concerned with finding some allegedly perfect solution to a problem, but one that works well enough relative to the current situation.
(5) Embodied cognition is often social and carried out cooperatively by more than one individual organism.
Note that the Pragmatists advance a radically different view of cognition than the one we are most familiar with from "classical" cognitive science, where it is assumed that cognition consists of the application of universal logical rules that govern the manipulation of "internal" mental symbols, symbols that are supposedly capable of representing states of affairs in the "external" world. Fodor summarizes this theory as follows:
What I am selling is the Representational Theory of Mind ... At the heart of the theory is the postulation of a language of thought: an infinite set of 'mental representations' which function both as the immediate objects of propositional attitudes and as the domains of mental processes. (Fodor 1987: 16-17)
These internal representations in the "language of thought" acquire their meaning by being "about" - or referring to - states of affairs in the external world. Fodor acknowledges that his Representationalist theory of meaning requires "a theory that articulates, in nonsemantic and nonintentional terms, sufficient conditions for one bit of the world to be about (to express, represent, or be true of) another bit" (Fodor 1987: 98). Typically the first "bit" would be a symbol in the internal language of thought while the second "bit" that it represents might be either some thing or event in the external world or else a brain state underlying a conception of some fictive entity or scene. The internal/external split that underlies this view presupposes that cognition could be detached from the nature and functioning of specific bodily organisms, from the environments they inhabit and from the problems that provoke cognition. Given this view, it would follow that cognition could take place in any number of suitable media, such as a human brain or a machine. This theoretical viewpoint, functionalism, was instrumental in developing the first electronic calculating machines and general-purpose computers. In fact, these machines were originally developed by the British military to reduce the tedious workload of military mathematicians (or human "computers" - in the sense of humans who compute). But this thought experiment did not end merely with offloading the tedium of calculation onto electronic machines. From its original conception in the work of Alan Turing (1937), the idea of a universal computing machine became the metaphor of choice for future models of the brain. For example, in their conception of the brain as a physical symbol system, Newell and Simon (1976) consider the human brain to be just a specific instance of a Turing-style universal machine. In short, for classical cognitive science cognition is defined narrowly as mathematical and logical computation with intrinsically meaningless internal symbols that can supposedly be placed in relation to aspects of the external world. The Pragmatist challenge to classical cognitive science should come as no surprise, since one of the Pragmatists' chief targets was the tendency within the philosophical tradition to assume that what demarcates "rational" humans from "lower" animals is the supposedly unique ability of humans to engage in symbolic representation between internal thoughts/language and the external world. The remedy offered by the Pragmatists is based on their view that cognition is action, rather than mental mirroring of an external reality. Moreover, cognition is a particular kind of action - a response strategy that applies some measure of forethought in order to solve some practical real-world problem. During World War II the practical problem of breaking the German codes was of utmost importance to the British war effort, and this led to the development of a series of machines (the Bombes) which could try a vast number of possible cipher keys against intercepted German communications.
These decoding machines were among the predecessors of the modern computer. Early computers were designed to model human action - computing possible cipher keys so that machines would replace human labour (Hodges 1983: 160-241). However, this success in the modelling of a very specific intellectual operation was soon mistakenly regarded as the key to understanding cognition in general. If one thinks that mathematical and logical reasoning is what distinguishes human beings from other animals, one might erroneously assume that any computational machine that could model aspects of this peculiarly human trait could also be used to model cognition in general. Hence the MIND AS COMPUTER metaphor swept early (first-generation) cognitive science. This is a disembodied view of rationality. By
contrast, on the Pragmatist view, our rationality emerges from, and is shaped by, our embodied nature. Thus, Dewey famously asserted that "to see the organism in nature, the nervous system in the organism, the brain in the nervous system, the cortex in the brain is the answer to the problems which haunt philosophy" (Dewey 1925: 198). In the following sections we show how the Pragmatist view of cognition as action provides an appropriate philosophical framework for the cognitive science of the embodied mind. We begin by describing the nondualistic, non-Representationalist view of mind developed by James and Dewey. Their understanding of situated cognition is reinforced by recent empirical research and developments within the cognitive sciences. We cite evidence from comparative neurobiology of organism-environment coupling ranging from the amoeba all the way up to humans, and we argue that in humans this coupling process becomes the basis of meaning and thought. We describe the patterns of these ongoing interactions as image schemas that ground meaning in our embodiment and yet are not internal representations of an external reality. This leads to an account of an emergent rationality that is embodied, social and creative.
2. James and Dewey: The continuity of embodied experience and thought
In many ways the American Pragmatist philosophers James and Dewey provide us today with exemplary non-reductionist and non-Representationalist models of embodied mind. Their models combined the best biology, psychology and neuroscience of their day with nuanced phenomenological description and a commitment that philosophy should address the pressing human problems of our lives. James and Dewey understood something taken for granted in contemporary biological science: cognition emerges from the embodied processes of an organism that is constantly adapting to better utilize relatively stable patterns within a changing environment. One problem for such a naturalistic account of mind is to explain how meaning, abstract thinking, and formal reasoning could emerge from the basic sensorimotor capacities of organisms as they interact with the environment and each other. The fundamental assumption of the Pragmatists' naturalistic approach is that everything we attribute to "mind" - perceiving, conceptualizing, imagining, reasoning, desiring, willing, dreaming - has emerged (and continues to develop) as part of a process in which an organism seeks to survive, grow and flourish within different kinds of situations. As James puts it:
Mental facts cannot be properly studied apart from the physical environment of which they take cognizance. The great fault of the older rational psychology was to set up the soul as an absolute spiritual being with certain faculties of its own by which the several activities of remembering, imagining, reasoning and willing, etc. were explained, almost without reference to the peculiarities of the world with which these activities deal. But the richer insight of modern days perceives that our inner faculties are adapted in advance to the features of the world in which we dwell, adapted, I mean, so as to secure our safety and prosperity in its midst. (James 1900: 3)
This evolutionary embeddedness of the organism within its changing environments, and the development of thought in response to such changes, ties mind inextricably to body and environment. The changes entailed by such a view are revolutionary. From the very beginning of life, the problem of knowledge is not how so-called internal ideas can re-present external realities. Instead, the problem of knowledge is to explain how structures and patterns of organism-environment interaction can be adapted and transformed to help deal constructively with changing circumstances that pose new problems, challenges and opportunities for the organism. On this view, mind is never separate from body, for it is always a series of bodily activities immersed in the ongoing flow of organism-environment interactions that constitutes experience. In Dewey's words:
Since both the inanimate and the human environment are involved in the functions of life, it is inevitable, if these functions evolve to the point of thinking and if thinking is naturally serial with biological functions, that it will have as the material of thought, even of its erratic imaginings, the events and connections of this environment. (Dewey 1925: 212-213)
Another way of expressing this rootedness of thinking in bodily experience and its connection with the environment is to say that there is no rupture in experience between perceiving, feeling and thinking. In explaining ever more complex "higher" functions, such as consciousness, self-reflection and language use, we do not postulate new ontological kinds of entities, events, or processes that are non-natural or super-natural. More complex levels of organic functioning are just that - levels - and nothing more, although there are emergent properties of "higher" levels of functioning. Dewey names this connectedness of all cognition the principle of continuity, which states that "there is no breach of continuity between operations of inquiry and biological operations and physical operations. 'Continuity' [...] means that rational operations grow out of organic activities, without being identical with that from which they emerge" (Dewey 1938: 26). What the continuity thesis entails is that any explanation of the nature and workings of mind, even the most abstract conceptualization and reasoning, must have its roots in our organismic capacities for perception, feeling, object manipulation and bodily movement. Furthermore, social and cultural forces are required to develop these capacities to their full potential, including language and symbolic reasoning. Infants do not speak or discover mathematical proofs at birth; Dewey's continuity thesis requires both evolutionary and developmental explanations. For James and Dewey, this meant that a full-fledged theory of human cognition must have at least three major components:
(1) There must be an account of the emergence and development of meaningful patterns of organism-environment interactions - patterns of sensorimotor experience shared by all organisms of a certain kind and meaningful for those organisms. Such patterns must be tied to the organism's attempts to function within its environment.
(2) There must be an account of how we can perform abstract thinking using our capacities for perception and motor response. There would need to be bodily processes for extending sensorimotor concepts and logic for use in abstract reasoning, as well as an account of how the processes embodying such abstract reasoning capacities are learned during organismic development. This story has at least two parts: (a) an evolutionary and physiological account explaining how an adult human being's abstract reasoning utilizes the brain's perceptual and motor systems, and (b) a developmental and anthropological account of how social and cultural behaviors educate the sensorimotor systems of successive generations of children so that they may speak and perform abstract reasoning.
(3) There must be an account of how values and behavioural motivations emerge from the organism's ongoing functioning. This explanation will include (a) the physical and social makeup of organisms, (b) the nature of their emotional responses, and (c) the kinds of environments (e.g., material, social, cultural) they inhabit.
In the present space we are able to offer only a very compressed and partial treatment of such an account.
3. Organism-environment coupling
3.1. Maturana and Varela: From chemotaxis to the nervous system
Dewey's principle of continuity states that there are no ontological gaps between the different levels of an organism's functioning. One way to see what this entails is to survey a few representative types of organism-environment couplings, starting with single-cellular organisms and moving up by degrees to more complex animals. In every case we can observe the same adaptive process of interactive co-ordination between a specific organism and recurring characteristics of its environment. But does that mean that we can trace human cognition all the way back to the sensorimotor behavior of single-cellular organisms? On the face of it, this seems preposterous - viewed from an evolutionary biologist's perspective, there are clear differences in the size, complexity and structural differentiation of human beings as compared with single-cellular organisms like bacteria. Single-cellular organism behaviour is not ordinarily relevant to the behaviour of multi-cellular organisms - except insofar as there might be structural morphological analogies between the sensorimotor activity of single-cellular organisms and particular sensorimotoric cells within the multi-cellular body. Just this sort of morphological analogy plays a key role in Maturana and Varela's argument that central nervous systems evolved in multi-cellular organisms to co-ordinate sensorimotor activity (Maturana and Varela 1998: 142-163). In a single-cellular organism locomotion is achieved by dynamically coupling the sensory and motoric surfaces of the cell membrane. When an amoeba engulfs a protozoan, its cell membranes are responding to the presence of the chemical substances that make up the protozoan, causing changes in the consistency of the amoeba's protoplasm. These changes manifest as pseudopods - digitations that the amoeba extends around the protozoan as it prepares to feed upon it.
Similarly, certain bacteria have a tail-like membrane structure called a flagellum that is rotated like a propeller to move the bacterium. When the flagellum is rotated in one direction the bacterium simply tumbles, while reversing the direction of rotation causes the bacterium to move. If a grain of sugar is placed into the solution containing this bacterium, chemical receptors on the cell membrane sense the sugar molecules. This causes a membrane change in which the bacterium changes the direction of rotation of its flagellar propeller and gradually moves toward the greatest concentration of the sugar molecules (chemotaxis). In both cases, changes in the chemical environment cause sensory perturbations in the cellular membrane, which invariably produce movement. The key point here is that, without anything like an internal representation, single-cellular organisms engage in sensorimotor coordination in response to environmental changes. Even at this apparently primitive level, there is a finely tuned ongoing coupling of organism and environment. Multi-cellular organisms also accomplish their sensorimotor coordination by means of changes in their cell membranes. However, the cellular specialization afforded by a multi-cellular organism means that not every cell needs to perform the same functions. Maturana and Varela (1998) discuss the example of an evolutionarily ancient metazoic organism called the Hydra (a coelenterate). The Hydra, which lives in ponds, is shaped like a two-layered tube with four or six tentacles emanating from its mouth. On the inside layer of the tube, most cells secrete digestive fluids, while the outside layer is partly composed of radial and longitudinal muscle cells. Locomotion is accomplished by contracting muscle cells along the body of the organism: some of these contractions cause changes in the hydrostatic pressure within the organism, changing its shape and direction of locomotion. Between the two layers of cells, however, are specialized cells - neurons - with elongated membranes that can extend over the length of the entire organism before terminating in the muscle cells. These tail-like cellular projections are the axons, and evolutionarily speaking they are the flagella of the multi-cellular organism.¹ Changes in the electrochemical state in other, smaller cellular projections of the cells (the dendrites) cause larger changes in the electrochemical state of the axonal membrane, which in turn induces the muscle cells to contract.
1. Recent research shows that this may be more than a surface morphological analogy: all microtubular cellular projections stem from a common ancestor (Erickson et al. 1996; Goldberg 2003).
These neural signals typically originate in either the tentacles or the "stomach" of the Hydra, such that their electrochemical state responds to the molecules indicating the presence or absence of food and/or excessive digestive secretions. These neurons consistently terminate in the longitudinal and radial muscles that contract the Hydra body for locomotion or for swallowing. The topology of how the nerve cells interconnect is crucially important: when touched, a chain of neurons fire sequentially down a Hydra tentacle toward its mouth and cause the muscle cells to curl the tentacle about its prey even as its mouth begins to open. The Hydra does not "represent" an external world; instead, the structural coupling between organism and environment allows the Hydra to contract the correct muscles to swallow, or to move up and left, or right and down. Like the Hydra opening its mouth as a reflexive part of bringing food to it with its tentacles, we humans think in order to act and we act as part of our thinking - cognition is action. But how is it that we humans can learn new behaviors, while the Hydra generally cannot?
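The bacterial chemotaxis described in this section can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions, not a model taken from Maturana and Varela: the concentration function, the step size and the two tumbling probabilities are all hypothetical, and the sketch assumes that the cell can compare its current receptor reading with the reading one step earlier. The simulated cell has no chart of where the sugar lies; it only runs or tumbles according to the change it has just registered.

```python
import math
import random


def concentration(x, y, source=(0.0, 0.0)):
    """Hypothetical sugar concentration: highest at the source, falling with distance."""
    return 1.0 / (1.0 + math.hypot(x - source[0], y - source[1]))


def run_and_tumble(start=(20.0, 0.0), steps=2000, speed=0.1, seed=0):
    """Simulate one cell and return its final (x, y) position.

    Assumed rule (illustrative values): keep running while the sensed
    concentration rises, and tumble to a random heading more often when it falls.
    """
    rng = random.Random(seed)
    x, y = start
    heading = rng.uniform(0.0, 2.0 * math.pi)
    previous = concentration(x, y)
    for _ in range(steps):
        # Run: move one step along the current heading.
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
        current = concentration(x, y)
        # Tumble rarely if the reading improved, often if it got worse.
        p_tumble = 0.05 if current > previous else 0.5
        if rng.random() < p_tumble:
            heading = rng.uniform(0.0, 2.0 * math.pi)
        previous = current
    return x, y


if __name__ == "__main__":
    final_x, final_y = run_and_tumble()
    print("start distance from source:", 20.0)
    print("final distance from source:", round(math.hypot(final_x, final_y), 2))
```

The design choice worth noting is that the only quantity the cell carries from one moment to the next is a single previous reading. Movement toward the sugar arises from the coupling between that reading and the flagellar response, which is the sense in which the text speaks of sensorimotor coordination without an internal representation.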
3.2. From neural maps to neural plasticity
Although still surprisingly continuous with the Hydra, human cognition is a little more similar to what happens in frogs, owls and monkeys in that all of these organisms have nervous systems that include neural maps and adaptive neural plasticity. Frogs have a certain regularly occurring pragmatic problem - they need to extend their tongues to eat a fly - which was the subject of a classic experiment in the early history of neurobiology (Sperry 1943). When a frog is still a tadpole, it is possible to rotate the frog's eye 180 degrees, making sure to keep the optic nerve intact. The tadpole is then allowed to develop normally into a frog. The frog's tongue extends to exactly the opposite point of the frog's visual field from where the fly is located. No amount of failure at catching the fly will teach the frog to move its tongue differently; the frog acts entirely on the basis of the rewired neural connections between the retinal image and the tongue muscles. Maturana and Varela conclude that for the frog "there is no such thing as up or down, front and back, in reference to an outside world, as it exists for the observer doing the study" (1998: 125-126). The frog has no access to our notion of the external world and our 180-degree rotation of its eye; it has only its experience of the world found in the neurons comprising its (experimentally inverted) retinal map. One of the most profound findings in neuroscience is that nervous systems exploit topological and topographic organization. In other words, organisms build neural "maps". In neural maps, adjacent neural cells (or small groups of neural cells) fire sequentially when a stimulus in adjacent positions within a sensory field moves. For example, scientists have stimulated the frog's visual field and measured the electrical activity of a region of its brain to show that as one stimulates the frog's visual field, the
neurons of its optic tectum will fire in co-ordination with the visual stimulus. Fraser (1985) covered the frog's optic tectum with a 24-electrode grid, with each electrode recording electrical activity that was the sum of the signals from a receptive field containing many optic nerve fiber terminals. When a point of light was moved in a straight line from right to left and then from bottom to top in the frog's right visual field, the electrode grid recorded neuronal activity in straight lines, firing sequentially, first from the rostral (front) to the caudal (back) and then from the lateral to the medial. We call this the frog's retinal (or retinotectal) map because it encodes environmental visual stimuli in a topographically consistent manner. The spatial orientation of this topography is rotated in various ways. Thus visual right-to-left has become front-to-back and so on, but the topographic mapping between movement in the vertical visual plane and the plane of the retinotectal neural map remains consistent. Even though there is considerable spatial distortion in the neural map, the key relational structures are preserved. In some other cases, such as some auditory maps and colour maps, where the correspondences can be less about shape and position, the organization is more properly called topologic than topographic, but the organizing principle of the neural mapping of sensation still holds. The degree to which such neural maps might be plastic has been the subject of much recent study. In the case of rotating the eye of the frog, Sperry performed a radical and destructive intervention that is outside the realm of "normal" Darwinian deviation - in other words, if this were to occur by natural selection such a frog would die quickly without passing on its genes.
However, interventions which are less radical and perhaps more likely to occur in nature, such as cutting the optic nerve and destroying part of the optic tectum of a goldfish, result in a recovery of function in which the optic nerve axons regenerate to make a complete retinal map in the remaining part of the tectum (Gaze and Sharma 1970). Although radical interventions can "break" the neural maps, even the more evolutionarily determined neural networks exhibit some range of adaptive neural plasticity to environmental factors. Plasticity is particularly profound in cross-modal neural maps. Consider another subtle environmental intervention: suppose we were to have an owl wear glasses that changed its perception of the visual field. Similar to the frog, owls have developed an extremely accurate method of attacking prey. The owl hears a mouse rustling on the ground and locates the mouse using the tiny difference in time it takes for a sound to reach one ear versus the time it takes the sound to reach the owl's other ear. This establishes the
mouse's approximate position in the owl's retinotectal map, and the diving owl then visually confirms the exact location of its prey before it strikes. Knudsen and colleagues (Knudsen 1998, 2002) put prismatic glasses on adult and juvenile owls which distorted the owls' vision by 23 degrees. After 8 weeks with glasses, adults raised normally never learned to compensate, but juveniles were able to learn to hunt accurately. Moreover, when the glasses were reintroduced to the adult owls who had worn them as juveniles, they were then able to readjust to the glasses in short order; in other words, the prism-reared owls could successfully hunt with or without glasses. These behavioural adaptations have anatomical underpinnings in the plasticity of the neural maps. When the owls were injected with an anatomical tracing dye, comparison of the neural arbors from normally-reared and prism-reared owls revealed a different pattern of axonal projections between auditory and spatial neural maps, "showing that alternative learned and normal circuits can coexist in this network" (Knudsen 2002: 325). In other words, in order to deal with wearing glasses, the owl brain had grown permanent alternative axonal connections in a cross-modal neural map of space located in the external nucleus of the inferior colliculus (ICX). The ICX neural arbor of prism-reared owls was significantly denser than in normally developing owls, with neurons typically having at least two distinct branches of axons (DeBello, Feldman and Knudsen 2001). By contrast, the retinotectal maps of the visual modality alone do not exhibit the same plasticity, either in owls (whose retinotectum did not change) or in frogs. Analogous anatomical research on frogs reared and kept alive with surgically rotated eyes has shown that after five weeks, the retinotectal neural arbors initially exhibited a similar pattern of "two-headed axons"; that is, they had two major axonal branches.
However, after ten weeks the older axonal connections are starting to decay and disappear, while after sixteen weeks no two-headed axons could be traced (Guo and Udin 2000). Apparently, the frog's single-modal retinotectal maps do not receive enough reentrant neural connections from other sensory modalities to sustain the multiple branching neural arbors found in the cross-modal map of the prism-reared owls. Working on neural plasticity in adult squirrel and owl monkeys, Merzenich and colleagues (Merzenich et al. 1987; reviewed in Buonomano and Merzenich 1998) have shown that it is possible to dynamically reorganize the sensorimotor cortical maps subject to certain bodily constraints. Similar to the owls and frogs that grew dual arborizations, these monkeys exhibited
a plasticity based on their brains' ability to select which parts of their neural arbors to use for various kinds of input. In a series of studies, Merzenich and colleagues altered the monkey's hand sensory activity by such interventions as (1) cutting a peripheral nerve such as the medial or radial nerve and (1a) allowing it to regenerate naturally or (1b) tying it off to prevent regeneration; (2) amputating a single digit; and (3) taping together two digits so that they could not be moved independently. The results show that cortical areas now lacking their previous sensory connections (or independent sensory input in the third condition) were "colonized" in a couple of weeks by adjacent neural maps with active sensory connections. In other words, the degree of existing but somewhat dormant neural arbor overlap was large enough to permit reorganization. And in the case of (1a), where the nerve was allowed to regenerate, the somatosensory map gradually returned to occupy a similar-sized stretch of cortex, albeit with slightly different boundaries. Learning in adults is accomplished in part by neural gating between redundant and overlapping neural arbors. All of these examples of ontogenetic neural change suggest that there is a process of neural arbor selection akin to natural selection taking place in concert with specific patterns of organism-environment interactions. On precisely these grounds the neurobiologist Gerald Edelman (1987) has proposed a theory of "Neural Darwinism", or "neuronal group selection", to explain how such neural maps are formed in the organism's embryonic development. Different groups of neurons compete to become topological neural maps as they migrate and grow during neural development. Successful cortical groups, driven primarily by regularities in the environment passed on from those neurons that are closer to a sensory apparatus, will fire together and wire together in a process of axonal sprouting and synaptogenesis.
Some neuronal groups will fail to find useful topological connections, and they eventually die and are crowded out by the successful neuronal groups, while others will hang on in something of an intermediate state of success (Edelman 1987: 127-140). In the adult organism, the latent axonal arbors from only partly successful attempts to wire together lie dormant, ready to reorganize the map as needed by means of further synaptogenesis. Edelman (1987: 43-47) calls these latent reorganizations of the neuronal groups secondary repertoires, as distinguished from their normal primary repertoires. Like frogs, owls and monkeys, we humans have sets of visual, auditory and somatosensory neural maps. The more obvious of these map perceptual space in fairly direct analogs - preserving topologies of pitch, the retinal
field, colour, the parts of the body and so on - but subsequent maps preserve increasingly abstract topological structure (or even combinations of structure) such as object shape, edges, orientation, direction of motion and even the particular degree of the vertical or horizontal. Like the frog, we live in the world of our maps. Topologically speaking, our bodies are in our minds, in the sense that our sensorimotor maps provide the basis for conceptualization and reasoning. We perceive the patterns of our daily organism-environment interactions in image-like fashion, constantly seeking out various topological invariances in those patterns that prove useful to us. In the following section we will show how our imagination and our reason are constituted by patterns of activation within these neural maps. But before proceeding to human cognition, we must first address why neural "maps" are not classical Representations.
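The claim made earlier in this section, that a neural map can rotate and distort the sensory field while still preserving its key relational structure, can be illustrated with a toy mapping. This is a sketch under loud assumptions: the function `tectal_position` and its stretching factors are hypothetical and are not drawn from Fraser's recordings. The sketch only shows that a point moving steadily right to left, then bottom to top, is mapped to positions that advance in one consistent order along each axis of the map, even though the steps between them are uneven.

```python
def tectal_position(vx, vy):
    """Map a visual-field point to a hypothetical position on the tectum.

    vx runs from 0 (left) to 1 (right); vy runs from 0 (bottom) to 1 (top).
    Returns (rostro_caudal, latero_medial); larger values lie further
    toward the caudal end and the medial side respectively.
    """
    # Visual right-to-left becomes front-to-back, unevenly stretched.
    rostro_caudal = 3.0 * (1.0 - vx) ** 2
    # Visual bottom-to-top becomes lateral-to-medial, also unevenly stretched.
    latero_medial = 2.0 * vy ** 0.5
    return rostro_caudal, latero_medial


def is_ordered(values):
    """Return True if each value is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# A point of light moving right to left, then bottom to top (11 samples each).
right_to_left = [(1.0 - i / 10, 0.5) for i in range(11)]
bottom_to_top = [(0.5, i / 10) for i in range(11)]

rc_track = [tectal_position(vx, vy)[0] for vx, vy in right_to_left]
lm_track = [tectal_position(vx, vy)[1] for vx, vy in bottom_to_top]

if __name__ == "__main__":
    print("order preserved, rostral to caudal:", is_ordered(rc_track))
    print("order preserved, lateral to medial:", is_ordered(lm_track))
```

Neighbouring stimulus positions land on neighbouring map positions, which is all that "topographically consistent" requires here; the equal spacing of the stimulus is not preserved.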
3.3. Neural maps are not internal representations
Some people might suppose that talk of neural "maps" would necessarily engender Representationalist theories of cognition. On this view, the map would be construed as an internal representation of some external reality. But the account we have been giving does not entail any of the traditional metaphysical dualisms that underlie Representationalist views - dichotomies such as inner/outer, subject/object, mind/body, self/world. Such dichotomies might describe aspects of organism-environment interactions from an observer's perspective, but they do not indicate different ontological entities or structures. According to our interactionist view, maps and other structures of organism-environment co-ordination are prime examples of non-representational structures of meaning, understanding and thought.²
Maturana and Varela (1998: 125-126) make this important philosophical point quite clear. We must not read our scientific or philosophical perspectives (i.e., our theoretical stance) on cognition back into the experience itself that we are theorizing about. We must not uncritically assume that distinctions we make in explaining a certain cognitive experience are thereby part of the person's experience. To do so is to fall prey to what James termed the "Psychologist's Fallacy". In observing something scientifically, one must always consider the standpoint of the scientist in relation to the object of study. When we use terms such as "retinal map", "pitch maps", "sensorimotor maps", "color maps" and so forth to describe the operations of various neural arrays in a frog's nervous system, or in human nervous systems, we are doing so from our standpoint as observers and theorists who can see mappings between those neural structures and our own experience of the "external world". But for the frog, and for the human in the act of perceiving, that map is the basis for its experience of the world. The map constitutes the sensorimotor experience of a certain part of the frog's world. The frog's neural map itself has its origin not in the immediate mappings that we observers see in the moment, but in a longitudinal evolutionary and developmental process during which those neural connections were "selected for" by Darwinian or neo-Darwinian mechanisms. In short, what we (as scientists) theoretically recognize and describe as an organism's "maps" are not for that organism internal representations. Rather, what we call sensorimotor and somatosensory maps (whether in multi-cellular organisms, monkeys, or humans) are for that organism precisely the structures of its experienced world! Consequently, we must be careful not to be misled by philosophers of mind and language who would treat these maps as internal representations of external realities, thereby surreptitiously introducing an "inner/outer" split that does not exist in reality for the organism.
2. We are certainly not suggesting that neuroscientists should purge the term "representation" from their vocabulary. Nor are we suggesting that there is no sense in which it would be appropriate to say that some neuronal structure is a representation from the perspective of the scientist who is studying cognitive processes. For example, we do not object to neuroscientists saying that a particular neural map in the auditory cortex can "represent" various pitch relations among musical tones, though we prefer to employ more enactive terms such as "map" and "activation contours". However, such casual usage doesn't necessarily entail the Representational Theory of Mind that we are challenging here. Instead, we argue that Representationalism is based on a mistaken philosophical analogy (namely "the language of thought" framework in which a mental state refers to the world much as a word supposedly simply refers to an object or a state-of-affairs in the world). In order to undermine such Representationalist theories, we argue that actual neural representations are perpetually situated in dynamic organism-environment interactions that are continually changing along experiential, developmental and phylogenetic timelines. Hence, it is a mistake to think that neural maps are representations in virtue of an immediate word-world referential mapping, whether that word is a linguistic entity or a mental entity in a "language of thought".
4. Ontological continuity and human thought: Image schemas and amodal perception
Since the earliest episodes of ancient Greek philosophy, humans have been distinguished from "brute" animals and all lower organisms by their supposedly unique capacity for abstract conceptualization and reasoning. According to this view, human reason is what makes it possible for us to form abstract mental representations that stand for and point to states of affairs that are either external to us or are not currently present in our experience (i.e., are past or future). But the Pragmatists' Continuity Thesis denies the inner/outer dichotomy upon which Representationalist theories are based. Consequently, the problem for an embodied view of cognition is how to explain our marvellous human feats of abstraction, reasoning and symbolic interaction, yet without positing an ontological rupture between "lower" animals and humans. The key, once again, is the coupling (the interactive co-ordination) of an organism (here, a human one) and its environment. Recurring adaptive patterns of organism-environment interaction are the basis for our ability to survive and flourish. In humans, these patterns are no more "internal" representations than they are in other creatures. Let us consider briefly some of the most basic kinds of structural couplings that make up a human being's experience of its world.
4.1. Image schemas and cross-modal perception
The character of our experience is delineated in large part by the nature of our bodies and brains, the kinds of environments we inhabit, and the values and purposes we have. The patterns of our ongoing interactions (or "enactions" as Varela, Rosch and Thompson (1991) have called them, to stress their active, dynamic character) define the contours of our world and make it possible for us to make sense of, reason about, and act reliably within this world. Thousands of times each day we see, manipulate and move into and out of containers, so containment is one of the most fundamental patterns of our experience. Because we have two legs and stand up within a gravitational field, we experience verticality and up-down orientation. Because the qualities (e.g., redness, softness, coolness, agitation, sharpness) of our experience vary continuously in intensity, there is a scalar vector in our world. For example, lights can grow brighter or dimmer, stoves get
hotter or cooler, iced tea gets sweeter as we add sugar. We are subject to forces that move us, change our bodily states and constrain our actions, and all of these forces have characteristic patterns and qualities. We are bound inextricably to our world interactively (enactively) by means of these recurring patterns that are the very conditions for us to survive, grow and find meaning. Without such patterns, and without neural maps of such characteristic patterns, each moment of our experience would be utterly chaotic, as though we had to make sense of our world from scratch, over and over again as each new moment arose. What Johnson (1987) and Lakoff (1987) called "image schemas" are precisely these stable recurring patterns of sensorimotor experience by which we engage a world that we can understand and act within to further our purposes. There are numerous sources of evidence for the existence of image schemas, ranging from experimental psychology to linguistics to developmental psychology. We hypothesize that these image schemas are neurally embodied as patterns of activation in and between our topological neural maps. Image schemas are thus part of our non-representational coupling with our world, just as barn owls and squirrel monkeys have image schemas that define their types of sensorimotor experience. Image schematic structure is the basis for our understanding of all aspects of our perception and motor activities. An example from Lakoff and Núñez (2000) illustrates this image-schematic basis of spatial concepts in humans. What we call our concept IN is defined for us by a CONTAINER image schema that consists generically of (1) a boundary that demarcates (2) an interior from (3) an exterior.
When we say "The car is in the garage", we understand the garage as a bounded space, we profile (Langacker 1986) the interior of that space, and we regard the car as what cognitive linguists call a trajector within that space, with the garage (as container) serving as a landmark in relation to which the trajector is located. Similarly, when we hear the sentence "Grandpa walked from the outhouse to the garage", we understand that situation via a SOURCE-PATH-GOAL schema that consists of (a) a starting point, (b) a destination (endpoint), and (c) a path from the starting location to the destination. In other words, the "from-to" construction is image-schematic. The English word "into" is understood via a superimposition of the SOURCE-PATH-GOAL schema on the CONTAINER schema, as follows:
"in" activates a CONTAINER schema with the interior profiled.
"to" activates a SOURCE-PATH-GOAL schema with the GOAL (endpoint) profiled.
The GOAL (endpoint) is mapped onto the interior of the CONTAINER schema.
We thus understand Grandpa's (as trajector) movement as beginning outside the garage (CONTAINER) and terminating inside the garage (as landmark), as a result of motion along a path from the exterior to the interior. "into" in English is thus an elementary composition of two image schemas. Image schemas are realized as activation patterns (or "contours") in human topological neural maps. As with much interdisciplinary research in the neurosciences, the evidence for this first emerged from intracranial neuronal recordings on monkeys and was later extended to humans via analogous neuroimaging studies. When Rizzolatti and colleagues (Fogassi et al. 2001; see review in Rizzolatti, Fogassi and Gallese 2002) showed macaque monkeys visual imagery of another monkey grasping a banana with their hands, they were able to record activity from "mirror" neurons in the same areas of secondary somatomotor cortex that would be implicated if the monkey himself were performing the particular grasping action. Analogous human neuroimaging experiments (Buccino et al. 2001) in which participants watched a video clip of another person performing an action showed increased activation in the human secondary somatomotor cortices that are known to map human hand and arm grasping motions. Along with Rizzolatti's colleague Gallese, we interpret these and related results as having shown that these neural maps contain image schematic sensorimotor activation patterns for grasping (Gallese and Lakoff 2005). An explicit attempt to model image schemas using known facts about our neural maps can be found within the neurocomputational modelling literature. Regier (1996) has developed what he calls "structured" or "constrained" connectionist neural models for a number of image schemas.
"Constrained" neurocomputational connectionism builds into its neural models a small number of structures that have been identified in research on human visual and spatial processing. These include center-surround cell arrays, spreading activation, orientation-sensitive cells and neural gating.
Regier has shown how these constrained connectionist models of image schemas can learn spatial relations terms.³ There is also a growing body of research from developmental psychology suggesting that infants come into the world with capacities for experiencing image-schematic structures. Stern (1985) described certain types of experiential structures that infants are able to detect, and he argues, first, that these capacities form the basis for meaning and the infant's sense of self; and, second, that these capacities continue to play a central role in meaning, understanding and thinking even in adults who are capable of propositional thinking. Let us briefly consider two of these basic structures: (1) cross-modal perception, and (2) vitality affect contours. Stern begins with a well-known experiment (Meltzoff and Borton 1979) in which blind-folded infants were given one of two pacifiers to suck. One was the typical smooth pacifier, while the other had protruding nubs. When the blindfolds were removed and smooth and nubbed pacifiers were placed on either side of the infant's head, most of the time (roughly 75%) the infant would attend to the nipple of the pacifier just sucked. Based on this and other studies (e.g., Lewkowicz and Turkewitz 1981), Stern suggests that Infants thus appear to have an innate general capacity, which can be called amodal perception, to take information received in one sensory modality and somehow translate it into another sensory modality. [...] These abstract representations that the infant experiences are not sights and sounds and touches and nameable objects, but rather shapes, intensities and temporal patterns - the more "global" qualities of experience. (Stern 1985: 51)
Although he speaks of these structures of cross-modal perception as amodal, abstract "representations", Stern also makes it clear that these perceptual structures are not inner mirrorings of external things but rather are the contours of the infant's experience: the cross-modal shapes, intensities and temporal patterns that we call image schemas. Like infants, we adults have a ROUGH/SMOOTH image schema, which we can use as we anticipate the change in surface texture as we walk. For example, we can see where we will step from the rough carpet of the hallway onto the slippery tile of the bathroom, and we transfer this information from the visual to the somatomotor system so that our feet will not slip. Such patterns of cross-modal perception are especially clear examples of how image schemas differ from being just a topographically mapped image in a neural map; they are sensorimotoric patterns of experience which are instantiated in and coordinated between unimodal neural maps. Our image schematic experience may, as in the case of the owl, become instantiated in its own cross-modal neural map; or, as in the case of monkeys, it might consist of coordinated activation patterns between a network of more modal neural maps, including possibly calling on the secondary rather than primary repertoires of those maps. We predict that cases analogous to each will be observed in human neuroanatomical studies.
3. This does not, of course, prove that human cognition necessarily works this way, but Regier's use of computational neural models built on known human neural architectures offers a number of advantages over traditional PDP connectionist models. Moreover, Regier's models can be appropriated into programs that allow robots to perform certain bodily movements.
A second type of pattern that makes up the infant's (and adult's) image-schematic experience is what Stern (1985) calls "vitality affect contours". Stern illustrates this with the notion of a "rush", or the swelling qualitative contour of a felt experience. We can experience an adrenaline rush, a rush of joy or anger, a drug-induced rush, or the rush of a hot-flash. Even though these rushes are felt in different sensory modalities, they are all characterizable as a rapid, forceful building up or swelling contour of the experience across time. Stern notes that understanding how such affect contours are meaningful to creatures like us gives us profound insight into meaning generally, whether that meaning comes via language, vision, music, dance, touch, or smell. We crave the emotional satisfaction that comes from pattern completion, and witnessing even just a portion of the pattern is enough to set our affect contours in motion.
The infant just needs to see us begin to reach for the bottle, and she already begins to quiet down - the grasping image schema does not even need to be completely realized in time before the infant recognizes the action. When as adults we hear a musical composition building up to a crescendo, this causes increasing emotional tension that is released at the musical climax. The emotional salience of the vitality affect contours in image schemas shows that image schemas are not mere static "representations" (or "snapshots") of one moment in a topographic neural map (or maps). Instead, image schemas proceed dynamically in and through time. To summarize, image schemas can be characterized more formally as:
(1) recurrent patterns of bodily experience,
(2) "image"-like in that they preserve the topological structure of the perceptual whole, as evidenced by pattern-completion,
(3) operating dynamically in and across time,
(4) realized as activation patterns (or "contours") in and between topologic neural maps,
(5) structures which link sensorimotor experience to conceptualization and language, and
(6) structures which afford 'normal' pattern completions that can serve as a basis for inference.
Image schemas constitute a preverbal and pre-reflective emergent level of meaning. They are patterns found in the topologic neural maps we share with other animals, though we as humans have particular image schemas that are more or less peculiar to our types of bodies. However, even though image schemas typically operate without our conscious awareness of how they structure our experience, it is sometimes possible to become reflectively aware of the image-schematic structure of a certain experience, such as when I am consciously aware of my cupped hands as forming a container, or when I feel my body as being off balance.
4.2. Abstract conceptualization and reasoning
Pragmatism's Continuity Thesis claims that we must be able to move, without any ontological rupture, from the body-based meaning of spatial and perceptual experience that is characterizable by image schemas and affect contours, all the way up to abstract conceptualization, reasoning and language use. Although there is not yet any fully worked out theory of how all abstract thought works, some of the central mechanisms are becoming better understood. One particularly important structure is conceptual metaphor (Lakoff and Johnson 1980; 1999). The most sweeping claim of Conceptual Metaphor Theory (CMT) is that what we call "abstract" concepts are defined by systematic mappings from bodily-based sensorimotor source domains onto abstract target domains. These metaphor mappings are found in patterns motivated by image-schematic constraints - for example, if we map an interior from the source domain, we can expect to map the exterior as well; if we have source and destination mappings, we can expect a path mapping.
Consider the sentence We have a long way to go before our theory is finished. Why can we use the phrase a long way to go, which is literally about distance in motion through space, to talk about the completion of a mental project (i.e., developing a theory)? The answer is that there is a conceptual metaphor PURPOSEFUL ACTIVITIES ARE JOURNEYS, via which some cultures understand progress toward some nonphysical goal as progress in moving toward a destination. The metaphor consists of the following conceptual mapping:
The PURPOSEFUL ACTIVITIES ARE JOURNEYS METAPHOR
Source (motion in space) → Target (mental activity)
Starting point A → Initial state
Ending location B → Final State
Destination → Purpose to be achieved
Motion from A to B → Process of achieving purpose
Obstacles to motion → Difficulties in achieving goals
This conceptual mapping also makes use of one of our culture's most basic metaphors for understanding the passage of time, in which temporal change is understood metaphorically as motion along a path to some location. In this metaphor, the observer moves along a time line, with the future arrayed as the space in front of her and the past as the space behind. Consequently, when we hear We have a long way to go until our campaign fund drive is finished, we understand ourselves metaphorically as moving along a path toward the destination (completion of the fund drive), and we understand that there can be obstacles along the way that would slow our progress.
Conceptual Metaphor Theory proposes that all abstract conceptualization works via conceptual metaphor, conceptual metonymy and a few other principles of imaginative extension. To date there is a rapidly growing body of metaphor analyses of key concepts in nearly every conceivable intellectual field and discipline, including the physical and biological sciences, economics, morality, politics, ethics, philosophy, anthropology, psychology, religion and more.
For example, Lakoff and Núñez (2000) have carried out extensive analyses of the fundamental metaphorical concepts that underlie mathematics, from simple models of addition all the way up to concepts of the Cartesian plane, infinity and differential equations. Winter (2001) analyzes several key metaphors that define central legal concepts and are the basis for legal reasoning. Grady (1997) examines
"primary metaphors" (such as PURPOSES ARE DESTINATIONS) that are combined systematically into more complex metaphors (such as PURPOSEFUL ACTIVITIES ARE JOURNEYS).
The reason that conceptual metaphor is so important is that it is our primary means for abstract conceptualization and reasoning. Pragmatism's principle of continuity claims that abstract thought is not disembodied; rather, it must arise from our sensorimotor capacities and is constrained by the nature of our bodies, brains and environments. From an evolutionary perspective this means that we have not developed two separate logical and inferential systems, one for our bodily experiences and one for our abstract reasoning (as a pure logic). Instead, the logic of our bodily experience provides all the logic we need in order to perform every rational inference that we do. In our metaphor-based reasoning, the inferences are carried out via the corporeal logic of our sensorimotor capacities, and then, via the source-to-target domain mapping, the corresponding logical inferences are drawn in the target domain. For example, there is a definite spatial or bodily image-schematic logic of containment that arises in our experience with containers:
(a) An entity is either inside the container or outside it, but not both at once.
(b) If I place an object O within a physical container C and then put container C inside of another container D, then O is in D.
In other words, our bodily encounters with containers and objects that we observe and manipulate teach us the spatial logic of containers. Next, consider the common conceptual metaphor CATEGORIES ARE CONTAINERS, in which a conceptual category is understood metaphorically as an abstract container for physical and abstract entities. For example, we may say: The category 'human' is contained in the category 'animals,' which is contained in the category 'living things.' Similarly, we may ask Which category is this tree in? Based on the inferential image-schematic structure of the source domain, and via the source-to-target mapping, we then have corresponding inferences about abstract concepts:
(a') An entity either falls within a given category, or falls outside it, but not both at once [e.g., Charles cannot be a man and not a man at the same time, in the same place, and in the same manner]. (The Law of the Excluded Middle).
(b') If an entity E is in one category C', and C' is in another category D', then that entity E is in category D' [For example, All men are mortal (C' is in D') and Socrates is a man (E is in C'), therefore Socrates is mortal (E is in D')].
Thus, according to CMT we would then predict that the abstract inferences are "computed" using sensorimotor neural maps, and those inferences are activated as target-domain inferences because there are neural connections from sensorimotor areas of the brain to other areas that are responsible for so-called "higher" cognitive functions. The hypothesis is that human beings don't run an inferential process at the sensorimotor level and then perform an entirely different inferential process for abstract concepts; rather, human beings utilize the inference patterns found in the sensorimotor brain regions to perform "abstract" reasoning. Just as the Pragmatist Principle of Continuity requires, there is no need to introduce a new kind of reasoning (with a different ontological basis) to explain logical reasoning with abstract concepts.
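The claim that a single inference pattern serves both the source and the target domain can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of CMT itself: containment is stored as a dictionary from each item to its immediate container, the function name `is_inside` and the example objects are hypothetical, and the CATEGORIES ARE CONTAINERS mapping is modelled simply by applying the same routine to category names. The sketch makes no claim about how neural maps actually compute such inferences.

```python
def is_inside(item, container, directly_in):
    """Return True if `item` lies inside `container`, directly or by nesting.

    `directly_in` maps each item to its immediate container (assumed to be
    acyclic). Following the chain implements inference (b): if O is in C
    and C is in D, then O is in D.
    """
    current = item
    while current in directly_in:
        current = directly_in[current]
        if current == container:
            return True
    return False


# Source domain: physical containers (hypothetical example objects).
physical = {"coin": "purse", "purse": "suitcase"}

# Target domain: categories treated as containers, using the same
# data structure and the same inference routine (inference b').
categories = {"Socrates": "men", "men": "mortals"}

coin_in_suitcase = is_inside("coin", "suitcase", physical)
socrates_is_mortal = is_inside("Socrates", "mortals", categories)
suitcase_in_coin = is_inside("suitcase", "coin", physical)
```

Here `coin_in_suitcase` and `socrates_is_mortal` are derived by exactly the same steps, which is the point of the continuity argument: no second inference routine is introduced for the abstract case.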
4.3. Evidence for conceptual metaphor and abstract reasoning using conceptual metaphors
Recently several new sources of evidence have become available to explain the possible neural bases for the image-schematic mappings that operate in conceptual metaphors. The new evidence comes from both the patient-based neurological literature and neuroimaging studies of normal adults. While we have long known that patients can develop anomias reflecting selective category deficits for animals, tools and plants (Warrington and Shallice 1984), several recent studies have reported a selective category deficit for body part terms (Suzuki, Yamadori and Fujii 1997; Shelton, Fouch and Caramazza 1998; Coslett, Saffran and Schwoebel 2002). The deficit work suggests that lesions in the secondary motor cortices, in regions which likely contain both somatotopic and egocentric spatial maps, can cause difficulties in tasks such as body part naming, naming contiguous sections of the body, and so on. This finding suggests that the comprehension of body part terms requires the active participation of these neural maps.
Two other neuroimaging studies also show that we can drive the human somatomotor maps with both metaphoric and literal linguistic stimuli relating to the body. In an fMRI study, Hauk, Johnsrude and Pulvermüller (2004) have shown that single word terms such as "smile", "punch" and "kick" differentially activate face, arm/hand and leg regions within the somatomotor maps, suggesting that literal language can differentially activate body-part related somatomotor neural maps. Similarly, an fMRI neuroimaging study by Rohrer (2001b, 2005) shows that both literal and metaphoric sentences using hand terms (e.g., She grasped the apple and He grasped the theory) activate primary and secondary hand regions within the primary and secondary sensorimotor maps. After the presentation of the linguistic stimuli, Rohrer also mapped the hand somatic cortex of each study participant using a tactile hand-stroking task. A comparison between the tactile and the sentential conditions shows a high degree of overlap in the primary and secondary somatomotor cortex for both language tasks, cf. Figure 1.
Figure 1. fMRI activation courses in response to literal and metaphoric action sentences. Areas active and overlapping from a hand somatosensory task were outlined in white (Rohrer 2001b). [Panel label: Metaphoric Hand Sentences]
There is also evidence from neurocomputationally inspired models of conceptual metaphor and abstract reasoning. Building on Regier's work on modelling the image-schematic character of spatial relation terms, Narayanan (1997; Feldman and Narayanan 2004) developed a constrained connectionist network to model how the bodily logic of our sensorimotor systems enables us to perform abstract reasoning about international economics using conceptual metaphors. For example, the system was able to successfully interpret both In 1991, the Indian government deregulated the business sector and In 1991, the Indian government loosened its stranglehold on business. Narayanan's model can perform inferences either entirely within the sensorimotoric domain or in the linguistic domain using common conceptual metaphor mappings. Taken together with the neurophysiological and neuroimaging evidence for image schemas and conceptual metaphors, these neurocomputational models support the image-schematic and metaphoric basis of our language and abstract reasoning.
5. The continuity of embodied social and cultural cognition
In this chapter, we have been presenting evidence for the embodied character of cognition, and we have suggested an appropriate Pragmatist philosophical framework for interpreting that evidence. Contra Representationalism, we have argued that cognition is not some inner process performed by the "mind", but rather is a form of embodied action. We argued this by giving examples of how cognition is located in organism-environment interactions, instead of being locked up in some allegedly private mental sphere of thought. However, an exclusive focus on the organism's engagement and coupling with its environment can lead to the mistaken impression that thought is individual, not social. Therefore, we must at least briefly address the crucial fact that language and abstract reasoning are socially and culturally situated activities.
Thus far, we have discussed only one socio-cultural dimension, albeit a crucially important one, namely, development. Our brief discussion of development was framed more within the context of nervous systems than within socio-cultural interactions. We stressed the point that epigenetic bodily interactions with the world are what shape our neural maps and the image schemas in them. For humans, a very large and distinctive part of that involves interacting with other humans. In other words, human understanding and thinking is social. This raises the question: How do socially and culturally determined factors come to play a role in human cognition?
Perhaps a sceptic might say that the locus of the distinctively human lies in a socially and culturally learned capacity for classical Representationalism. Once again, however, the Representationalist proposal rests on two mistakes. First, there is not a radical ontological break from the rest of the animal kingdom with respect to socially and culturally transmitted behaviors, both in general and specifically in the cases of linguistic and symbolic communication. Second, having challenged the "inner mind" versus "outer body" split, we must not then proceed to replace it with another equally problematic dichotomy - that between the "individual" and the "social". We must recognize that cognition does not take place only within the brain and body of a single individual, but instead is partly constituted by social interactions and relations. The evidence to which we now turn comes from cognitive ethology and distributed cognition.
Of course there are ways in which our socio-cultural behaviors are peculiarly human, but the story is once again much more complex and multi-dimensional than classical Representationalists suppose. Following Maturana and Varela (1998: 180-184) we would define social phenomena as those phenomena arising out of recurrent structural couplings that require the co-ordinated participation of multiple organisms. They argue that just as the cell-to-cell interactions in the transition from single to multi-cellular organisms afford a new level of intercellular structural coupling, so also recurrent interactions between organisms afford a new level of inter-organism structural coupling. The social insects are perhaps the most basic example of this kind of recurrent inter-organism behavior. For example, ants must feed their queen for their colony to remain alive. Individual workers navigate their way to and from the nest and food sources by leaving trails of chemical markers, but these markers are not distinctive to the individual ant. When seeking food, an individual ant moves away from markers dropped by other ants.
Naturally the density of such markers decreases in proportion to the distance from the nest. But when an ant finds food, it begins to actively seek denser clusters of markers, which lead it back to the nest. Furthermore, whenever a worker ant eats, its chemical markers change slightly. These changed markers attract, rather than repel, other ants. Thus the ants gradually begin to form a column leading from a food source to a nest. Note that the ants' cognition is both social, in that it takes place between organisms, and distributed, in the sense that it offloads much of the cognitive work onto the environment. No single ant carries around an "internal representation" or neural map of where the ant colony is. Ant cognition is thus nonrepresentational in that it is both intrinsically social and situated in organism-environment interactions.
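The foraging rule just described can be made concrete with a minimal Python sketch. It is a toy model under loudly stated assumptions, not a description of real ant behavior: the world is a one-dimensional line of cells, the nest sits at cell 0, the marker density is a fixed function that falls off with distance from the nest (rather than being laid down by other ants), and the names `marker_density`, `step` and `forage` are hypothetical. What the sketch shows is that the ant reads only the densities of its two neighbouring cells and never stores the location of the nest.

```python
NEST = 0  # assumed position of the nest on a one-dimensional line


def marker_density(position):
    """Toy assumption: marker density falls off with distance from the nest."""
    return 1.0 / (1 + abs(position - NEST))


def step(position, carrying_food):
    """Move one cell, consulting only the two neighbouring marker densities."""
    neighbours = (position - 1, position + 1)
    if carrying_food:
        # With food, seek the denser neighbour, which lies toward the nest.
        return max(neighbours, key=marker_density)
    # Without food, move to the sparser neighbour, away from the nest.
    return min(neighbours, key=marker_density)


def forage(start, food_at, max_steps=100):
    """Walk out until food is reached, then walk back; return the path taken."""
    position, carrying_food, path = start, False, [start]
    for _ in range(max_steps):
        if position == food_at:
            carrying_food = True
        if carrying_food and position == NEST:
            break
        position = step(position, carrying_food)
        path.append(position)
    return path


path = forage(start=1, food_at=4)
```

In this run the ant starts one cell from the nest with food placed further out on the same side; the outward and return legs both emerge from the local rule alone.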
The evolutionarily programmed social cognition of insects, however, does not include the capacity for spontaneous imitation which is so central to human cognition. For a social behavior to become a learned behavior and then continue across generations, a capacity for spontaneous imitation is crucial. However, zoological ethologists have long known that this imitative capacity is not unique to humans. Researchers studying macaques left sweet potatoes on the beach for a colony of wild monkeys who normally inhabit the jungle near the beach. After gradually becoming habituated to the beach and becoming more familiar with the sea, one monkey discovered that dipping the potatoes in a tidepool would cleanse them of the sand that made them unpalatable. This behavior was imitated throughout the colony in a matter of days, but the researchers observed that older macaques were slower to acquire the behavior than the younger ones (Kawamura 1959; McGrew 1998). Maturana and Varela (1998: 203) define cultural behavior precisely as this kind of relatively stable pattern of such transgenerational social behavior.
The culturally acquired behavior most often held up by classical Representationalists as the hallmark of the distinctively human is language. However, even here there is not a clear break from the animal kingdom in terms of basic cognitive capabilities, as we see when considering the results of researchers who have been trying to teach symbolic communication to other primates. Instead, their observations are consonant with our theory of how language and image schemas emerge from bodily processes involving cross-modal perception. In experiments done by Savage-Rumbaugh and colleagues (1988), three chimpanzees who had been trained in symbolic communication were able to make not only cross-modal associations (i.e., visual to tactile), but were able to make symbolic to sensory-modal associations.
For example, Kanzi was able to hear a spoken English word and accurately (100% of the time) choose either the corresponding visual lexigram or a visual picture of the word. Sherman and Austin were able to choose the appropriate object by touch when presented with a visual lexigram (100% correct), and conversely they were also able to choose the appropriate visual lexigram when presented with a tactile-only stimulus (Sherman: 96% correct, Austin: 100%) or olfactory-only stimulus (Sherman: 95% correct, Austin: 70% correct). Their ability to perform such symbol to sensory-modality coordination enhanced their performance on tasks measuring solely cross-modal coordination; as Savage-Rumbaugh et al. observe: "these symbol-sophisticated apes were able to perform a variety of cross-modal tasks and to switch easily from one type of task to another. Other apes have been limited to a single cross-modal task" (1988: 623). Although these chimpanzees will never approach the linguistic capabilities of humans, these results show that the continuity of our human capacity for abstract cross-modal thought is shared by at least some members of the animal kingdom.[4]
In fact, related recent research on primates suggests that it is the distinctively human socio-cultural environment (and not some great zoological discontinuity in comparative cognitive capacity) that facilitates the cross-modal cognitive capabilities underlying language and abstract reason. We have already noted that the neural development of the cross-modal maps of juvenile owls can be modified by epigenetic stimulation, but it is equally important to realize that the cross-modal basis for many of our image schemas requires epigenetic stimulation of the kind presented by human parents. Tomasello, Savage-Rumbaugh and Kruger (1993) compared the abilities of chimpanzees and human children to imitatively learn how to perform novel actions with novel objects. They tested 3 conspecific (mother-reared) chimpanzees and 3 enculturated chimpanzees, along with 18- and 30-month-old human children. They introduced a new object into the participant's environment, and after observing the participant's natural interactions with the object, the experimenter demonstrated a novel action with the object with the instruction "Do what I do". Their results showed that the mother-reared chimpanzees were much poorer imitators than the enculturated chimpanzees and the human children, who did not differ from one another. A human-like sociocultural environment is an essential component not only for the development of our capacity for imitation, but also for the development of our capacities for the cross-modal image schemas that underlie language and abstract reasoning (see also Fouts, Jensvold and Fouts 2002).
4. This conclusion is further supported by results showing that human children with specific language impairments show deficiencies in their ability to perform cross-modal tasks (Montgomery 1993).
Finally, there is also considerable evidence from cognitive anthropology that adult humans do not think in a manner consistent with the dichotomies posed by classical Representationalism. Like the social insects, we tend to offload much of our cognition onto the environments we create. We tend to accomplish this in two ways - first, we make cognitive artifacts to help us engage in complex cognitive actions, and, second, we distribute cognition among members of a social organization. As an example of the first, Hutchins (1995: 99-102) discusses how medieval mariners used the 32-point compass rose to predict tides. By superimposing onto the compass rose the 24-hour day (in 45-minute intervals), the mariners could map the lunar "time" of the high tide (the bearing of the full moon when its pull causes a high tide) to a solar time of day. As long as we know two facts - the number of days since the last full moon and the lunar high tide for a particular port - we simply count off a number of points on the compass rose equal to the days past the full moon to compute the time of next high tide. Without the schema provided by the cognitive artifact, computing the next high tide is a much more laborious cognitive task. As an example of the second, Hutchins (1995: 263-285) discusses how the partially overlapping knowledge distributions of a group of three navy navigation personnel function cognitively within the team considered as a team. Although no single team member is expected to constantly maintain a complete internal representation of all the navigational data, Hutchins shows how the social distribution of the cognitive tasks functions as a brake on serious navigational errors that could imperil the ship, because the participants each know some of the spatial relations and procedures immanent to another team member's job. In short, the offloading of some of the cognitive load onto the environment, as found both in cognitive artifacts and the social distribution of cognitive tasks, is crucial to many of our daily cognitive activities.
A fully adequate treatment of the social dimension of thought would require substantially more evidence and analysis than we can provide here. We have only attempted to suggest that sociocultural cognition in general is not unique to humankind, that the common bases for cross-modal cognition and symbolic/linguistic communication are not unique to humans, and that human cognition cannot be locked up within the private workings of an individual mind.
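The compass-rose procedure for tides described earlier in this section reduces to a small piece of modular arithmetic, which the following Python sketch spells out. Several details are assumptions introduced only for illustration and are not taken from Hutchins: the points are indexed 0 to 31, point 0 is arbitrarily taken to stand for 00:00, and the names `next_high_tide` and `port_point` are hypothetical. The only facts used from the text are that 32 points span the 24-hour day (45 minutes per point) and that one point is counted off per day past the full moon.

```python
POINTS_ON_ROSE = 32
MINUTES_PER_POINT = 24 * 60 // POINTS_ON_ROSE  # 1440 / 32 = 45 minutes


def next_high_tide(port_point, days_since_full_moon):
    """Return the solar time of high tide as an (hour, minute) pair.

    `port_point` is the index (0-31) of the compass point that marks the
    lunar high tide for the port. Point 0 is ASSUMED to stand for 00:00;
    the real alignment of hours and points is not given in the text.
    """
    # Count off one compass point per day past the full moon, wrapping
    # around the rose after 32 points.
    point = (port_point + days_since_full_moon) % POINTS_ON_ROSE
    return divmod(point * MINUTES_PER_POINT, 60)


hour, minute = next_high_tide(port_point=4, days_since_full_moon=3)
```

Counting points on the rose replaces the multiplication and the wrap-around with a visual, manual operation, which is the sense in which the artifact carries part of the cognitive load.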
Since thought is a form of co-ordinated action, it is spread out in the world, co-ordinated with both the physical environment and the social, cultural, moral, political and religious environments, institutions and shared practices. Language - and all forms of symbolic expression - are quintessentially social behaviors. Dewey nicely summarizes the intrinsically social character of all thought in his argument that the very idea of thinking as a kind of inner mental dialogue is only possible because of socially established and preserved meanings, values and practices:
When this introspectionist thinks he has withdrawn into a wholly private realm of events disparate in kind from other events, made out of mental stuff, he is only turning his attention to his own soliloquy. And soliloquy is the product and reflex of converse with others; social communication not an effect of soliloquy. If we had not talked with others and they with us, we should never talk to and with ourselves. Because of converse, social give and take, various organic attitudes become an assemblage of persons engaged in converse, conferring with one another, exchanging distinctive experiences [...]. Through speech a person dramatically identifies himself with potential acts and deeds; he plays many roles, not in successive stages of life but in a contemporaneously enacted drama. Thus mind emerges. (Dewey 1925: 135)
"Thus mind emerges!" It emerges as, and is enacted through, social cognition. There is no radical rupture with our bodily experience of meaning; instead, that meaning is carried forward and given voice through language and other forms of social symbolic interaction and expression.
6. Embodied meaning, thought and language
We have been arguing against disembodied views of mind, concepts and reasoning, especially as they underlie Representationalist theories of mind and language. Our alternative view - that cognition is embodied - has roots in American Pragmatist philosophy and is being supported and extended by recent work in second-generation cognitive science. Pragmatists like James and Dewey understood that philosophy and empirical science must develop in mutual cooperation and criticism, if we are ever to have an empirically responsible understanding of the human mind and all of its marvelous capacities and acts. Pragmatism is characterized by (1) a profound respect for the richness, depth and complexity of human experience and cognition, (2) an evolutionary perspective that appreciates the role of dynamic change in all development (as opposed to fixity and finality), and (3) recognition that human cognition and creativity arise in response to problematic situations that involve values, interests and social interaction. The principle of continuity encompasses the fact that apparently novel aspects of thought and social interaction arise naturally via increased complexity of the organism-environment interactions that constitute experience. Pragmatists thus argue that all of our traditional metaphysical and epistemological dualisms (e.g., mind/body, inner/outer, subject/object, concept/percept, reason/emotion, knowledge/imagination and theory/practice) are merely abstractions from the interactive (enactive) process that is experience. Such distinctions are not absolute ontological dichotomies. Sometimes they serve us well, but
oftentimes they serve us quite poorly, depending on what problems we are investigating, what values we have, and what the socio-cultural context is.
In recent years the number of researchers engaged in some variation of "embodied cognition" has swelled prodigiously. Once upon a time, cognitive science seemed defined by the Representationalist view that the body is inconsequential to the study of the mind. But that has changed dramatically. Some Representationalists have recently argued for a very limited sense of embodiment that would keep intact much of the first generation of cognitive science's representational baggage (Clark 1998). Today we are witnessing a new generation of cognitive science emerging which defines "embodied cognition" as a fundamentally non-representational project. Contributions to a radical theory of embodied cognition are being made by dynamic systems theorists who argue that cognition, though amenable to mathematical description, is not computational (van Gelder 1995), by neurobiologists whose experiments show us how metaphors of information transfer mislead us in understanding the population dynamics behind neural organization (Edelman 1992), and by cognitive roboticists who understand that having a body is perhaps not such a bad thing after all (Brooks 1991; Brooks and Stein 1994). Even Alan Turing, a leader among that lost first generation who so errantly steered cognitive science toward disembodiment, was willing to admit he might be wrong when it came to how we might teach a robot language:
It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English. That process could follow the normal teaching of a child. Things would be pointed out and named, etc. Again, I do not know what the right answer is, but I think both approaches should be tried. (Turing 1950: 460)
We have already tried the disembodied Representationalist approach, and its failures have breathed new life into the Pragmatist approach to embodied cognition. The themes we have been tracing throughout this chapter - our animal engagement and cognition, our ongoing coupling and our falling in and out of harmony with our surroundings, our active value-laden inquiry to reestablish harmony and growth, and our community of social interactions - are beautifully encapsulated by Dewey in his attempt to recover the value of the aesthetic dimensions of meaning in human life:
At every moment, the living creature is exposed to dangers from its surroundings, and at every moment, it must draw upon something in its surroundings to satisfy its needs. The career and destiny of a living thing are bound up with its interchanges with environment, not externally but in the most intimate needs. The growl of a dog crouching over his food, his howl in time of loss and loneliness, the wagging of his tail at the return of his human friend are expressions of the implication of a living in a natural medium which includes man along with the animal he has domesticated. Every need, say hunger for fresh air or food, is a lack that denotes at least a temporary absence of adequate adjustment with surroundings. But it is also a demand, a reaching out into the environment by building at least a temporary equilibrium. Life itself consists of phases in which the organism falls out of step with the march of surrounding things and then recovers unison with it - either through effort or some happy chance [...]. These biological commonplaces are something more than that [mere biological consequences]; they reach to the roots of the esthetic in experience. (Dewey 1934: 13-14)
We humans are live creatures. We are acting when we think, perhaps falling in and out of step with the environment, but never are our thoughts outside of it. Via our bodily senses the environment enters into the very shape of our thought, sculpting our most abstract reasoning from our embodied interactions with the world.
Acknowledgements
The authors would like to acknowledge the many insightful comments by the reviewers and the editors of this volume. Tim Rohrer would like to acknowledge the stimulating intellectual environment of the Sereno and Kutas cognitive neuroscience laboratories at UCSD as he conducted the fMRI research outlined here, as well as the generous research support of an NIH fellowship.
References
Anderson, Michael 2003 Embodied cognition: A field guide. Artificial Intelligence 149: 91-130.
Brooks, Rodney A. 1991 Intelligence without representation. Artificial Intelligence Journal 47: 139-159.
Brooks, Rodney A. and Anita M. Flynn 1989 Fast, cheap and out of control: A robot invasion of the solar system. Journal of the British Interplanetary Society 42: 478-485.
Brooks, Rodney A. and Lynn A. Stein 1994 Building brains for bodies. Autonomous Robots 1 (1): 7-25.
Buccino, Giorgio, F. Binkofski, Gereon R. Fink, Luciano Fadiga, Leonardo Fogassi, Vittorio Gallese, Rüdiger J. Seitz, Karl Zilles, Giacomo Rizzolatti and Hans-Joachim Freund 2001 Action observation activates premotor and parietal areas in a somatotopic manner: An fMRI study. European Journal of Neuroscience (2): 400-404.
Buonomano, Dean V. and Michael M. Merzenich 1998 Cortical plasticity: From synapses to maps. Annual Review of Neuroscience 21: 149-186.
Clark, Andy 1998 Being There: Putting Brain, Body and World Together Again. Cambridge, MA: MIT Press.
Coslett, H. Branch, Eleanor M. Saffran and John Schwoebel 2002 Knowledge of the human body: a distinct semantic domain. Neurology 59: 357-363.
DeBello, William M., Daniel E. Feldman and Eric I. Knudsen 2001 Adaptive axonal remodeling in the midbrain auditory space map. Journal of Neuroscience 21 (9): 3161-3174.
Deneubourg, Jean-Louis, Jacque Marie Pasteels and Jean-Claude Verhaeghe 1983 Probabilistic behavior in ants: A strategy of errors. Journal of Theoretical Biology 105: 259-271.
Dewey, John 1925 Experience and Nature. In: John Dewey, The Later Works, 1925-1953, vol. 1, Jo Ann Boydston (ed.). Carbondale: Southern Illinois University Press, 1981.
Dewey, John 1934 Art as Experience. New York: Penguin-Putnam Inc., reprinted 1980.
Dewey, John 1938 Logic: The Theory of Inquiry. In: John Dewey, The Later Works, 1925-1953, vol. 12, Jo Ann Boydston (ed.). Carbondale: Southern Illinois University Press, 1981.
Edelman, Gerald 1987 Neural Darwinism. New York: Basic Books.
Edelman, Gerald 1992 Bright Air, Brilliant Fire: On the Matter of Mind. New York: Basic Books.
Erickson, Harold P., Dianne W. Taylor, Kenneth A. Taylor and David Bramhill 1996 Bacterial cell division protein FtsZ assembles into protofilament sheets and minirings, structural homologs of tubulin polymers. Proceedings of the National Academy of Sciences USA 93 (1): 519-523.
Feldman, Jerome and Srini Narayanan 2004 Embodied meaning in a neural theory of language. Brain and Language 89 (2): 385-392.
Fodor, Jerry 1987 Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
Fogassi, Leonardo, Vittorio Gallese, Giorgio Buccino, Laila Craighero, Luciano Fadiga and Giacomo Rizzolatti 2001 Cortical mechanism for the visual guidance of hand grasping movements in the monkey: A reversible inactivation study. Brain 124 (3): 571-586.
Fouts, Roger S., Mary Lee A. Jensvold and Deborah H. Fouts 2002 Chimpanzee signing: Darwinian realities and Cartesian delusions. In: Marc Bekoff, Colin Allen and Gordon M. Burghardt (eds.), The Cognitive Animal: Empirical and Theoretical Perspectives in Animal Cognition. Cambridge, MA: MIT Press. 285-291.
Fraser, Scott E. 1985 Cell interaction involved in neural patterning: An experimental and theoretical approach. In: Gerald Edelman, W. E. Gall and W. M. Cowan (eds.), Molecular Bases of Neural Development, 581-507. New York: Wiley.
Gallese, Vittorio and George Lakoff 2005 The brain's concepts: The role of the sensory-motor system in reason and language. Cognitive Neuropsychology 22: 455-479.
Gaze, R. M. and Sansar C. Sharma 1970 Axial differences in the reinnervation of the goldfish optic tectum by regenerating optic nerve fibers. Experimental Brain Research 10: 171-181.
Goldberg, Jeffrey L. 2003 How does an axon grow? Genes and Development 17 (8): 941-958.
Grady, Joseph 1997 Foundations of meaning: primary metaphors and primary scenes. Ph.D. Dissertation, University of California.
Guo, Yajin and Susan B. Udin 2000 The development of abnormal axon trajectories after rotation of one eye in Xenopus. Journal of Neuroscience 20 (11): 4189-4197.
Hauk, Olaf, Ingrid Johnsrude and Friedemann Pulvermüller 2004 Somatotopic representation of action words in human motor and premotor cortex. Neuron 41 (2): 301-307.
Hodges, Andrew 1983 Alan Turing: The Enigma. New York: Walker and Co.
Hutchins, Edwin 1995 Cognition in the Wild. Cambridge, MA: MIT Press.
James, William 1900 Psychology (American Science Series, Briefer Course). New York: Henry Holt and Company.
Johnson, Mark 1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.
Kawamura, Syunzo 1959 The process of subculture propagation among Japanese macaques. Primates 2: 43-60.
Knudsen, Eric I. 1998 Capacity for plasticity in the adult owl auditory system expanded by juvenile experience. Science 279 (5356): 1531-1533.
Knudsen, Eric I. 2002 Instructed learning in the auditory localization pathway of the barn owl. Nature 417 (6886): 322-328.
Lakoff, George 1987 Women, Fire and Dangerous Things. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson 1980 Metaphors We Live By. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson 1999 Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books.
Lakoff, George and Rafael Núñez 2000 Where Mathematics Comes From: How the Embodied Mind Brings Mathematics Into Being. New York: Basic Books.
Langacker, Ronald 1986 Foundations of Cognitive Grammar. 2 vols. Stanford: Stanford University Press.
Lewkowicz, David J. and Gerald Turkewitz 1981 Intersensory interaction in newborns: modification of visual preferences following exposure to sound. Child Development 52 (3): 827-832.
Maturana, Humberto R. and Francisco J. Varela 1998 The Tree of Knowledge: The Biological Roots of Human Understanding, revised edition. (Robert Paolucci, trans. 1987 Arbol del conocimiento.) Boston: Shambhala Press.
McGrew, William C. 1998 Culture in nonhuman primates? Annual Review of Anthropology 27: 301-328.
Meltzoff, Andrew N. and Richard W. Borton 1979 Intermodal matching by human neonates. Nature 282 (5737): 403-404.
Merzenich, Michael M., Randall J. Nelson, Jon H. Kaas, Michael P. Stryker, Max S. Cynader, A. Schoppmann and John M. Zook 1987 Variability in hand surface representations in areas 3b and 1 in adult owl and squirrel monkeys. Journal of Comparative Neurology 258 (2): 281-296.
Montgomery, James W. 1993 Haptic recognition of children with specific language impairment: effects of response modality. Journal of Speech and Hearing Research 36 (1): 98-104.
Narayanan, Srini 1997 KARMA: Knowledge-based active representations for metaphor and aspect. Ph.D. Dissertation, University of California at Berkeley.
Newell, Alan and Herbert Simon 1976 Computer science as empirical inquiry: Symbols and search. Communications of the ACM 19: 113-126.
Regier, Terry 1996 The Human Semantic Potential: Spatial Language and Constrained Connectionism. Cambridge, Mass.: MIT Press.
Rizzolatti, Giacomo, Leonardo Fogassi and Vittorio Gallese 2002 Motor and cognitive functions of the ventral premotor cortex. Current Opinion in Neurobiology 12 (2): 149-154.
Rohrer, Tim 2001a Pragmatism, Ideology and Embodiment: William James and the philosophical foundations of cognitive linguistics. In: René Dirven, Bruce Hawkins and Esra Sandikcioglu (eds.), Language and Ideology: Cognitive Theoretic Approaches: Volume 1, 49-81. Amsterdam: John Benjamins.
Rohrer, Tim 2001b Understanding through the Body: fMRI and ERP studies of metaphoric and literal language. Paper presented at the 7th International Cognitive Linguistics Association Conference, July 2001.
Rohrer, Tim 2005 Image schemata in the brain. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 165-196. Berlin: Mouton de Gruyter.
Rohrer, Tim in press Embodiment and experientialism. In: Dirk Geeraerts and Hubert Cuyckens (eds.), Handbook of Cognitive Linguistics. New York: Oxford University Press.
Savage-Rumbaugh, Sue, Rose A. Sevcik and William D. Hopkins 1988 Symbolic cross-modal transfer in two species of chimpanzees. Child Development 59 (3): 617-625.
Shelton, Jennifer R., Erin Fouch and Alfonso Caramazza 1998 The selective sparing of body part knowledge. A case study. Neurocase 4: 339-351.
Sperry, Roger W. 1943 Effect of a 180-degree rotation of the retinal field on visuomotor coordination. Journal of Experimental Zoology 92: 263-279.
Stern, Daniel 1985 The Interpersonal World of the Infant. New York: Basic Books.
Suzuki, K., A. Yamadori and T. Fujii 1997 Category specific comprehension deficit restricted to body parts. Neurocase 3: 193-200.
Tomasello, Michael, Sue Savage-Rumbaugh and A. C. Kruger 1993 Imitative learning of actions on objects by children, chimpanzees and enculturated chimpanzees. Child Development 64 (6): 1688-1705.
Turing, Alan M. 1937 On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, 42: 230-265.
Turing, Alan M. 1950 Computing machinery and intelligence. Mind 59 (236): 433-460.
van Gelder, Tim 1995 What might cognition be, if not computation? Journal of Philosophy 92 (7): 345-381.
Varela, Francisco, Evan Thompson and Eleanor Rosch 1991 The Embodied Mind: Cognitive Science and Human Experience. Cambridge: MIT Press.
Warrington, Elizabeth K. and Tim Shallice 1984 Category specific semantic impairments. Brain 107: 829-854.
Winter, Steven 2001 A Clearing in the Forest: Law, Life and Mind. Chicago: University of Chicago Press.
Ziemke, Tom 2003 What's that thing called embodiment? Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Lawrence Erlbaum.
Bringing the body back to life: James Gibson's ecology of agency
Alan Costall
"[...] the self, or rather the soul, by which I am what I am, is entirely distinct from the body, is indeed easier to know than the body, and would not cease to be what it is, even if there were no body." (Descartes 1637 [1960]: 61).
Abstract
In its supposed revolutionary move beyond "mechanistic behaviourism", modern psychology replaced the image of the passive subject with that of a highly active information processor modelled upon the general purpose computer. But such activity has been envisaged as essentially subcutaneous. For, in addition to the explicit and modern metaphor of the mind as computer, modern psychology simply retained the traditional mechanistic metaphor of the body as a stimulus-response machine. Thus, even though the ghost inside the machine has been replaced by a highly active "processor of information", the passive, mechanical body that harboured the ghost continues to structure psychological theory. Gibson's early work was an explicit attempt to develop a pure stimulus-response psychology, but he eventually came to realize that this schema should not be merely repaired by invoking mediating cognitive processes but had, instead, to be rejected in its entirety. He replaced the stimulus-response formula still so fundamental to mainstream psychology with an "ecology of embodied agency": an exploration of the material conditions - affordances and information - that support our "being in the world".
Keywords: embodiment, James Gibson, affordances, proprioception, information, nature-culture dualism, mutualism.
1. Introduction
Psychophysical dualism, as formulated by René Descartes (1596-1650), excluded the mind from the realm of natural science, and at the same time promoted the ambitious claim of the new mechanistic science to include everything else. Indeed, it was precisely by excluding mind from the physical order of things that the universal claim of the new science could be protected. For, as Margaret D. Wilson (1980: 42) has put it, an "immaterial principle is by definition not a part of 'nature'", and so does not count as a proper object for science. Thus the apparent intractability of mind to mechanistic treatment could hardly be deemed a failure on the part of science, but a reflection, instead, upon the "non-scientific" nature of mind itself.
Descartes did, however, leave our bodies to science. And, since within Cartesian theory the mind was supposed to be the sole source of activity - the sole mover - the body had to be regarded as not only separate and alien from ourselves, but also entirely inert. Falling squarely on the physical side of the Cartesian psychophysical divide, the body - along with the rest of extended matter - was relegated to the realm of the essentially passive. The body thus ends up in Cartesian theory much as it had first made its appearance within modern science, inert and alien - as a corpse.
Modern psychologists would insist that psychological theory has come a very long way since Descartes. His dualisms of mind and body, and of mind and world, are, they would patiently explain, very much part of the prescientific history of our discipline, and nothing to do with present-day psychology.
In this chapter, I will argue that a dualism between body and mind persists in modern cognitive psychology in a blindingly obvious way, one that amounts to a dualism more extreme and more systematic than anything to be found in the writings of Descartes. In a crucial sense, the body has largely disappeared from mainstream psychological theory, figuring as nothing more than a passive channel of "stimuli" and "responses". I will then turn to the work of one of psychology's most centrally placed and interesting misfits, James Gibson (1904-1979), who over a long career not only subjected mainstream psychology to sustained criticism, but sought to construct a non-dualistic alternative, which I have come to describe as an ecology of agency. Gibson is important to the current debates about embodiment because, by challenging the "stimulus-response" thinking that still constrains much of modern psychological theory, he has helped to
bring psychology's moribund body back to life - a revived body that is active, social and lived.
2. Was Descartes a "Cartesian"?
So far, I have presented Descartes as a good, straightforward Cartesian. But, in fact, he was much more complicated and sensible than he appears in most of the standard textbook accounts. For example, it turns out that he was never very much bothered by the "problem of other minds" that would seem to ensue from a stark division of body and mind (Avramides 2001). Maybe he was not even a dualist (Baker and Morris 1996). Sometimes he even entertained the possibility that the mind might ultimately be understood in mechanistic terms (Cottingham 1992). Perhaps he was even a "proto-phenomenologist" (Tibbets 1973).
My purpose in this section is not to provide a review of the revisionist accounts of Cartesian philosophy that have appeared over the last fifty years or so. But I do want to draw attention to some important and surprising differences between "Cartesian dualism" as presented in the official potted histories, and the more subtle, possibly inconsistent, things Descartes himself had to say about the relation of mind and body.
Descartes has most often been criticized for failing to provide an explanation of the relation of body and mind, where, by explanation, is meant a mechanical account of how the two relate. But, of course, that criticism completely misses his fundamental point, that the mind is not part of the physical order of things (Richardson 1982). Indeed, he makes very clear that he does not see that there is any general problem that needs explanation. The relation of body to mind is - and perhaps has to be - taken for granted. Mind and body have to be intimately related if we are to respond to injury and also to hunger and thirst with urgency and immediacy. As he strikingly put it in his Meditations:
For nature also teaches me by these feelings of pain, anger, thirst, and so on, that I am not just lodged in my body like a pilot in his ship, but that I am intimately united with it, and so confused and intermingled with it that I and my body compose, as it were, a single whole. If this were not so, I should feel no pain when my body was injured; I should simply note the injury with my understanding, as a pilot sees with his eyes any damage done to his ship. (Descartes 1641 [1960]: 161)
Evidently, Descartes was, at the very least, in two minds about dualism. Indeed, he eventually discloses that he did not regard the union of body and mind as a real problem at all, but one created by reflective thought. It cannot be understood in an external way, but can only be lived:
[...] what belongs to the union of soul and body can be understood only in an obscure way either by pure intellect or even when the intellect is aided by imagination, but is understood very clearly by means of the senses. [...] it is just by means of ordinary life and conversation, by abstaining from meditating and from studying things that exercise the imagination, that one learns to conceive the union of soul and body. (Descartes 1643 [1971]: 279-280)
Clearly, there is a need to be wary about our supposed Cartesian past, but not simply to put it aside. It is precisely the distorted, mythical, "official" history of psychology, even about distant figures such as Descartes, that really counts when trying to understand historically how the current discipline of psychology has been structured. After all, agenda setting, including self-congratulation and the induction of new students, is what "official" disciplinary history is primarily about. The dodgy history is powerful historical stuff. So, here is my attempt to distil the already potted essence of the Cartesian dualism of mind and body as it exists within the collective memory of psychologists:
1. There is a stark ontological dualism between mind and body: they are entirely different kinds of "stuff".
2. The body is a passive stimulus-response mechanism.¹
3. The mind is the only active principle, and mediates between stimulus and response (in the case of voluntary action).

3. Psychology's missing body
Modern psychologists would certainly not identify themselves as Cartesians in the above sense. They would argue that they have emphatically rejected ontological dualism by developing naturalistic accounts of mind, primarily based upon analogies with reassuringly substantial devices, such as computers. Furthermore, given their rejection of stimulus-response behaviourism, they would insist they have also rejected the old mechanistic psychologies based on the concept of the reflex. And if this were all true, you would have to conclude that modern psychologists have undermined the old mechanistic psychology by a thoroughly new mechanistic psychology, based on analogies with new kinds of machines never envisaged by the early proponents of a mechanistic psychology. As Stephen Toulmin (1993: 146) has put it:
If Descartes, Newton or Leibniz had been shown a late 20th century computer, [...] they could only have reacted by declaring: "That's not [what I would have called] a 'machine' at all."
1. Descartes' place in the history of the reflex and of "stimulus-response thinking" within psychology is, again, a good deal more complex than the standard accounts imply (cf. Danziger 1983).
Yet, it is not the case that mainstream cognitive psychology entirely replaced traditional mechanism. It retains the old mechanistic image of the body. The new mechanism of mind has been merely assimilated to the old dualism of mind and body, along with the existing conception of the body as a passive machine. This dualism is now, however, reformulated in terms of two radically different kinds of machines - a machine within a machine, a new mechanical mind implanted within the old mechanical body. However, we have all been so fixated upon how to theorize the mind in terms of the new mechanism that this retention of the old mechanistic schema of the body has been systematically overlooked. Ever since the rise of cognitive psychology, with its self-conscious rhetoric of Kuhnian revolution, its proponents have claimed to be rejecting stimulus-response psychology, whilst in the same breath reinstating its basic schema:
The past few years have witnessed a noticeable increase in interest in an investigation of the cognitive processes. [...] It has resulted from a recognition of the complex processes that mediate between the classical "stimuli" and "responses" out of which stimulus-response learning theories hoped to fashion a psychology that would by-pass anything smacking of the "mental". (Bruner, Goodnow and Austin 1956: vii [emphasis added])
It seems obvious to us that a great deal more goes on between the stimulus and response than can be accounted by a simple statement about associative strengths. (Miller, Galanter and Pribram 1960: 9 [emphasis added])
[...] the dramatic shift away from behaviorism, which dominated the field for over thirty years, to cognitivism, ... [has] allowed one to study not just learning but memory, not just speech but language, and not just stimulus and response but the processes that mediate them. (Hirst 1988: vii [emphasis added])
[...] in contrast to early behaviorists, but like their intellectual descendants such as cognitive psychologists and ethologists [...], I believe that scientists can and should construct models for elucidating the knowledge and cognitive processes that connect stimulus and response. (Schiffer 1999: 8 [emphasis added])
In an early assessment of the cognitive revolution, Donald Hebb, in his presidential address to the American Psychological Association, claimed that the stimulus-response formula was indeed essential to the new cognitive approach because it served to define what psychologists could possibly mean by "cognitive":
[...] the whole meaning of the term "cognitive" depends on [the stimulus-response idea], though cognitive psychologists seem unaware of the fact. The term is not a good one, but it does have meaning as a reference to features of behavior that do not fit the S-R formula; and no other meaning at all as far as one can discover. The formula, then, has two values: first, it provides a reasonable explanation of much reflexive human behavior, not to mention the behavior of lower animals; and secondly, it provides a fundamental analytical tool, by which to distinguish between lower (noncognitive) and higher (cognitive) forms of behavior. (Hebb 1960: 737)
Here is a very recent formulation of the modern "alternative" to stimulus-response psychology by Rom Harré (2002: 104), an influential critic of mainstream psychology. He is using the following example of word recognition to make a much more general point about how we should theorize in psychology:
Instead of the behaviorist pattern:
Stimulus (retinal sensation) → Response (perception of word)
we must have
Observable stimulus (retinal sensation) together with unobservable Cognitive process ('knowledge utilization') → Observable response (recognition of word)
Cognitive psychologists keep presenting their own position as an alternative to stimulus-response behaviouristic psychology, when what they are actually offering is no more than an elaboration of that traditional framework. As Edward Reed has put it, cognitivism has largely ended up as the "flip-side" of stimulus-response behaviourism, where "mental processes" are defined as whatever "is left over after one tries to stuff all psychological phenomena into the S-R box" (Reed 1997: 267). And because of this
legacy, psychologists keep failing to recognize real alternatives when they come along, because they cannot understand alternatives to their own position as anything other than reversions to the traditional "pure" version of the stimulus-response framework. Thus, in their otherwise insightful overview of theory and research on visual perception, Bruce and Green (1990: 389) claim that the newer developments (including Gibson's ecological psychology) are giving rise to an important new debate "about the necessity for a cognitive or psychological level of theory to stand between the 'stimulus' and the 'response'". But why have cognitive psychologists been so slow to notice their retention of the stimulus-response schema? First of all, the very rhetoric about "cognitive revolution" and liberation from behaviourism has given rise to a remarkably long-lasting bout of intellectual complacency. In fact, modern cognitive psychology has retained a number of the central features of the behaviourism it claims to have replaced: the attempt to repair stimulus-response psychology by appeal to mediating processes (a central feature of neo-behaviourism), a commitment to the "hypothetico-deductive method", and also "methodological behaviourism", the assumption that the data of psychology are confined to meaningless "behaviour" (cf. Still and Costall 1991; Leahey 1992; Neisser 1997: 248). Secondly, the stimulus-response schema also conforms perfectly to the procedures by which psychologists still conduct most of their research, and it is enshrined within those procedures. Thus the participants in experiments are "presented" with a stimulus or an "independent variable" and are required to respond to that stimulus. They are not expected to transform the "conditions" imposed upon them, nor take themselves off to quite a different one. For the duration of the experiment, participants agree, in effect, to "lend" their agency to the investigator. 
Admittedly, such a state of passivity is not purely an artifact of the laboratory - it is commonplace in our lives at work and school. But, fortunately, it is not our sole mode of being in the world. As George Kelly (who, like Gibson, was another member of psychology's "awkward squad") once warned: Behaviour is man's way of changing his circumstances, not proof that he has submitted to them. What on earth, then, can present-day psychology be thinking about when it says it intends only to predict and control behaviour scientifically? Does it intend to halt the human enterprise in its tracks? (cited by Westland 1978: 69).
It is, however, the standard metaphor of the computer that has given the stimulus-response formula (now reformulated in the language of input and
output) its new lease of life, and, hence, made the body largely disappear from psychological theory (see also Costall 1991). First of all, the person/computer is purely an information processor, passively receiving an "input" (in other words, a stimulus), operating upon it, and producing output. Such a person/computer does not get about, and certainly does not have much of a life, and so has no need of a body, other than to support "cognitive processing" and to provide an "interface" to the outer world - a passive "channel" for both input and output.² There is also a surprisingly neglected theoretical aspect of the standard computer metaphor that has led to the disappearance of the body. This metaphor gives rise to a two-part division between the software (the program) and hardware (the actual computing machine). According to this reassuringly substantial metaphor of the computer, with its tidy division of program and machine, the business of the psychologist is neatly and conveniently circumscribed: to infer the software of the mind without reference to the "machinery" of the body. The software, after all, exists in quite a different order from the computer, and cannot be reduced to it, and so psychologists need no longer be anxious that they are engaged in a spurious and unscientific undertaking, nor fear that they might eventually be taken over by the biochemists or the neurologists. As one of its many enthusiasts proudly proclaimed, cognitivism provides us with "a science of structure and function divorced from material substance" (Pylyshyn 1986: 68 [emphasis added]). On the face of it, the current obsession within cognitive psychology about the neural localization of psychological functions would seem to represent a radical move against disembodied theorizing, yet this new kind of phrenology has done nothing to bring the body seriously back into psychological theory, since the "body" in question is nothing more than bits of the brain.
2. This standard computer metaphor has been succeeded by other computer models, but not, within cognitive psychology, superseded by them. Connectionism, which has been widely taken up within cognitive psychology, is generally presented as complementary to the classical, symbol-based, approach to information processing, and, in any case, also treats the body as passive in relation to the world. In contrast, the more recent research upon robotics and "autonomous agents" has made remarkably little impact within mainstream cognitive psychology, and is seldom mentioned in the textbooks. However, we cannot safely assume that even "the new AI" has escaped from the schema of the-machine-within-the-machine (cf. Ziemke 2001).
But let us now look more closely at the other side of the standard computer metaphor, the hardware or computer. What precisely is the hardware supposed to represent? Presumably not the mind. But is it the entire body, or the brain, or, indeed, just those structures that specifically support cognitive processing? Psychologists have been remarkably unconcerned about this side of the analogy, and about the kind of theoretical work this side of the computer analogy is supposed to be doing. For, according to the ideal of the computer as a general purpose machine, the computer itself (assuming it is switched on and operating properly) imposes absolutely no constraints on how it actually functions. It is the program that entirely specifies how the machine will operate and this is why it offers the obvious and sufficient explanation of what is going on when the program is "running" a computer.³ According to the standard computer metaphor, therefore, the body as computer ends up doing no explanatory work at all, any more than does the paper on which physicists write down their equations. And so, to the extent that theorists wish (within the terms of the traditional computer metaphor) to theorize the body as anything other than as a transparent, completely unconstraining entity, they would have to represent it as part of the software. To repeat, it is the program, according to the ideal of the computer as a general purpose machine, that is the only source of constraint! Finally, the theoretical focus of the cognitivist research project itself turns our attention resolutely away from the body and bodily activity. Even if what people are actually doing necessarily provides the data for psychology, their behaviour is not supposed to constitute the real object of inquiry:
To take behavior as the focus of attention for psychology is as big an error as to take tracks in cloud chambers as the main object of study in particle physics. Such tracks are interesting only as clues to the existence of certain particles and to their properties. (Macnamara 1999: 241)
3. The virtue of programs as a language in which to formulate psychological theories lies precisely in the fact that they have to do all the theoretical work, just as, in astronomy, differential equations do all the explanatory work in accounts of planetary motion - though it does not follow that, to the extent that programs provide an effective language for theorizing in psychology, the objects of our theorizing are themselves necessarily "running programs", any more than the planets are staying on course by solving differential equations.
What people can be observed to be doing can thus only be regarded as merely an indirect "clue" or index of an underlying mental apparatus, and is not, in itself, of any focal psychological interest. And, when you think about it, what people are required to do in most psychological experiments could be of no other interest, given that what they are usually doing is nothing more than just pushing buttons or else keys on computer keyboards. Such "responses" do not have any significance at all, except as indices with a meaning tied to the particular experiment in question. And, thus, once again, the structure of the psychology experiment not only represents but in turn reinforces the dualism of modern cognitivism: that behaviour, as bodily movement, has no intrinsic or manifest meaning, its meaning deriving, instead, from underlying and unobservable mental structures which are deemed to be "the main object of study" in the particular experiment in question. The disappearance of the body is also reflected in the illustrations that accompany the textbooks. For example, when Descartes describes how the body responds to injury, he provides us with a figure of a rather chubby naked boy with his toe dangerously close to a fire (Descartes 1664 [1966]: 271). When Warren (1922: 4), in his textbook, Elements of Human Psychology, discusses the stimulus-response arc, he presents the well-defined figure of a man dressed in "plus-fours", the fashionable leisurewear of the time. When Thurstone (1923: 355) discussed the "stimulus-response fallacy" he included an illustration of "our minds", but this mind still takes a definite bodily form, even if merely that of some amoeba-like creature. So, how does the body (dis)appear in the modern textbooks? As an outline box, of course: a schematic interface to the outside world, and a container of many other, much more interesting, boxes representing various cognitive modules and the connections between them. In modern psychology, the body, construed as a receptive stimulus-response machine, has atrophied through many years of intellectual neglect to a shapeless and abstracted container.
Psychologists have been so busy mechanizing the mind, they forgot about the other side of their mechanistic theorizing, the stimulus-response body. James Gibson was an important exception.
4.
James Gibson and the ecology of agency
As a student at Princeton, Gibson was greatly influenced by Edwin B. Holt, who had, in turn, been taught and inspired by William James. Gibson was thus familiar with the Darwinian adaptationist orientation of American psychology. But, remarkably, during his early career at Smith College, he
was also in close contact with the Gestalt psychologists, Kurt Koffka, Fritz Heider and Kurt Lewin, who had emigrated from Europe. As a consequence, Gibson's approach brought together the functionalist emphasis upon the coordination of animal and environment with the Gestaltist reaction against atomistic analysis in favour of a relational, holistic approach. Gibson also drew upon many other more "exotic" intellectual sources, including Merleau-Ponty and Marx, though he could have been a good deal more forthcoming about such influences (Heft 2001; Reed 1988). His career was a continual project towards undermining the many dualisms within the human sciences, and his work a remarkably productive fusion of American functionalism, behaviourism, Gestaltism, phenomenology and his own remarkable obstinacy. As his wife, Eleanor Gibson, approvingly put it, "he adored arguments" (cited in Szokolosky 2004: 277). He was not prepared to take "taken-for-granted assumptions" for granted, not even those once central to his own work. It may seem odd to introduce Gibson in this chapter, and at this point, since according to his most influential critics he was surely a throw-back to pure stimulus-response behaviourism (e.g. Gregory 1997). If they were referring to the Gibson of the 1940s and 1950s, their claim would have some justification. But, as I will explain, as with Wittgenstein, there were at least two Gibsons: an "early" and a "later" one. Before the "cognitive revolution" (variously dated from the late 1950s to the early 1970s), Gibson (e.g. 1950, 1958, 1959) had already taken exception to the self-contradictory nature of psychological theorizing, with, on the one hand, its fundamental commitment to a mechanistic stimulus-response psychology, and then, on the other hand, its attempt to dream up possible processes intervening between the stimulus and response that might explain the lack of any lawful causal relation between the so-called stimuli and responses.
He thought this whole business of invoking efficient causality between stimulus and response and then invoking a deus ex machina to "explain" why such efficient causality seldom held true was scientifically disreputable. As he put it, he had "no patience with the attempts to patch up the S-R formula with hypotheses of mediation. In behavior theory as well as psychophysics you either find causal relations or you do not" (Gibson 1967: 132). The "early Gibson" attempted to reinstate stimulus-response theory, or a "perceptual psychophysics", both by redefining the stimulus in a more holistic, Gestalt way, and also by explaining how the higher-order structure of the stimulus was, in turn, determined by the very structure of the environment itself. For example, in the case of a terrestrial environment, there is a ground surface extending around the animal which gives rise to a gradient of texture at the retina, and, in an uncluttered environment, also a "visible horizon". Even in his early writings, however, you can already find Gibson beginning to notice something fundamentally wrong with the stimulus-response formula. Having been concerned in the Second World War with aviation, he had come to place great emphasis upon what he later called "optic flow", the streaming of optical texture primarily as a result of our own activity in the world: The normal human being [...] is active. His head never remains in a fixed position for any length of time except in artificial situations. If he is not walking or driving a car, or looking from a train or airplane, his ordinary adjustments of posture will produce some change in the position of his eyes in space. Such changes will modify the retinal images in a quite specific way. (Gibson 1950: 117 [emphasis added])
Once we allow for the active nature of human beings and other animals, the "stimulus" can no longer figure as an efficient cause, nor be considered as the "starting point" of perceiving. After all, if we are to retain the old static language of stimulus and response, it is typically the "response" that precedes the "stimulus", and which gives rise to it. In other words, "one could say that the behavior is the first cause of all the stimulations" (Merleau-Ponty 1942 [1965]: 13). By the late 1950s, Gibson had rejected the stimulus-response schema entirely. People, he insisted, are not passive recipients of stimuli, except under conditions such as those of the psychological laboratory, where immobility is imposed upon us, either through the instructions or else more intrusively through the use of clamps: The headrest of the laboratory prevents the observer from turning his head and looking around [...]. It also, of course, prevents him from getting up and walking around. (Gibson 1979: 1)
However, when we are doing things, and even when just "observing" our surroundings, we are active not just in our heads (as much of modern theory still insists), but bodily. We are acting upon and exploring our surroundings. Thus, according to Gibson, the visual system, for example, does not just involve the eyes and a brain (cf. Gregory 1997), but must be defined functionally rather than anatomically. The eyes, which themselves are under muscular control, are part of a moving head, which, in turn, is set on top of a body that gets around in the world. Thus, as Gibson liked to put it, the visual system also has legs. Indeed, when we bring an object to our eyes to inspect it more closely, our hands, from this functionalist perspective, should also be regarded as part of the visual system (Cowie 1993). Perversely, the textbooks routinely force Gibson into the category of a "bottom-up" theorist. In such theories, the "processing" of the stimulation is supposed to be completely "data-driven", whereas in "top-down" theories, the input is assumed to be subject to active interpretation or hypothesis construction (Cavanagh 1999). This modern-sounding distinction can, in fact, be found in Kepler's proposal, published in 1604, that the retinal image is the starting point of vision: In what manner this image or picture is brought together by the visual spirits which reside in the retina or in the nerves, and whether it is made to appear before the soul or tribunal of the faculty of vision by a spirit within the cerebral chambers, or whether the faculty of vision, as a magistrate sent by the soul, goes out from the council chamber of the brain to meet this image in the optic nerves and retina descending to a lower court, these things I leave to the natural philosophers [...] for disputing. (cited by Straker 1976: 20 [emphasis added])
The underlying continuity between Kepler's account and most modern theories of vision concerns precisely the assumption that the body is a passive recipient of external stimulation. Now, if we really did spend all our lives just waiting for things to happen to us - as the participants in psychology experiments are typically required to do - then whatever "activity" is involved in perceiving would necessarily be confined to "internal processing" (cf. Ben Zeev 1984). Activity would necessarily be "subcutaneous". But this was the very assumption Gibson was rejecting. And this is why the very distinction between "bottom-up" and "top-down" is irrelevant to Gibson's later theory, given that he rejected the very concept of the stimulus. If we really must talk of Gibson in terms of "ups" or "downs", then Gibson is best regarded (according to a very nice slip of a student's pen) as a "bottom-down theorist". After all, Gibson regarded perceiving as a way of making "contact" with our surroundings, a "reaching out" into the world.4
4. Arnheim (1956 [1969]: 33) held a rather similar view: "[...] vision is anything but a mechanical recording device. [...] Rather, in looking at an object, we reach out for it. With an invisible finger we move through the space around us, go out to the distant places where things are found, touch them, catch them, scan their surfaces, trace their boundaries, explore their texture. It is an eminently active occupation."
This tactile conception of perceiving was captured in, and probably encouraged by, an important study conducted upon active touch (Gibson 1962; cf. Ikegami and Zlatev this volume). Participants were required to recognize various objects either when these objects were simply placed into their motionless hands, or else when they were given permission to explore them actively in their hands. In passive touch, there is the dull sense of "something" sitting on the surface of the hand: in active touch, there is the vivid sense of a coherent object passing between the palm and fingers. Gibson's contrast between active versus passive touch nicely encapsulates one of the radical shifts in his theoretical position. Having started from an explicit commitment to stimulus-response theory, Gibson was, in fact, one of the few experimental psychologists to reject entirely the mechanistic framework of traditional perceptual theory. After all, the very ideal of experimental investigation in psychology has surely become that of imposing conditions upon our "subjects" and then determining how they react. Seldom are they allowed to explore, let alone change, the situations in which they are placed. Yet, as Gibson came to realize, perceiving is an embodied activity, one involving skill and intelligence.
5.
Resources for agency
The two key concepts developed by Gibson in relation to his ecological approach are information and affordance. Consistent with Gibson's rejection of the stimulus-response framework, these should not be regarded as "efficient causes" but as resources for an embodied and active subject.
5.1.
Affordances
Let us start with Gibson's concept of affordances, a concept that was mainly set out, and then rather sketchily, in his final writings (e.g. Gibson 1979). Its purpose was to undermine the dualisms of the mental and physical, of meaning and materiality, of the world and us, that continue to structure that last outpost of scientism and individualism, modern cognitivist theory. Yet it is not as though its implications are restricted to that target, for, even when the human sciences, as in social constructivism and postmodernism, try to go their own way, they often manage to retain these traditional dualisms through a failure to engage seriously in a radical examination of all this "modernist" metaphysics. Thus, we find the anthropologist, Roy Ellen (1996: 31), having argued for a discursive view of nature to replace a scientistic one - and in a book devoted precisely to "redefining nature" - coming to the remarkably traditional conclusion that culture "emerges from nature as the symbolic representation of the latter." And we have the social constructivist, Stuart Hall, insisting that meaning is confined to a representational realm of symbols, in opposition to the material world: [...] it is not the material world which conveys meaning: it is the language system or whatever system we are using to represent our concepts. (Hall 1997: 25)
Gibson's concept of affordances attempted to undermine the dualisms of subject and object, matter and meaning by treating animal and environment as mutually defining: The affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill. [...] I mean by it something that refers to both the environment and the animal in a way that no existing term does. It implies the complementarity of the animal and the environment. (Gibson 1979: 127)
Affordances are, in essence, relational.5 They concern the meanings of "things" in terms of what could be done with them, and hence implicate an agent. But these are not meanings that are "projected" onto things; they very much have to do with the nature of the objects involved. Take the example of stairs. Stairs are not stimuli. They do not force us to climb them. And there are a limitless number of things we can do with them: sit on them, break them up for firewood, and make a grand entrance down them. Woodworm can even live in and eat them. But the fact that we can do so many different things with things does not imply that we can do anything with anything.
5. Gibson was blatantly inconsistent on this point, sometimes insisting that affordances (and information) were independent of animals and thus undermining his attempt to go beyond subject-object dualism (cf. Costall 1995). I will return to this point.
Thus to use stairs in their canonical way - for going up and down them - requires (among many other things) that their risers and treads are of the appropriate dimensions. Beyond a certain critical point the stair no longer affords climbing. To ascend a staircase, we need to be able to reach the next step with our foot, and, furthermore, then be able to lift our body so that its weight is centered on that step. Going down stairs is more precarious, since we also need to check that we do not overdo things and end up in a painful fall. Normally, young children do have serious problems going up and down stairs. In fact, one can find age "norms" in the textbooks on motor development and developmental psychology, where failure or success in achieving that norm is taken to reflect the intrinsic developmental condition of the child (Gesell, Ilg and Ames 1977). There are striking differences in the age norms for going down as opposed to going up stairs, and also for alternating the feet between steps as opposed to moving one foot forward and then gingerly following through by putting the other foot onto that same step. With the normal staircases that children normally encounter, it is not until they are around the late age of four and a half years that they begin to risk alternating their feet when descending. However, although the "climbability" of a staircase is a function of its dimensions, it also depends upon the size of the person in question. People do not come in standard sizes, and, in particular, young children are generally much smaller than adults. Yet these age norms have been based on "normal" staircases - stairs, in other words, designed for adults, not children. It is very curious that in our schools there are child-sized chairs and tables, but not child-sized stairs. In an inspired study, Josep Roca and his colleagues simply checked to see how children would cope with a scaled-down staircase where the steps were just 10 cm high and 20 cm deep (Roca et al. 1986). The children coped remarkably well.
The mean age at which they could climb either up or down the staircase was about twelve months, and alternating the feet between steps was achieved only slightly later at around eighteen months. Even this study did not make allowances for the fact that some of the participants were smaller than others, and not least because of their different ages. Yet it is the precise relation between the dimensions of the stair and of the individual user that is crucial. For example, according to Warren (1995), when the ratio of height of a step to leg length is greater than .88, it is simply no longer possible to step up onto it, and one must then resort to climbing with one's hands and knees. There is also a definite optimum ratio where energy expenditure in climbing is least, and this, again, is a conjoint function of both the stair and the user. Under the specific conditions used in Warren's research, where the diagonal distance between successive stairs was held constant at 14 inches, the optimum ratio of riser height to leg length was .26. Let us take just one more example. The ways in which young children (and indeed other primates) are able to grasp objects have been of great interest to developmental psychologists, paediatricians and primatologists, especially the precision grip where the object is held between thumb and forefinger. Presumably for the sake of the serious pursuit of "scientific control", many of the main studies on grasping have used a target object with a single, standard size. Primates, however, including human children and indeed even adults, do not come in a standard size, and the graspability of the object surely depends on the relation between the object and the agent. Karl Newell and his colleagues presented both young children and adults with a range of cubes of different sizes and recorded the different kinds of ways they picked them up, for example between the flat of the palms, within the palm of one hand with all of the fingers grasping the object, and the prestigious precision grip (Newell et al. 1989). When these investigators related the frequency of these various grip patterns, not to the absolute size of the cubes, but body-scaled to the individual participants, the transitions between different kinds of grip corresponded to definite ratios common to both the children and adults. Indeed, it was the bodily-scaled dimensions of the cubes, rather than age, that accounted for most of the variability between the different grip patterns.
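The body-scaled ratios reported above for stair climbing can be put into a minimal sketch. The two constants are the figures the text attributes to Warren; the function names and the sample riser and leg lengths are hypothetical, chosen only for illustration, and the optimum of .26 was reported under specific experimental conditions, so it should not be read as a general design rule.

```python
# Illustrative sketch only. The two ratios are those quoted in the text
# from Warren's research; all names and sample values are hypothetical.
CRITICAL_RATIO = 0.88  # riser height / leg length above which stepping up fails
OPTIMAL_RATIO = 0.26   # ratio of least energy expenditure (specific conditions)


def riser_ratio(riser_height: float, leg_length: float) -> float:
    """Return the dimensionless, body-scaled ratio of riser height to leg length."""
    if leg_length <= 0:
        raise ValueError("leg_length must be positive")
    return riser_height / leg_length


def affords_stepping(riser_height: float, leg_length: float) -> bool:
    """True if the step can be mounted by stepping, per the critical ratio."""
    return riser_ratio(riser_height, leg_length) <= CRITICAL_RATIO


def optimal_riser_height(leg_length: float) -> float:
    """Riser height at the reported optimum ratio for a given leg length."""
    return OPTIMAL_RATIO * leg_length


if __name__ == "__main__":
    # Same hypothetical 30 cm riser, two hypothetical leg lengths (cm).
    for label, leg in (("adult", 80.0), ("toddler", 30.0)):
        ratio = riser_ratio(30.0, leg)
        print(label, round(ratio, 2), affords_stepping(30.0, leg))
```

The same step yields a different ratio for each user, which is the point of the passage: the affordance is a property of the stair-user relation, not of the stair alone.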
Although the children in this study were older than the age at which children have been recorded as first being able to use the precision grip, the results suggest we need to be wary about age norms that make no reference to the child's relation to the test situation. There are critical points, therefore, concerning "body-scale" that define the limits within which we can act upon something in a particular way and beyond which we simply cannot. But scale is not the only issue, nor does it function in isolation. For example, whether - and how - we might grasp an object depends upon a host of its other characteristics: its fragility, its slipperiness, its mass and also the distribution of its mass, its value, and so on. And what "holds" scale together with all of these other characteristics and gives them meaning is the animal or person in question, though not so much as a "perceiver" (cf. Gibson 1979: 137; Heft 2001: 132; Ingold 2000: 168) but, much more fundamentally, as an agent:
[Affordances] have unity relative to the posture and behavior of the animal being considered. So an affordance cannot be measured as we measure in physics. (Gibson 1979: 127-128 [emphasis added])
This, then, is how the concept of affordances helps to undermine, rather than merely "bridge", the old psychophysical dualism. Affordances constitute the material resources for action but they do not fall on the far side of the material-mental divide. They are, as Gibson put it, both physical and mental, because they already implicate the needs and purposes of an agent, who, in turn, is envisaged as existing within - rather than beyond - the natural order of things (Heft 1989). There can be no question that Gibson intended affordances to include the culturally specific, as in his much discussed example of a postbox, which "affords letter-mailing to a letter-writing human in a community with a postal system" (Gibson 1979: 139). Yet for many in the human sciences, the culturally specific - the "conventional" - is somehow supposed to crowd out materiality. Yet, even though a postal system constitutes a highly specific human practice, materiality still matters. Postboxes, for example, as part of this system, have to perform the function of accepting and temporarily storing letters, but they also need to be distinguishable visually from litter bins and other kinds of things. And, of course, we do not discover what postboxes mean just by peering at them! It is through being a member of a community in which postboxes are actually used that we come to understand what they are supposed to afford, including being instructed by other people about how to use them, seeing other people using them, and, if all else fails, consulting a manual on how to use them. Indeed, if, as in the case of autism, we are somehow excluded from connecting with other people, the normal use of objects can be seriously disrupted (Williams, Costall and Reddy 1999).
5.2.
Information
Gibson's concept of information points to a second essential resource for effective agency: experiencing the world in relation to ourselves, including our bodily capacities. According to Gibson, the information available to perceivers is limitless and sufficient to specify the important features of our environments and our relations to them. Traditional perceptual theory, in contrast, not only stresses the poverty and unreliability of the available information but also regards the perception of the world and of ourselves as quite separate. Gibson often quoted, approvingly, Bishop Berkeley's claim that vision enables animals "to foresee the damage or benefit which is like to ensue upon the application of their own bodies to this or that body which is at a distance" (Berkeley 1707 [1975]: 24; cf., for example, Gibson 1950: xiv, 1966: 156, 1979: 232). There is now a substantial body of research with adults and children showing that we can be very effective in "foreseeing" affordances, for example, whether a staircase is climbable and even the optimum riser height for one's own height, whether one can walk through a gap either with or without turning one's shoulders, or ducking one's head, and so on (e.g. Mark 1987; Warren 1984, 1995; Warren and Whang 1987). And we typically do this not by first visually "measuring" the object and only then comparing it to our own body. Rather, we experience the object in relation to ourselves. This is obviously the case, for example, when we are about to pick up an object. We do not just see the object nor just our hands, but the object and our hands in relation to one another. The relation of the object to our bodies is also specified in more subtle ways. Thus our eye-height is specified by the "visible horizon" (Gibson 1979: 162-164). Objects extending above the horizon are higher than our own eye-level, and the proportion in which the horizon intersects an object corresponds to the height of that object relative to one's eye-height. Like the critical ratios for stepping on stairs and picking up cubes, these horizon relations, relative to the height of the point of observation, lack an objective metric: they are dimensionless and body-scaled. They directly relate to us, and are all the better for that: [...]
the "knowledge" of his height that comes to the observer simply from living in his body is both more fundamental and more meaningful to him than the knowledge communicated to him by a statement such as "X is Y feet long". (Sedgwick 1973: 47)6
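The horizon relation described above can be written out as a short worked equation. This is a sketch under stated assumptions rather than Gibson's or Sedgwick's own notation: the ground is taken to be level, the object stands on it, and the horizon lies at eye level. The symbols H (object height), E (eye-height), d (distance to the object), \beta (angle from the horizon down to the object's base) and \alpha (angle from the horizon up to its top) are introduced here purely for illustration.

```latex
% Assumed symbols, not in the original: H, E, d, \alpha, \beta as defined above.
\[
\tan\beta = \frac{E}{d}, \qquad \tan\alpha = \frac{H - E}{d}
\quad\Longrightarrow\quad
\frac{H}{E} = 1 + \frac{\tan\alpha}{\tan\beta}.
\]
```

The distance d cancels, so the ratio H/E is dimensionless: the object's height is given in units of the observer's own eye-height, with no external metric required.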
Complementarily, we develop, and sustain, a relatively stable sense of the limits and capacities of our own bodies-in-relation-to-the-world in the very course of our activities (Stins, Kadar and Costall 2001). Interestingly (if not surprisingly), this sense of our own bodies can be disrupted during periods of rapid growth (Heffernan and Thomson 1999).
6. The variety of "anthropocentric" measurements of length is remarkable, being based on the size of the hand, the length or width of the fingers, the thumbnail, the fist and outstretched thumb, the foot, the pace of the legs, and so on (Klein 1975; Connor 1987).
5.3.
Proprioception
In traditional theory, perceiving the world and perceiving oneself have been regarded as quite separate issues. Perceiving the world was supposed to be mediated by "exteroceptors" in the eyes, ears, nose and skin, and perceiving the self was supposed to be achieved through specialized "interoceptors" within our muscles, joints and the semicircular canals within the inner ear. Gibson regarded this division - based on a dualism of the objective and subjective - as simply mistaken: "The awareness of the world and one's complementary relations to the world are not separable" (Gibson 1979: 141). As we have seen, this complementarity is fundamental (if often tacit) in Gibson's discussion of the perception of affordances: perceiving what an object affords implicates a "subject" with certain motives and capacities. This matter of "reciprocity" of object and subject is focal, however, in Gibson's treatment of the proprioceptive function of vision: "our awareness of being in the world" (Gibson 1979: 239). The "external sense" of vision is richly informative about ourselves, and our relation to our surroundings (cf. also Butterworth 1995; Neisser 1994).7 The point is central to Gibson's account of "optic flow", the deformations of optical structure that derive from our own movements within the world. The flow structure is, Gibson argued, the basis for our "awareness of movement or stasis, of starting and stopping, of approaching or retreating, of going in one direction or another, and the imminence of an encounter" (Gibson 1979: 236). Similarly, the "visible horizon" expresses the reciprocity of self and environment: it "is neither subjective nor objective" (Gibson 1979: 164). Then, of course, we can literally see ourselves in the world. As I am typing this chapter, I can see my hands and arms, trunk and legs extending in front of me, and also, though less distinctly, the frames of my glasses, eye-brows and nose. Gibson included the famous illustration from Ernst
Mach's Analysis of Sensations (1885 [1959]) in all of his three main books (Gibson 1950, 1966, 1979). It is a "view" of a man's office that includes the "visible ego" of the observer himself, including his nose and the end of his impressively long moustache. Now, as Hans Lübbe has noted, the implications of this simple image were, indeed, radical: This drawing is an illustrated criticism of the divorce of subject from object which renders theory of perception quite incapable of relating one to the other. The drawing demonstrates how, even in the simple process of seeing, there is no phenomenal world in which the subject itself is not already present, and that there is no subject which is not already present in the world. (Lübbe 1960 [1978]: 115.)
7. Compare Merleau-Ponty (1993: 37): "Every localization of objects in the world presupposes my locality. In a sense, an object of perception continuously speaks to us of ourselves. As incarnate subjects, we are expressed by the object."
In a neat switch, Gibson also drew our attention to what we cannot see - namely, ourselves restricting our own view: Ask yourself what it is that you see hiding the surroundings as you look out on the world - not darkness, surely, not air, not nothing, but the ego! (Gibson 1979: 112)
6.
Conclusion
In this chapter, I have tried to explain the importance of Gibson's critique of the passive and atrophied body behind much of psychological theory, and his attempt to provide a radically different kind of psychology based, not on "stimuli" and "responses", but on an ecology of agency - in other words, on the material resources for our effective and collective being in the world. Many of those who have become discontented with the restrictive cognitivist or representationalist thinking (cf. Johnson and Rohrer this volume) that pervades the human sciences have been attracted to Gibson's challenge to the dualisms of subject and object, and of matter and meaning, that constrain much of modern - and post-modern - thought. But, as I have been arguing for some time now, we need to be wary of a fundamental inconsistency in Gibson's work: a vacillation between his insistence upon the mutuality of organism and environment and his retreat into a standard kind of realism that would exclude us from the world to be known (e.g. Costall 1981, 1989, 1995, 2003, 2004; Costall and Still 1989). It is the realist version of Gibson that dominates not just the introductory textbooks (Costall and Morris 2004), but also the advanced literature. On this view, Gibson is indeed committed to a universalistic account of the environment, including affordances and information. Thus, even Lakoff (1987: 216), someone highly appreciative of Gibson's emphasis upon embodiment, has claimed that "the Gibsonian environment is monolithic and self-consistent and the same for all people", and the ecological approach cannot make sense of "experiential or cultural categories". But these supposed limitations only follow if we lose sight of the principle of mutuality, and suppose, as many of Gibson's closest followers do, that "animals or humans do not enter the picture, except as a scale factor" (Mace 1977: 50 [emphases added]). The mutualist alternative is to bring the animated body, the embodied agent, squarely into the picture. To repeat the crucial passage from Gibson that I used earlier: [Affordances] have unity relative to the posture and behavior of the animal being considered. So an affordance cannot be measured as we measure in physics. (Gibson 1979: 127-128 [emphases added])
This principle of relativity holds no matter how historically-specific or indeed idiosyncratic the activity happens to be. Indeed, it is likely that a good deal of what Gibson himself had to say about the visual control of locomotion could be most relevant to forms of high speed movement never experienced by humans before the relatively recent introduction of trains, cars and planes. Nevertheless, materiality still matters, and what the "later Gibson" in his mutualist mode was opening up was a view of materiality and culture as interpenetrating, rather than in opposition (see also Ingold 1996, 2000). Gibson was certainly important for his critique of the dualisms fundamental to so much of psychology and social theory. But as I have been trying to explain, he has also provided us with promising conceptual resources for going beyond those dualisms. There would be no point however in taking our existing unworldly conceptions of the social, the cultural and the symbolic and trying to tag them onto the body as presented to us by the realist Gibson. Although that body is no longer the passive stimulus-response machine of standard psychological theory, it is abstracted from any specific circumstances or history or identity. We would be left with more of the same: the opposition of the "human world" and the "natural world" that keeps leading the human sciences to suppose that the social and the cultural exist solely within our individual or collective heads, and that symbolism could be nothing more than the representation of a separate and inherently meaningless world (e.g. Ross 2004: 65).
The idea that we do not belong to the natural order of things has a very long past, extending well beyond Descartes, and it has been central to the metaphysics of modern science. Gibson is important because he has presented us with an alternative mutualist scheme in which we can at last see ourselves as part of nature - part of what nature has, and will, become. Affordances and information are, as Gibson claimed, "in" the world, but only in the sense that we are too. It is not just a question of spatial location. As John Dewey (1958: 295) insisted, we do not exist in the world "as marbles are in a box" but rather "as events are in history, in a moving, growing never finished process".
References
Arnheim, Rudolph 1969 Art and Visual Perception. (Rev. ed.) London: Faber. [First published in 1956.]
Avramides, Anita 2001 Other Minds. London: Routledge.
Baker, Gordon and Morris, Katherine J. 1996 Descartes' Dualism. London: Routledge.
Ben Zeev, Aaron 1993 The Perceptual System: A Philosophical and Psychological Perspective. New York: Peter Lang.
Berkeley, George 1709 [1975] Philosophical Works including the Works on Vision. [Introduction and notes by M.R. Ayers.] London: Dent.
Bruce, Vicki and Patrick Green 1990 Visual Perception: Physiology, Psychology and Ecology. (2nd edition) Hove, UK: Erlbaum.
Bruner, Jerome, Jacqueline Goodnow and George Austin 1956 A Study of Thinking. New York: Wiley.
Butterworth, George 1995 An ecological perspective on the origins of self. In: José L. Bermúdez, Naomi Eilan and Anthony Marcel (eds.), The Body and the Self, 87-105. Cambridge, MA: MIT Press.
Cavanagh, Patrick 1999 Top-down processing in vision. In: Robert A. Wilson and Frank C. Keil (eds.), The MIT Encyclopedia of the Cognitive Sciences, 844-845. Cambridge, MA: MIT Press.
Connor, Robert D. 1987 The Weights and Measures of England. London: Her Majesty's Stationery Office (Science Museum).
Costall, Alan 1981 On how so much information controls so much behaviour. In: George Butterworth (ed.), Infancy and Epistemology, 30-51. New York: St. Martin's Press/Brighton: Harvester Press.
Costall, Alan 1989 A closer look at 'direct perception'. In: Angus Gellatly, Don Rogers and John A. Sloboda (eds.), Cognition and Social Worlds, 10-21. Oxford: Clarendon Press.
Costall, Alan 1991 Graceful degradation: Cognitivism and the metaphors of the computer. In: Arthur Still and Alan Costall (eds.), Against Cognitivism: Alternative Foundations for Cognitive Psychology, 151-170. New York: Harvester-Wheatsheaf.
Costall, Alan 1995 Socializing affordances. Theory and Psychology 5: 467-481.
Costall, Alan 2003 From direct perception to the primacy of action: A closer look at James Gibson's ecological approach to psychology. In: Gavin J. Bremner and Alan M. Slater (eds.), Theories of Infant Development, 70-89. Oxford: Blackwell.
Costall, Alan 2004 From Darwin to Watson (and Cognitivism) and back again: The principle of animal-environment mutuality. Behavior and Philosophy 32: 179-195.
Costall, Alan and Paul Morris 2004 The Textbook Gibson: Misrepresentations of an anti-representationalist. European Workshop on Ecological Psychology, Torri del Benaco, Lake Garda, Verona, Italy, 26-29 June, 2004.
Costall, Alan and Arthur Still 1989 Gibson's theory of direct perception and the problem of cultural relativism. Journal for the Theory of Social Behaviour 19: 433-441.
Cottingham, John 1992 Cartesian dualism: Theology, metaphysics and science. In: John Cottingham (ed.), Cambridge Companion to Descartes, 236-257. Cambridge: Cambridge University Press.
Cowie, Roddy 1993 On acting in order to see things better. Paper presented at the Workshop on "The Primacy of Action", International Society for Ecological Psychology, University of Manchester, 13-14 September, 1993.
Danziger, Kurt 1983 The schema of stimulated motion: Towards a pre-history of modern psychology. History of Science 21: 183-210.
Descartes, René 1960 Meditations. In: Discourse on Method. [A. Wollaston, trans.] London: Penguin Books. Original 1637.
Descartes, René 1960 Discourse on Method. [A. Wollaston, trans.] London: Penguin Books. Original 1641.
Descartes, René 1971 Correspondence with Princess Elizabeth. In: Elizabeth Anscombe and Peter Geach (eds.), Descartes: Philosophical Writings, 274-282. Indianapolis: Bobbs-Merrill. Original 1643.
Descartes, René 1966 On mechanism in human action [translated by Tamar March from L'Homme]. In: Richard Herrnstein and Edwin Boring (eds.), A Source Book in the History of Psychology, 266-272. Cambridge, MA: Harvard University Press. Original 1664.
Dewey, John 1958 Experience and Nature. New York: Dover. [Based on the Paul Carus lectures of 1925.]
Ellen, Roy 1996 Introduction. In: Roy Ellen and Katsuyoshi Fukui (eds.), Redefining Nature: Ecology, Culture and Domestication, 1-36. Oxford: Berg.
Gesell, Arnold, Frances L. Ilg and Louise Bates in collaboration with Glenna E. Bullis 1977 The Child from Five to Ten. (Rev. ed.) New York: Harper and Row.
Gibson, James 1950 The Perception of the Visual World. Boston: Houghton Mifflin.
Gibson, James 1958 An interpretation of Woodworth's theory of perceiving. In: S. Georgene and John P. Seward (eds.), Current Psychological Issues: Essays in Honour of Robert S. Woodworth, 39-52. New York: Holt.
Gibson, James 1959 Perception as a function of stimulation. In: Sigmund Koch (ed.), Psychology: A Study of a Science, Vol. I, 456-501. New York: McGraw-Hill.
Gibson, James 1962 Observations on active touch. Psychological Review 69: 477-491.
Gibson, James 1966 The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin.
Gibson, James 1967 Autobiography. In: Edwin Boring and Gardner Linzey (eds.), A History of Psychology in Autobiography, Vol. 5, 127-143. New York: Appleton-Century-Crofts.
Gibson, James 1979 The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Gregory, Richard 1997 Eye and Brain: The Psychology of Seeing, 5th ed. Princeton, NJ: Princeton University Press.
Hall, Stuart 1997 Representation: Cultural Representations and Signifying Practices. London: Sage.
Harré, Rom 2002 Cognitive Science: A Philosophical Introduction. London: Sage.
Hebb, Donald 1960 The American revolution. American Psychologist 15: 735-745.
Heffernan, Dorothy and Thomson, James 1999 Gone fishin': perceiving what is reachable with rods during a period of rapid growth. In: Dorothy Heffernan and James A. Thomson (eds.), Studies in Perception and Action V, 223-228. Mahwah, NJ: Erlbaum.
Heft, Harry 1989 Affordances and the body: An intentional analysis of Gibson's ecological approach to visual perception. Journal for the Theory of Social Behaviour 19: 1-30.
Heft, Harry 2001 Ecological Psychology in Context: James Gibson, Roger Barker and the Legacy of William James's Radical Empiricism. Mahwah, NJ: Lawrence Erlbaum Associates.
Hirst, William 1988 Preface. In: William Hirst (ed.), The Making of Cognitive Science: Essays in Honor of George A. Miller, i-xi. Cambridge: Cambridge University Press.
Ikegami, Takashi and Jordan Zlatev this vol. From pre-representational cognition to language.
Ingold, Tim 1996 Situating action V: The history and evolution of bodily skills. Ecological Psychology 8: 171-182.
Ingold, Tim 2000 The Perception of the Environment: Essays in Livelihood, Dwelling and Skill. London: Routledge.
Johnson, Mark and Tim Rohrer this vol. We are live creatures: Embodiment, American Pragmatism and the cognitive organism.
Klein, H. Arthur 1975 The World of Measurements. London: George Allen and Unwin.
Lakoff, George 1987 Women, Fire and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press.
Leahey, Thomas 1992 The mythical revolutions of American psychology. American Psychologist 47: 308-318.
Lübbe, Hans 1978 Positivism and phenomenology: Mach and Husserl. [Trans. A. L. Hammond.] In: Thomas Luckman (ed.), Phenomenology and Sociology, 90-118. London: Penguin Books. [Article first published in 1960.]
Mace, William 1977 James J. Gibson's strategy for perceiving: Ask not what's inside your head, but what your head's inside of. In: Robert Shaw and John Bransford (eds.), Perceiving, Acting and Knowing: Toward an Ecological Psychology, 43-65. Hillsdale, NJ: Erlbaum.
Mach, Ernst 1885 The Analysis of Sensations and the Relation of the Physical to the Psychical. New York: Dover. Reprint 1959.
Macnamara, John 1999 Through the Rearview Mirror: Historical Reflections on Psychology. Cambridge, MA: MIT/Bradford Books.
Mark, Leonard 1987 Eyeheight-scaled information about affordances: A study of sitting and stair climbing. Journal of Experimental Psychology: Human Perception and Performance 13: 361-370.
Merleau-Ponty, Maurice 1965 The Structure of Behaviour. [Trans. Alden L. Fisher.] London: Methuen. [Original French edition published in 1942.]
Merleau-Ponty, Maurice 1993 The experience of others (1951-1952). (F. Evans and H. J. Silverman, trans.) In: Keith Hoeller (ed.), Merleau-Ponty and Psychology, 33-63. New Jersey: Humanities Press. [First English translation of Merleau-Ponty's 1951-1952 lecture course L'Expérience d'autrui, given at the University of Paris (Sorbonne).]
Miller, George, Eugene Galanter and Karl Pribram 1960 Plans and the Structure of Behavior. New York: Holt, Rinehart and Winston.
Neisser, Ulrich 1994 Self-perception and self-knowledge. Psyke and Logos 15: 392-407.
Neisser, Ulrich 1997 The future of cognitive science: An ecological analysis. In: David M. Johnson and Christina E. Erneling (eds.), The Future of the Cognitive Revolution, 247-260. New York: Oxford University Press.
Newell, Karl, Diedre Scully, F. Tenenbaum and S. Hardiman 1989 Body scale and the development of prehension. Developmental Psychobiology 22: 1-13.
Pylyshyn, Zenon 1986 Computation and Cognition. Cambridge, MA: MIT Press.
Reed, Edward 1988 James Gibson and the Psychology of Perception. New Haven: Yale University Press.
Reed, Edward 1997 The cognitive revolution from an ecological view. In: David M. Johnson and Christina E. Erneling (eds.), The Future of the Cognitive Revolution, 261-273. New York: Oxford University Press.
Richardson, Robert C. 1982 The "scandal" of Cartesian interactionism. Mind 91: 20-37.
Roca, Josep, Martinez, Lizandra Mireia, Anna Fabregas Mireia and Anna Cordoner 1986 Registres evolutius motors: Una observació crítica. Apunts 6: 61-64.
Ross, Norbert 2004 Culture and Cognition: Implications for Theory and Method. Thousand Oaks, CA: Sage.
Schiffer, Michael Brian (with Andrea R. Miller) 1999 The Material Life of Human Beings: Artifacts, Behavior and Communication. London: Routledge.
Sedgwick, Howard 1973 The visible horizon: A potential source of visual information for the perception of size and distance. Unpublished doctoral dissertation, Cornell University, January 1973.
Still, Arthur and Alan Costall (Eds.) 1991 Against Cognitivism. New York: Harvester Wheatsheaf.
Stins, John, Endre Kadar and Alan Costall 2001 A kinematic analysis of hand selection in a reaching task. Laterality 6: 347-367.
Straker, Stephen 1976 The eye made "other"; Dürer, Kepler and the mechanisation of sight and vision. In: Louis A. Knafla, Martin S. Staum and T. H. E. Travers (eds.), Science, Technology and Culture in Historical Perspective, 7-25. University of Calgary Studies in History, No. 1.
Szokolosky, Agnes 2004 An interview with Eleanor Gibson. Ecological Psychology 15: 271-281.
Thurstone, Louis 1923 The stimulus-response fallacy in psychology. Psychological Review 30: 354-369.
Tibbetts, Paul 1973 Historical note on Descartes' psychophysical dualism. Journal of the History of the Behavioral Sciences 9: 162-165.
Toulmin, Stephen 1993 From clocks to chaos: Humanizing the mechanistic world view. In: Herman Haken, Anders Karlqvist and Uno Svedin (eds.), The Machine as Metaphor and Tool, 139-153. Berlin: Springer-Verlag.
Warren, Howard 1922 Elements of Human Psychology. Boston: Houghton Mifflin.
Warren, William and S. Whang 1987 Visual guidance of walking through apertures: Body-scaled information for affordance. Journal of Experimental Psychology: Human Perception and Performance 13: 371-383.
Warren, William, Jr. 1984 Perceiving affordances: Visual guidance of stair climbing. Journal of Experimental Psychology: Human Perception and Performance 10: 683-703.
Warren, William, Jr. 1995 Constructing an econiche. In: John Flach, Peter Hancock, Jeff Caird and Kim Vicente (eds.), Global Perspectives on the Ecology of Human-Machine Systems, Vol. 1, 210-237. Hillsdale, NJ: Erlbaum.
Westland, Gordon 1978 Current Crises in Psychology. London: Heinemann.
Williams, Emma, Alan Costall and Vasu Reddy 1999 Children with autism experience problems with both objects and people. Journal of Autism and Developmental Disorders 29: 367-378.
Wilson, Margaret 1980 Body and mind from the Cartesian point of view. In: Robert W. Rieber (ed.), Body and Mind: Past, Present and Future, 35-55. New York: Academic Press.
Ziemke, Tom 2001 The construction of 'reality' in the robot. Foundations of Science 6 (1): 163-233.
From the meaning of embodiment to the embodiment of meaning: A study in phenomenological semiotics
Göran Sonesson

A Qualisign [...] cannot actually act as a sign until it is embodied; but its embodiment has nothing to do with its character as a sign. A Sinsign [...] involves a qualisign, or rather, several qualisigns. But these qualisigns are of a peculiar kind and only form a sign through being actually embodied.
Charles S. Peirce, Nomenclature and Divisions of Triadic Relations
Abstract
Unlike much of the contemporary discussion of embodiment, phenomenology is really involved with the body as a kind of meaning appearing to consciousness; and it does not only attend to the body of the biological organism, but also to the kind of organism-independent artefacts which are required by some sign systems. Because it is concerned with meaning, phenomenology is akin to semiotics. From the point of view of the latter discipline, however, signs must be distinguished from other meanings, and clear criteria are needed for doing so. At least one such criterion can be found in the work of Piaget: differentiation. Meaning in the more general sense of organisation and selection is at the basis of the common sense world, and thus accounts for what is known in Cognitive Linguistics as "image schemas". Cognitive Linguistics, just as biosemiotics, ignores this important distinction. Moreover, some cognitive linguists seem to deny the distinction between organism and environment, which must prevail if "image schemas" are to be acquired, along the lines of earlier conceptions of schematisation. On the basis of these considerations, a developmental sequence can be suggested going from schemas to signs and organism-independent artefacts.
Keywords: body, ecology, embodiment, evolution, Lifeworld, memory, phenomenology, picture, semiotic function, semiotics, sign.
1. Introduction
In our time, in which the term "embodiment" is put to quite new (and, to my mind, either fuzzy or redundant) uses, authors such as Johnson (1987) and Varela, Thompson and Rosch (1991) have not failed to suggest a continuity with an earlier discussion of embodiment, taking place about a century ago, notably within phenomenological philosophy (e.g., Husserl 1973b; 1976). Yet these references to phenomenology seem to me to be fairly superficial, and the grasp of the phenomenological notion of embodiment shown often appears to be incomplete, if not inadequate. This is why I will start out by explaining the emergence of the problem of embodiment within phenomenological philosophy. Taking a cue from the phenomenologists themselves, I will also suggest that phenomenology may be interpreted as a branch of psychology, and thus serve as an ingredient of cognitive science as well as a basis for semiotic theory. From there on, my search for the multiple "bodies of the mind" will follow a somewhat spiralling movement: first, I will argue that the concept of sign or representation, which I take to be indispensable for our understanding of human consciousness, supposes something of a body of its own. Then we will see how meaning, which is not specifically embodied in signs, is a requisite, in both a systematic and an evolutionary sense, for the attainment of the sign function (Piaget 1945; Sonesson 1992b). I will go on to suggest that what is elsewhere known as "image schemas" (e.g. Johnson 1987; Lakoff and Johnson 1999, Johnson and Rohrer this volume) do indeed constitute a level of meaning prior to the sign but, for that very reason, are not directly involved in metaphors, which, in my view, and that of the tradition of classical rhetoric, must be construed as signs, and indeed signs standing for other signs (cf. Sonesson 1989, 1998b). 
Finally, we will look at embodiments of meaning in a rather different sense, of the kind which develops, phylogenetically and perhaps also ontogenetically, after the attainment of the (linguistic) sign, such as pictures, writing, and theories, that is, organism-independent sign-vehicles spanning time and/or space. My aim is not to exhaust the repertory of embodiments of meaning, but merely to expound some of their varieties, and to pinpoint their different evolutionary import.
2. The Cartesian divide: Where angels fear to tread
In the philosophical tradition, embodiment emerges as a problem within the philosophy of consciousness, which aims to reconstruct the world as given to a (generic) subject. In this sense, embodiment gives rise to two separate strands in the particular version of the philosophy of consciousness inaugurated by Husserl, known as phenomenology: first, in relation to the physical body of the subject itself and/or his or her counterpart in perceptual space, the generic other; second, in relation to signs and other overarching structures, which, like the physical body, appear in the mind, without being of the mind, and seem to require some kind of physical substratum in order to exist.
2.1. Phenomenology from the phenomenological point of view
The justification for a philosophy of consciousness is of course that in the common sense world, which Husserl later was to baptise the Lifeworld, everything there is is accessible to us through consciousness. The paradox is that, at the same time, the body, our own, as well as that of the other, cannot be a mere figment of consciousness. To paraphrase the classical dictum of 19th century psychology reemerging in the modern discussion of consciousness (cf. Dennett 1991), the body is not a mere epiphenomenon of consciousness. Indeed, this transcendence of our physical being to consciousness is itself part of the Lifeworld. As Max Scheler (quoted by Gurwitsch 1985) nicely put it, "we know that we are no angels", that is, no free-floating spirits without bodies. The second strand is quite different: genuine semiotic structures such as mathematical concepts, logic, and even language appear to transcend consciousness much in the mode of a Hegelian "absolute spirit". They are, in Husserlean terms, "idealised" in order to be detached from their dependence on individual subjects - which is why they may harbour what Deacon (2003) has recently called "semiotic constraints", whose origin is independent of both nature and nurture. And yet, as Husserl (1962a: 365-386) recognised in his study of the origin of geometry, for the idealisation to be complete, its products have to be "embodied" in some kind of notational system, because only in that way can they gain a stable, public existence in a domain completely separate from their instantiations in the practical
situations of the Lifeworld. More recently, thinkers from separate traditions such as Ivins (1953), Innis (1950), and Donald (1991) have regained this insight in some form or other. The task of phenomenology, as Husserl saw it, was to explain the possibility of human beings having knowledge of the world; as a philosophical endeavour, phenomenology is about the way the world of our experience is "constituted". As a contrast, psychology is not about the world, but about the subject experiencing the world. However, every finding in phenomenological philosophy, Husserl claims, has a parallel in phenomenological psychology, which thus could be considered a tradition within psychological science (cf. Husserl 1962b; Gurwitsch 1974). If consciousness is a relation connecting the subject and the world, then phenomenology is concerned with the objective pole and psychology is about the subjective one. It is often forgotten that Husserl not only inspired but himself was inspired by the Gestalt psychologists. Close followers of Husserl such as, most notably, Gurwitsch (1957, 1966), were as much involved with phenomenological psychology as with philosophy and discussed the findings not only of the psychology of perception but of contemporary contributors to neurobiology such as Gelb and Goldstein. Also the early Merleau-Ponty (1942, 1945),1 was, in this respect, an exponent of phenomenological psychology. Many of those who are concerned with embodiment today appear to come from the diametrically opposite camp. Edelman (1992), for instance, clearly does not discover the body from the horizon of consciousness, but quite the opposite, he implies that the mind cannot be divorced from the body. In a sense, this is hardly controversial: unlike those hypothetical angels, human beings can only boast a mind as long as they have a body. But, if this is true in the order of existence, it is not necessarily so from the point of view of investigation.
After all, Brentano (1885) did not use a scalpel, much less fMRI, to discover the property of intentionality (in the sense of directedness), which Edelman recognises as an irreducible characteristic of consciousness; nor did James (1890) find any of those "Jamesian properties" of consciousness repeatedly mentioned by Edelman in such a way. Indeed, far from being "a deliberately non-scientific set of reflections on consciousness and existence" (Edelman 1992: 159), phenomenology started out from the fact of intentionality and attempted to probe ever
1. Who may not quite deserve the hero status given to him by Varela, Thompson and Rosch (1991); See also Gallagher, this volume.
deeper into its ramifications, in order to rediscover and amplify those very Jamesian properties of consciousness mentioned by Edelman. Husserl and Gurwitsch may have been wrong to think of phenomenology as a discipline completely separate from biology and psychology, but the relative disconnection of phenomenological reflections, like those of Brentano and James, from biological knowledge has no doubt borne rich intellectual fruit. If "a biologically based theory of mind" can in some respects "invigorate" phenomenology, the opposite is certainly just as true. Interestingly, Edelman (1992; Edelman and Tononi 2000) claims that consciousness as such cannot be a spurious occurrence, because that would not have made evolutionary sense. That is, consciousness is not an epiphenomenon. But we have seen that, to classical embodiment philosophy, the problem is to show that the body is not an epiphenomenon.
2.2. The science of common sense and its operations
The apparent paradox arises because, in the two cases, the point of view is entirely different. Phenomenology, like the science of semiotics, takes as its point of departure the way things make sense to us, that is, how they mean. In this very broad sense phenomenology accomplishes a semiotical reduction: things are considered only from the point of view of their having meaning to us (where "we" might be people of a particular culture or subgroup, or humankind in general).2 From a phenomenological point of view, there is, in a sense, no way of overcoming the divide formulated by Descartes, for Descartes did not invent it: it is intrinsic to that phenomenon which, in Descartes' own words, is the most widely distributed one in the world, common sense. Common sense is not notorious for being right, but if we ask ourselves how the body (and the rest of the world) makes sense to us, then common sense is our very subject matter. Even so, common sense gives rise to an apparent contradiction: my body is necessarily experienced through my consciousness, but in my consciousness it is experienced as
2. Elsewhere (Sonesson 1989: 26ff), I have opposed, in this sense, the qualitative reduction to the more familiar quantitative one, characteristic of the traditional natural sciences. There are similarities, but also differences, to the series of "reductions" distinguished by Husserl: the phenomenological and eidetic reductions, notably.
being outside of it.3 All post-Cartesian meditations, including those of Husserl (1973a) and those of Merleau-Ponty (1945), have been concerned to account for this paradox. To do so, it is necessary to accomplish a painstaking analysis (of which there can be no better example than the posthumous papers of Husserl himself, together with the - also largely posthumous - works of Peirce) of all those structures of the mind that are normally at the margin of consciousness (cf. below 5.2). In this sense, all human and social sciences which aspire to discover regularities, such as linguistics and other semiotic sciences, necessarily start out from phenomenology - and we should be happy if those phenomenological investigations sometimes manage to be as meticulous as those of Husserl and Gurwitsch. Saussure famously observed that "linguistics and the other semiological sciences" are so difficult, because they are not concerned with anything material: indeed, he continued, their subject matter is the point of view we take on material things. Starting from this principle, Prieto (1975a: 140ff, 1975b: 215ff) has claimed that, contrary to what is ordinarily taken for granted, it is natural science which is subjective, since it has to take a stand on physical reality, which as such is indifferent, whereas semiotics is capable of objectivity, in so far as it describes the subjective point of view of individuals and communities. According to another formulation, the object of linguistics is the knowledge common to the speaker and hearer (1975a: 110), i.e. it produces knowledge about knowledge, not, as the natural sciences, about the material world (1975a: 140ff). Prieto thus postulates a simple coincidence between the object and the discourse of semiotics. It is, however, less the phoneme, than the features defining it, which are relevant to linguistics, and these are not ordinarily identified by the speaker.
In more recent linguistics, it is the "deep structure" or the "image schemas" which are claimed to be relevant for linguistic knowledge, not the particular syntactic form or stylistic turn, of which the speaker is usually aware. We therefore conclude that the linguist, and the semiotician generally, may have to descend at least one level of analysis below the ultimate level of which the user is aware. Put into traditional epistemological terms, we may say that after coinciding with the user in his understanding of the phoneme, the semiotician
3. Strictly speaking, this is not the problem of our own body, nor of the other, but the more general one of the external world, as pointed out by Gurwitsch (1979: 26f). Still, it is quite sufficient for us to note that it also applies to the body.
goes on to explain the conditions of possibility of this understanding on the level of distinctive features. In this case, semiotics contains the knowledge of the user and something more, and, quite apart from the problem of obtaining the correct understanding, this explicative part introduces an element of subjectivity. We shall say that what is of primary importance to semiotics is operative knowledge, i.e. knowledge that must exist at some, probably low, level of awareness, in order to render behaviour understandable (and thus explainable); thus, it is not discursive knowledge, the spontaneous theories of the user, which might be what we first tend to identify with common sense. The operation of ideation, familiar to the phenomenologist, the commutation test of structuralist linguistics, the grammaticality or acceptability judgement of the grammarian, and some varieties of psychological experimentation are all techniques for attaining these layers, bringing that which is at the margin of consciousness into its centre (cf. Sonesson 1989: 27ff; and see Zlatev, this volume, for a similar argument). In phenomenological semiotics, then, we are concerned, in the first place, with the figure of the body as it appears on the horizon of consciousness. Once we have described this figure - better than James, Husserl, and so on - we may try to explain it, delving ever deeper into the margins of consciousness. We can of course try to search for explanations outside of consciousness, but we must be aware that this is a complete change of direction. Most contemporary theories of embodiment do not appear to pose the question of meaning. Varela, Thompson and Rosch (1991) start out from the phenomenology of Merleau-Ponty, but, after the first few pages, it is not really clear how the issues they discuss relate to the phenomenological problem of the body, i.e. the body as it appears to consciousness.
Lakoff and Johnson (1999: 102) distinguish three different levels of embodiment, which they refer to as "the neural level, phenomenological conscious experience and the cognitive unconscious", none of which, in the end, seems to have anything to do with meaning, as opposed to neurobiology.4 Both senses of embodiment characterised from a phenomenological perspective at the beginning of this section involve a process by which something not recognized as a body presents itself as being one: in the first case, a mind is being situated in the world; in the second case an idea
4. See Zlatev this volume for a discussion of whether these levels can reasonably be separated, and, in particular, of the problematic character of the "cognitive unconscious".
is being reified into an object publicly accessible to all. By denying the distinctions both between body and mind and expression and content, scholars such as Lakoff and Johnson deprive themselves of the very foundations needed by their own notion of "image schemas". To see this, however, we have to start by specifying the concepts of sign and schema.
3. Meaning embodied in signs
It is true of both main traditions of semiotics, the Saussurean and the Peircean, that they have never really offered any specific definition of the sign - by which I mean a set of criteria permitting us to separate meanings which are signs from other meanings. The same thing appears to apply to the notion of representation in cognitive science (cf. Sonesson 1992b, 2003a, 2003b, 2006, forthcoming). This goes a long way to explaining why many semioticians (such as Greimas, Eco, etc.) have rejected the sign, without much of an argument, and why the second generation of adepts to cognitive science (e.g. Lakoff and Johnson 1999; Johnson and Rohrer this volume) now seem to be doing the same thing with reference to the notion of representation. So before we can pose any questions about the psychological and evolutionary role of the sign concept, we have to be clear about what it is. This involves not only deciding the criteria for analysing a phenomenon of meaning into two separate parts, but also those allowing us to posit an asymmetrical relation between these parts: not only does the expression have to be separate from the content, but the former should stand for the latter, not the reverse.
3.1. From pebbles to feathers: The notion of differentiation
When Peirceans and Saussureans quarrel over the presence of two or three entities in the sign, they seldom pause to ask themselves what kind of objects, defined by what type of features, are involved. The whole question becomes moot if there is no reason to analyse meaning into two parts, as suggested by both contemporary cognitive scientists and old-time existentialists and Lebensphilosophen. What, then, is it that permits us to determine that an object endowed with meaning is made up of an expression, or "representamen", and a content, or "object" (where further instances of the Peircean version are not relevant)? Peirceans and Saussureans alike would
no doubt agree that signs have something to do with the classical formula, often quoted by Jakobson (1975), aliquid stat pro aliquo ("something in the place of something else"), or, as Jakobson also puts it, more simply, with renvoi, or reference. But this formula itself is vague or ambiguous. Before we can separate signs from other meanings, we have to spell out those criteria for something being a sign that are simply taken for granted, both in the Peircean and in the Saussurean tradition. This can be done by combining what Husserl says about appresentation (something which is directly present but not thematic refers to something which is indirectly present but thematic) and what Piaget says about the semiotic function (there is a differentiation between expression and content in the double sense, I take it, that they do not go over into each other in time and/or space, and that they are perceived to be of different nature). Phenomenology, which is not afraid of spelling out the self-evident, may offer some help here. Saint Augustine, who has often (as so many others) been hailed as the first semiotician, defined the sign as "a thing which, over and above the impression it makes on the senses, causes something else to come into thought as a consequence" (as translated by Deely 1982: 17ff). Husserl's (1913, 1939) own definition of the sign, which describes the expression as something which is directly perceived but not in focus, and the content as being indirectly perceived while at the same time being the focus of the relation, could be taken as a way of specifying the Augustinean suggestion.5 Piaget certainly abides by Saussure in opposing the sign to the symbol (where the latter is the motivated sign). What Piaget added to Saussure was most obviously a developmental perspective, in particular on the level of ontogeny. But, just as importantly, though it has seldom been observed (cf. 
Sonesson 1992b, etc.), he realised that not all meanings are signs or symbols, and he even began groping for a definition of that which accounts for the specificity of the sign. According to Piaget the sign function (which Piaget himself called first the symbolic, and then the semiotic function) is a capacity acquired by the child at an age of around 18 to 24 months, which enables him or her to imitate something or somebody outside the direct presence of the model, to use language, make drawings, play "symbolically", and have access to mental imagery and memory. The common factor underlying all these phenomena, according to Piaget, is the ability to represent reality by means of a signifier, which is distinct from the signified. Indeed, Piaget argues that the child's experience of meaning predates the sign function, but that such meaning does not suppose a differentiation of signifier and signified (see Piaget 1945, 1967, 1970). In several of the passages in which he refers to the sign function, Piaget goes on to point out that "indices" and "signals" are possible long before the age of 18 months, but only because they do not suppose any differentiation between expression and content. The signifier of the index, Piaget (1967: 134ff) says, is "an objective aspect of the signified"; thus, for instance, the visible extremity of an object which is almost entirely hidden from view is the signifier of the entire object for the baby, just as the tracks in the snow stand for the prey to the hunter. But when the child uses a pebble to signify candy, he is well aware of the difference between them, which implies, as Piaget tells us, "a differentiation, from the subject's own point of view, between the signifier and the signified" (ibid.). Piaget is quite right in distinguishing the manifestation of the sign function from other ways of "connecting significations", to employ his own terms. Nevertheless, it is important to note that, while the signifier of the index is said to be an objective aspect of the signified, we are told that in the sign and the "symbol" (i.e. in Piaget's terminology, the conventional and the motivated variant of the sign function, respectively) expression and content are differentiated from the point of view of the subject. Curiously, this distinction between the subjective and objective points of view is something Piaget seems to lose track of in his further discussion. 
5. These observations could be taken to imply that the content is "embodied" in the expression. Expression would stand to content as body to soul. This was explicitly suggested by Cassirer (1957: 100), but it is also hinted at in some passages by Peirce. The parallel is nonetheless, in my view, seriously flawed (as will be discussed in Section 3).
We can, however, imagine this same child that in Piaget's example uses a pebble to stand for a piece of candy having recourse instead to a feather in order to represent a bird, or employing a pebble to stand for a rock, without therefore confusing the part and the whole: then the child would be employing a feature, which is objectively a part of the bird, or the rock, while differentiating the former from the latter from his point of view. Only then would he be using an index, in the sense in which this term is employed in semiotics, that is, in (what this semiotician takes to be) the Peircean sense of the term. Contrary to what Piaget implies, the hunter, who identifies the animal by means of the tracks, and then employs them to find out which direction the animal has taken, and who does this in order to catch the animal, does not, in his construal of the sign, confuse the tracks with the animal itself, in which case he would be satisfied with the former. Both the
child in our example and the hunter are using indices, or indexical signs, where the "real" connection is transformed into a differentiation. On the other hand, the child and the adult will fail to differentiate the perceptual adumbration in which they have access to the object from the object itself; indeed, they will identify them, at least until they change their perspective on the object by approaching it from another vantage point. And at least the adult will consider a branch jutting out behind a wall as something that is non-differentiated from the tree, to use Piaget's example, in the rather different sense of being a proper part of it.6 In the Peircean sense an index is a sign, the relata of which are connected, independently of the sign function, by contiguity or by that kind of relation that obtains between a part and the whole (henceforth termed factorality). But of course contiguity and factorality are present everywhere in the perceptual world without as yet forming signs: we will say, in that case, that they are mere indexicalities. Perception is perfused with indexicality.7 An index, then, must be understood as indexicality (an indexical relation or ground, to use an old Peircean term) plus the sign function. Analogously, the perception of similarities (which is an iconic ground) will only give rise to an icon when it is combined with the sign function. Deacon (1997: 76ff) must therefore be wrong when he claims that camouflage in the animal world such as the moth's wings being seen by the bird as "just more tree" are essentially of the same kind as those "typical cases" of iconicity we are accustomed to call pictures. As always, there are passages in Peirce's work which may be taken in different ways, but it makes more systematic and evolutionary sense to look upon iconicity and indexicality as being only potentials for something being a sign which still have to be "embodied" (cf. Section 4). 
While the introduction of the notion of differentiation is a substantial accomplishment on the part of Piaget, he unfortunately never spells out its import. He defines differentiation in terms of the subject's point of view, but then uses examples in which the disconnection already exists objectively, as pointed out above. Objectivity can here, I take it, be identified
6. About proper parts, perceptual perspectives, and attributes as different ways of dividing an object and thus different indexicalities, cf. Sonesson (1989: I.2).
7. I am using "indexicality" here (just as "iconicity") in the sense of something which is necessary for a sign being an index (or an icon), but which cannot function "as a sign until it is embodied". See, in particular, Sonesson (1993a, 1998a, forthcoming).
with the common sense world, which the child, in Piagetean terms, is in the process of "constructing". Differentiation should not be identified with displacement as defined by Hockett (1977), which (rightly, no doubt) appears as one of the "design features" of language in most introductory textbooks. As in the case of the tracks left by the hunted animal, displacement may be a consequence of differentiation. But differentiation only comes into its own when the sign is in the presence of its referent, for then it allows us to construe reality in different ways ("subjectively", as Piaget would have said), picking out that which is relevant, and ignoring, or downplaying, other features. We must be careful not to confuse different relationships involving the sign. Differentiation, in Piaget's sense, must pertain to the signifier and the signified, which are always equally present in the here and now of the sign user, since they are mental (or, in some cases, intersubjective) entities. To the hunter, both the signifier and the signified of the tracks are present here on the ground (or, to be precise, on the ground as he perceives it). But the signified contains the information that it is itself only part of a larger whole (or rather something once contiguous to a larger whole) which was present here at an earlier time, but which is now elsewhere, more precisely in the direction indicated by the tracks. And the displacement, in Hockett's sense, has taken place between that signified whole and the real animal, which is now present somewhere else. When the sign, whether it is a stretch of discourse, a picture, or an animal track, is present along with the referent, however, the signified allows us to refocus the referent, in other words, to present it in a particular perspective. For this the sign requires independence: that is to say, a "body" of its own.
3.2. Some other ways of "connecting significations"
As presented here, the concept of sign (or representation) does not include ordinary perception: our way of being in the world is not to be likened to the presence at some kind of private theatre. Second generation cognitive scientists (cf. Johnson and Rohrer this volume) are therefore quite right in rejecting the notion of representation of their forbears. They are wrong, however, to reject all kinds of representation (to the extent that it corresponds to the sign function). More fundamentally, they commit a serious error by not attending to the definition of representation before rejecting it
altogether. A few notions of history may help us to disengage ourselves from the present-day conceptual muddle. As was noted above, Augustine seems to have been responsible for making explicit the common sense notion of sign on which later thinkers such as Saussure and Husserl (and, at least in his definitions, Peirce) are tacitly building: it is, he tells us (in the convenient paraphrase of Deely 1994: 58) "something which, on being perceived, brings into awareness another besides itself". Thomas Aquinas already had some misgivings about this definition, without ever daring to reject it outright. The followers of Aquinas in Paris may have been somewhat bolder. In a written form which has come down to us, however, we first know this criticism from the works of Pedro da Fonseca, who was active in Coimbra, Portugal, in the 16th century. To Fonseca and his followers, the definition of the sign must be considerably broader: a sign is anything which serves to bring into awareness something different from itself, whether the sign (in the sense of the signifier) itself becomes subject to awareness in the process or not. If the sign itself does not have to be perceived in order for us to come to an awareness of that which is signified, Fonseca described it as being formal; but if the sign cannot lead to the awareness of anything at all unless it is itself perceived, he called it instrumental (cf. Deely 1982: 52ff, 1994: 58ff). Thus, Fonseca pointed to a distinction, which seems to have been lost by latter-day semioticians and cognitive scientists. What is here called a formal sign clearly is that which we, following Husserl and Brentano, but also Edelman, have described as the fundamental trait of consciousness, intentionality, that is, the property of being directed to that which is outside of consciousness. In fact, when closely considered, Fonseca's observations really go against the grain of the identification of our awareness of the world with the sign. 
It echoes Husserl's as well as Gibson's description of the perceptual act as something which points beyond itself without itself being present to consciousness (cf. Sonesson 1989: III.3.2). Indeed, when Gibson (1978: 228) observes that, when we are confronted with the cat from different points of view, etc., what we really see is all the time the same invariant cat, he actually recovers the central theme of Husserlian phenomenology, according to which the object is entirely, and directly, given in each of its perspectives or noemata (see Husserl 1939, 1962a, 1962b, 1973b; Sonesson 1989: I.2.2). In a similar fashion, Husserl's favourite example is a cube which can be observed from different sides. In Gibsonean terms, these are "the surfaces of the world that can
be seen now from here" (Gibson 1978: 233). Husserl's cube and Gibson's cat instantiate the same phenomenal fact. Just as Husserl called into question the conception of his contemporary Helmholtz, according to which consciousness is like a box within which the world is represented by signs and images, from whose fragmentary pieces we must construct our perceptions (cf. Küng 1973), so Gibson's strawmen are the followers of Helmholtz, the so-called "constructionists" (who have recently re-emerged within cognitive science; cf. Hoffman 1998), who claim that hypotheses are needed to build up perceptions from the scattered pieces offered us by sensation (cf. Sonesson 1989: III.3.3).8 Husserl rejected the picture metaphor of consciousness, showing Brentano and Helmholtz to be in error in their very conception of pictures and other signs, because they ignored the transparency of the expression to the content. Gibson (1978) instead emphasises the dissimilarity of the picture from a real-world scene, thus showing numerous experiments using pictorial stimuli to study normal perception to be seriously misguided. To both Husserl and Gibson, normal perception gives direct access to reality; pictures, however, constitute a kind of indirect perception to Gibson, while to Husserl (1980) they are "perceptually imagined" (cf. Sonesson 1989: III.3.6, forthcoming). To perceive surfaces is a very different thing from perceiving marks on surfaces, Gibson (1980) maintains. Depth is not added to shape, but is immediately experienced. In fact, the perception of surfaces, of their layout, and of the transformations to which the latter are subjected, is essential to the life of all animal species, but the markings on these surfaces have only gained importance to man, notably in the form of pictures (Gibson 1980: xii, 1978: 229). Surfaces have the kind of meaning which Gibson elsewhere calls "affordances"; the markings on surfaces, however, have "referential meaning". 
Without discussing the exact import that should be given to the term "affordance" (cf. Costall this volume), we may safely conclude that "referential meaning" is a property of what we have called the sign function. That is, surfaces do not stand for other surfaces, but the markings on surfaces may possibly do so. The pattern of a surface and the pattern on a surface are different, and can usually be distinguished by an adult. The
8. Reed (1996) notes some parallels between Gibson and the American pragmatists (without, however, referring to Peirce). On Gibson's sources, also see Costall this volume.
surface on which a "graph" has been executed can be seen underneath the "graph". To Gibson, then, the picture is a surface among other surfaces before becoming a sign. Gibson (1978: 231) observes that, besides conveying the invariants for the layout of the pictured surfaces, the picture must also contain the invariants of the surface that is doing the picturing: those of the sheet of paper, the canvas, etc., as well as those of the frame, the glass, and so on. Although Gibson does not use the term, he clearly describes the picture as a sign, in the strict, Augustinean sense of the word: as a surface which, on being perceived, brings into awareness something besides itself. Gibson never specifies what he means when he claims that surfaces are only seen to stand for something else by (adult) human beings, in contradistinction to animals and infants. If he meant to suggest that surfaces can never be taken to be something other than surfaces by animals and children he was clearly wrong: we know that even doves may react the same way to a picture as to that which is depicted (cf. Sonesson 1989: III.3.1). The difficulty, clearly, consists in seeing, at the same time, both the surface and the thing depicted. We should grant Fonseca the insight that there is some kind of analogy between signs and intentional acts. However, to use the term sign in both cases dangerously suggests that there is no important distinction to be made. Late in life, Peirce realised that all his notions were too narrow: instead of "sign", he reflected, he really ought to talk about "medium" or "mediation" (manuscript quotations given in Parmentier 1985). In the following, we will use the term mediation for this general sense of meaning which Fonseca called sign and at which Peirce sometimes also may be hinting. In some respects, at least, it seems to correspond to Gibson's "affordances", and to Piaget's notion of "connecting significations".
4. On the way to the human Lifeworld
If there is meaning before signs, then even the immediate experience of perception is in some very general sense "mediated". The semiotician A. J. Greimas (1970: 49) once suggested that there could be a cultural science of nature, a semiotics of the natural world - which was concerned, then, with the world that is natural to us, just as a particular language is our "natural language". But Greimas was not the first to conceive of a cultural science of nature. His semiotics of the natural world, together with Husserl's science of the Lifeworld, and "ecological physics" as invented by Gibson are all sciences of normality, of that which is so much taken for granted that it is ordinarily not considered worthy of study (cf. Sonesson 1989, 1994, 1996). It may seem strange to put together ideas and observations made by a philosopher, a psychologist, and a semiotician; yet these proposals are largely the same; indeed, there are indications that both Greimas and Gibson took their cue from Husserl. Greimas, Gibson, and Husserl all felt the need for such a science because they realised that the "natural world", as we experience it, is not identical to the one known to physics but is conceived from the standpoint of human consciousness. Husserl's Lifeworld as well as Gibson's ecological physics, but not Greimas' natural world, take this level to be a privileged version of the world, "the world taken for granted", in Schütz's (1967) phrase, from the standpoint of which other worlds, such as those of the natural sciences, may be invented and observed (cf. Sonesson 1989: 26-29, 30-34, and passim).
4.1. The ecology taken for granted: the Lifeworld
Every particular thing encountered in the Lifeworld is referred to a general type. According to Schütz ([1974] 1932, 1967), other people, apart from family members and close friends, are almost exclusively defined by the type to which they are ascribed, and we expect them to behave accordingly. Closely related to the typifications are the regularities that obtain in the Lifeworld, or, as Husserl says, "the typical way in which things tend to behave". These are the kind of principles, tentatively set up, which are at the foundation of Peircean abductions. Many of the "laws of ecological physics", formulated by Gibson (1982: 217ff), and which are defied by magic, are also such "regularities [that] are implicitly known": that substantial objects tend to persist, that major surfaces are nearly permanent with respect to layout, but that animate objects change as they grow or move; that some objects, like the bud and the pupa, transform, but that no object is converted into an object that we would call entirely different, as a frog into a prince; etc. Some of the presuppositions of these "laws", such as the distinction between "objects that we would call entirely different", are also at the basis of the definition of the sign function (cf. Sonesson 1992a, 2000, 2001).
It has been suggested (notably by Smith and Varzi 1999) that the Lifeworld, in this sense, is simply the niche, in the sense of (non-Gibsonean) ecology, in which the animal known as the human being stakes out his life (cf. Sonesson 2001: 99). The niche, then, in this sense, is the environment as defined by and for the specific animal inhabiting it. In Husserlean language, the niche is subjective-relative - relative to the particular species. The precursor of the niche, understood in this way, is the notion of Umwelt introduced by von Uexküll (1956, 1973), which is one of the key concepts of the field known as biosemiotics (see Emmeche this volume). Uexküll's notion of meaning centres on the environment, the Umwelt, which is differently determined for each organism. As opposed to an objectively described ambient world, the Umwelt is characterised for a given subject, in terms of the features of the world which the subject perceives (Merkwelt) and the features which it impresses on the world (Wirkwelt), which together form a functional circle (Funktionskreis). According to a by now classical example, the tick hangs motionless on a branch until it perceives the smell of butyric acid emitted by the skin glands of a mammal (Merkzeichen), the effect of which is to send a message to its legs to let go of the support (Wirkzeichen). When the tick drops onto the body of the mammal, a new cycle is started, because the tactile cue of hitting the mammal's hair incites the tick to move around in order to find the skin of its host. Finally, a third circle is initiated when the heat of the mammal's skin triggers the boring response, which permits the tick to drink the blood of its host. Together, these different circles consisting of perceptual and operational cue bearers make up the interdependent wholes of the subject, corresponding to the organism, and the Umwelt, which is the world as it is determined for the subject in question. 
Scholars involved with biosemiotics tend to take this model, immensely enlightening as it is in itself, and simply project onto it the sign conception suggested by Peirce. The first difficulty with this approach, of course, resides in finding out the real import of the Peircean sign conception. Since this is in itself an infinite task, any scrutiny of the parallel risks getting bogged down very early on. If we confront the sign conception defined in this chapter with the world of the tick, however, it will be easy to see that the two are entirely distinct. Not only is there no distinction between expression and content to the tick; there is no separation of sign and reality. At least in part, this is also an opposition between the Umwelt and the Peircean sign.
4.2. From Umwelt to Lebenswelt: the thematic field
Pending the invention of biosemiotics, Cassirer (1942: 29ff, 1945: 23ff) was no doubt the first thinker outside of biology to take von Uexküll's ideas seriously. After pointing out that, to human beings, all experience is mediated (a case of Vermittlung), he observed that this is also true of animals, as described by von Uexküll. But he makes no mention of the fact that, to von Uexküll (1956, 1973), the Funktionskreis is a "theory of meaning" (Bedeutungslehre). In fact, he opposes "animal reactions" to "human responses". Cassirer may be wrong in not seeing the similarity between signs and other meanings (though he suggests it in passing using the term 'Vermittlung'), but he is quite right, I submit, in insisting on the difference. Very tentatively, let us suppose that, in the biosemiotic conception, the features of the world observed by the animal correspond to the sign-vehicle or expression (Peirce's "representamen"); the object or referent would then be that which causes these features to be present to the animal; and the Peircean interpretant or content would in turn correspond to the pieces of behaviour which tend to make up the reaction of the animal to the features in question. There is no point getting lost here in Peircean exegesis: if anything, we are faced with a "formal sign", as conceived in the Fonseca tradition. As we are using the terms, we would have some kind of mediation (Cassirer's Vermittlung), but not a sign. As Ziemke and Sharkey (2001: 709) point out, it is hard to find the object of the sign, in the ordinary sense of its referent in the "outside world". 
Indeed, that which is for us, as observers, three cues to the presence of a mammal, the smell of butyric acid, the feel of skin, and the warmth of the blood, do not have to be conceived, in the case of the tick, as one single entity having an existence of its own (a "substance", in Gibson's terms), but may more probably constitute three separate episodes producing each its own sequence of behaviour. In fact, Ziemke and Sharkey go on to quote an early text by von Uexküll, in which he says that "in the nervous system the stimulus itself does not really appear but its place is taken by an entirely different process" (von Uexküll 1909, quoted here from Ziemke and Sharkey 2001, my italics). Uexküll calls this a "sign", but it should be clear that it does not in any way fulfil the requirements of the sign function. Indeed, expression and content are not differentiated, already because they do not appear to the same consciousness. The butyric acid is there to the tick; the mammal is present only to us.
What is lacking here - to the tick - is real Thirdness: the reaction to the primary reaction, that is, the reaction which does not respond to a simple fact (Firstness), but to something which is already a reaction, and thus a relation (Secondness; see Table 1). Without having to enter into the earlier discussion of differentiation, we see that, even from a strictly Peircean point of view, there is no Thirdness for the tick: it does not respond to any relationship, since it is not aware (even in the most liberal sense of the term) of any second term (the mammal) to which the first term (the butyric acid) stands in a relation.

Table 1. The relationship between principles, grounds, and signs, from the point of view of Peirce.

            Firstness            Secondness                        Thirdness
Principle   Iconicity            -                                 -
Ground      Iconic ground        Indexicality = indexical ground   -
Sign        Iconic sign (icon)   Indexical sign (index)            Symbolicity = symbolic ground = symbolic sign (symbol)

In fact, things are even more complicated. In a true sign relation, the mammal is not really the object, in the Peircean sense, for which the butyric acid is the representamen (the expression). Or, to be more precise, it is not the dynamical object. At the very most, it is the immediate object. In Peirce's conception, while the immediate object is that which directly induces the sign process, the dynamical object is something much more comprehensive, which includes all those things which may be known about the same object, although they are not present in the act of inducing. Indeed, the dynamical object is that which corresponds to the potentially infinite series of different interpretants resulting from the same original immediate object. It should be clear that, for the tick and similar beings, there could be no distinction between direct and dynamical object, because there is no room for any further development of the chain of interpretants. In this
sense, Deacon's (1997: 63) idiosyncratic reading of Peirce, according to which only signs such as those found in human language (his "symbols") give rise to chains of interpretants, seems to have some justification - in reality, if not in Peircean theory (cf. Sonesson 2003a, 2006).9 To account for the distinction between the "immediate object" and the "dynamical object", we need the concept of ground.10 In one of his well-known definitions of the sign, a term which he here, as so often, uses to mean the sign-vehicle, Peirce (1931-58, 2: 228) describes it as something which "stands for that object not in all respects, but in reference to a sort of idea, which I have sometimes called the ground of the representation" (my italics; see Table 1). Some commentators have claimed that Peirce is here talking about some properties of the expression, whereas others favour the content. In fact, however, the ground must concern the relation between them. Such an interpretation seems to be borne out by Peirce's claim that the concept of ground is indispensable, "because we cannot comprehend an agreement of two things, except as an agreement in some respect" (1.551). In another passage, Peirce himself identifies ground with an abstraction, exemplifying it with the blackness of two black things (1.293). It therefore seems that the term "ground" must stand for those properties of the two things entering into the sign function by means of which they get connected, i.e. both some properties of the thing serving as expression and some properties of the thing serving as content. 
In the case of the weathercock, for instance, which serves to indicate the direction of the wind, the content-ground merely consists in this direction, to the exclusion of all other properties of the wind, and its expression-ground is only those properties which make it turn in the direction of the wind, not, for instance, the fact of its being made of iron and resembling a cock (the latter is a property by means of which it enters an iconic ground, different from the indexical ground making it signify the wind). If so, the ground is really a principle of relevance, or, as a Saussurean would say, the "form" connecting expression and content: that which must necessarily be present in the expression for it to be related to a particular content rather than another, and vice-versa (cf. Sonesson 1989: III.1, 1995, forthcoming). The butyric acid, the hairiness, and the warmth form the immediate objects of the tick, while the mammal as such is the dynamical object. The
9. The problem, however, is that true indices and icons, as experienced at least by human beings, have as many interpretants as symbols.
10. This was independently noted by Søren Brier (2001).
From the meaning of embodiment to the embodiment of meaning
difference, however, is that there is no way that the tick, unlike human beings, may learn more about the "dynamical object" than that which is given in the immediate one. Meaning here appears as a kind of "filter": it lets through certain aspects of the "real world" which, in its entirety, is unknowable, though less so for human beings than for the tick. The Kantian inspiration of von Uexküll is of course unmistakable. Indeed, in the terms of another thinker with a Kantian inspiration, Bühler (1934), the filter model is based on "abstractive relevance", the neglect of such physical properties which are not endowed with meaning, similar to those properties of the physical sound which vary a lot without the units of meaning (the phoneme, the word, etc.) being changed, which Saussure and Hjelmslev characterised as "substance" in opposition to "form". Returning to modern-day biosemiotics, it can be easily shown that what these authors are involved with has little to do with meaning as sign function, but very much concerns meaning as relevance, organisation, configuration and/or filtering. In their early joint paper, Emmeche and Hoffmeyer (1991: 4), criticising the concept of information in information theory, point out (paraphrasing Bateson) that they are interested in "a difference that makes a difference to somebody". They go on to say that living beings "respond to selected differences in their surroundings" (their italics in both cases). The formulation clearly invokes relevance, and even some kind of filtering device. Later on in the paper, however, when the Peircean sign concept is introduced, the DNA-sequence of the gene is said to be the representamen, the protein its object, and the interpretant the cellular-biochemical network. It is, however, difficult to detect any sign function here.
According to Emmeche and Hoffmeyer, the contribution of Peircean semiotics is to show us that "the field of genetic structures, or a single gene, cannot be seen in isolation from the larger system interpreted" (1991: 34). This certainly suggests meaning in the sense of a whole or a configuration. In a later paper, Emmeche (2002) sets out to show that in the living being function and meaning are the same. This can also be demonstrated, because Emmeche understands meaning in the sense of function: the relation of the part to the whole. But even in this article, there are traces of the filtering concept of meaning: we learn that "the whole operates as a constraint". Saying that cytochrome c means something to the cell is the same as saying that it has a function. It is not just any molecule. We could well synthesise small proteins and artificially introduce them into the cell. They would be
without importance or they would be dysfunctional or, with certain fortuitous strokes of luck, they would actually fulfil some function in the cell. (Emmeche 2002: 19)
This implies that the meaning of the enzyme "is structural" in the sense that "the cell's molecules form a system of dissimilarities (like the elements of language in Saussure)" (Emmeche 2002: 20). This parallel is correct to the extent that there are relevancies in cells, in particular if these relevancies result from a system of oppositions, like those of Saussurean language. From this point of view, everything that is in the cells is also in language. But the opposite cannot be true. There is, of course, no sign function as we have defined it. It is useful to distinguish relevance from filtering, although they have something in common: picking up a limited set of features from the totality of the environment. However, relevance, strictly speaking, does not exclude anything: it merely places some portions of the environment in the background, ready to serve for other purposes. Thus, in the case of language, properties that are not relevant for determining the meaning of words and sentences still may serve to inform about the dialect, or even identify the person speaking (Hjelmslev's "connotational language"; cf. Sonesson 1989). Indeed, relevance lets the difference between "immediate object" and "dynamical object" subsist, in the vague sense which they retain in the "scholastic" interpretation of Peirce (see above): that which is directly given, in contrast with that which is potentially given for further exploration. Thus, Bühler (1934) added to the principle of "abstractive relevance" that of "apperceptive supplementation", which explains the projection of properties not physically present in perception to the meaningful experience. In contrast, filtering simply crosses out that which is not let through the filtering device.11 The difference between relevance and filtering no doubt has something to do with the capacity to be aware of the borders of one's Umwelt.
It requires some kind of "metacognition": to the tick, to paraphrase Wittgenstein, the limits of its language are the limits of its world, but not so (in spite of Wittgenstein) to human beings. Or rather, the limits of any particular Umwelt are not the limits of our Lebenswelt. Schütz (1967) suggested there are really "multiple provinces of meaning", such as dreaming, religious experience, the art world, the play world of the child, and that esoteric practice we know as science. The peculiarity of the Lifeworld, in this context, is that it offers access to the other worlds, and is accessible to all of them. In this sense, the human Lebenswelt is different from the Umwelt of other animals. Or at least it has the capacity for being different. In Peircean terms, human beings may reach for the dynamical objects beyond the immediate ones. They may try to transform Nature into Culture. However, as Wittgenstein (1971) observed, even if we had a common language game, we would perhaps not have so much to discuss with a lion. The lion, presumably, does not try to go beyond his own Umwelt to grasp the properties of the objects that lie behind it. There is, so to speak, no "dynamical object" beyond the immediate one to him. If the Umwelt is an organised network of filters and/or relevancies, as I suggested above, it seems that maturing in the child consists in breaking out of one Umwelt and going on to another, broader one, until reaching the human Lifeworld. Between each Umwelt and the next, which encompasses it, there is, to borrow a famous expression from Vygotsky (1978), a "zone of proximal development". In this sense, ontogenesis itself forces us to go through a series of "finite provinces of meaning", in the sense of Schütz. A temporal dimension is thus added.
11. It can now be seen that Bühler's principles of abstractive relevance and apperceptive supplementation go much further than the sign. They have been found in the studies of the systems of cooking and clothing realised by Lévi-Strauss, Barthes, and others (as demonstrated by Sonesson 1989).
It might therefore be said that what most perspicuously differentiates the tick from the human being (without prejudging, for the moment, the question of where the exact border is to be placed) is the structure of the field of consciousness: in Gurwitsch's (1957, 1964, 1985) terms, human consciousness is made up of a theme which is the centre of attention, a thematic field around it consisting of items which are connected to the present theme by means of intrinsic links permitting it to be transformed into a theme in its own right, as well as other items present "at the margin" at the same time, without having any other than temporal relations to the theme and its field.12 The tick of course has access neither to the thematic field nor to the margin. In a way, this is simply another way of saying that the tick cannot reach beyond the immediate object. But Gurwitsch's analysis breaks up that of Peirce: it implies that, not only is there no way for the tick to "go on from here" (the Husserlean "etcetera principle"), its experience of the here and now is also very limited. In other words, there is no real "immediate object" to the tick, not only because it is not opposed to a future more extensive dynamical object, but also because even in the here and now, what is immediately experienced does not appear as a thematic structuring, or perspective, on such a dynamical object. I have suggested, then, that an important difference between human beings and (some) other animals consists in the thematic structure of consciousness, or, in other words, the function of attention. Some similar differences in the structure of attention have been discussed in very different quarters lately, separating human beings and apes, as well as children of different ages (cf. Tomasello 1999; Tomasello et al. 2005; Zlatev 2002, 2003, this volume). A discussion of such a progression in the development of attention presupposes an analysis of our awareness of the other's body and mind, which would take us out of the limits of this chapter. Something will be said, nevertheless, on the attention to one's own body in the next section. Before that, however, it will be necessary to take stock. I suggested above that there were really two differences between the way in which ticks and other lower animals have access to meaning and the human way. The first of these is the thematic structure: there is no immediate object, because there is no dynamical object in relation to which it may be seen as an adumbration. But there is more to it: there is no representamen (expression), either, because no distinction can be made between such a representamen and the object, either immediate or dynamic.
12. Gurwitsch is right, I believe, in suggesting that this thematic structure translates to language (and no doubt also to other semiotic resources), as most clearly illustrated in the transposition of the functioning of pronouns from the perceptual world to discourse (cf. Gurwitsch 1985); it is unfortunate, however, that he fails to attend to the difference in structuring occasioned by the sign function.
Taking into account the Fonseca tradition, we earlier noted that one kind of mediation (for which I reserve the term sign) consists of a signifier (expression) which has to be perceived as such in order to usher into the perception of the corresponding signified (content); and another one (which, following the Brentano-Husserl tradition, can be called intentionality) which may consist in a signifier which is not ordinarily perceived as such but still somehow serves to mediate the perception of a signified. It will be remembered that, according to von Uexküll, "in the nervous system the stimulus itself does not really appear but its place is taken by an entirely different process" (my italics). If so, this is not even mediation in the broad sense of the term. As Husserl and Gibson have insisted, we are alternatively confronted with different views of the cube or the cat, etc., but what we really see is all the time the same invariant cube or cat. The tick smells the same invariant butyric acid, period. In the world of the tick,
there are no signs, as distinct from the world itself. Differentiation has not even started. In other words, signification has not acquired a "body" of its own.
5. The body in the Lifeworld
In the previous section I suggested that in identifying the functional cycle with the Peircean concept of sign, biosemiotics conflates meaning in the most general sense of organisation and relevance with the more specific sign function. Inversely, contemporary embodiment theorists such as Lakoff and Johnson reduce the sign to the more general model of the pick-up of features from the environment. If Lakoff and Johnson engage in one form of reductionism, biosemiotics seems to accomplish its inversion. The result, however, is the same: distinctions which are important, both theoretically and from the point of view of phylogeny and ontogeny, can no longer be maintained. The hybrid term "image schema" has many antecedents, at least as to its latter part. Before the advent of Lakoff and Johnson's work, the most familiar usage was no doubt that of Piaget (1970: 41): as a kind of "abstraction from action" taking place at different stages through child development. Schütz ([1974] 1932), however, used the term to refer to all kinds of fossilized (or, in his words, "sedimented") sequences of action, which could be used to make sense of new actions within the common sense world. The idea of a spatial, if not specifically bodily, projection is important to the notions of schema in the psychology and sociology of Janet (1928), Halbwachs (1925, 1950) and Bartlett (1932; cf. Sonesson 1988). In all these conceptions, schemas are the result of earlier actions. This seems to accord with the definition by Johnson (1987, 2005) of image schemas as being abstractions from the interaction of organism and environment. If so, as I will suggest below, image schemas should require some kind of separation of the human Umwelt into body and world.
5.1. The body as the axis of the world
It is not surprising that the figure of the body looms large at the horizon of consciousness. After all, the body is our condition of access to all possible experience of the world. It is at the origin of one fundamental characteristic
of the Husserlean Lifeworld: that everything in it is given in a subjective-relative manner. This means that the access in question is not a merely physical fact: it amounts to the insertion of the mind in the meaningful whole, which is the common sense world. That is, the body appears (also) as meaning. The same observations apply to language. The body (though often presented as a faceless Ego) is at the centre of language, in the I-here-now. Many pronouns and adverbs serve as marks of what Benveniste (1966) has called the "taking into charge" of the language system by the subject: these marks can only be understood with reference to the position in space and time of the person doing the speaking. Just like the perceptual world, language is adumbrated from the position of the subject, whose insertion in the world can only be accomplished by the body. Proxemics is concerned with the subject as a body occupying the central position of space. According to Hall (1966), all cultures define their public, social, personal and intimate spheres, but the distances that characterise each one of these spaces are different in different cultures. When subjects coming from different cultures meet, their respective spaces tend to clash. According to one of Hall's classical examples, a person from an Arab culture, who positions himself within what is from his point of view the personal sphere, the distance from which it is comfortable to have a chat, inadvertently enters the intimate sphere of a Westerner, the sphere in which it is proper to "fight or make love". From a proxemic point of view, the subject could thus be seen as a topological construction: a series of concentric circles demarcating the public, social, personal and intimate spaces (in relation to another subject), within which is found the bodily envelope, all of which are defined by the fact that they may be penetrated and thus produce an effect of meaning (see Figure 1 and Sonesson 1993b; 2001).
This is to say that these "protective shells", as Hall calls them, are more or less permeable. In topological terms, they possess the property of being open or closed. More exactly, in mereotopological terms, some parts of them have the property of being open and others that of being closed. They produce a meaning when their borders are overstepped. This is of course the case with the Arab conversationalist stepping into the sphere of fighting and love of the Westerner. The case of the bodily envelope is however more easily illustrated: it possesses a series of openings (mouth, nostrils, etc.), but it may also be penetrated elsewhere, with more serious consequences, such as injury.
Figure 1. The body envelope and its surrounding proxemic spheres (cf. Hall 1966, Spiegel and Machotka 1974). The arrows illustrate entries through designated openings and through the closed borders, respectively.
The final protective shell of the body, the skin, did not form part of Hall's original model. It was added later, by Spiegel and Machotka (1974), who also pointed out the difference between orifices permitting penetration, and other places where entry can only be forced. In this respect, their contribution connected to another tradition, the Freudian one, whose model of the body is reminiscent of some of the image schemas suggested by Lakoff and Johnson, if we generalise the sexual interpretation to a more general bodily practice (cf. Sonesson 1989).
5.2. The bucket theory of the body
The function of the image schemas, as conceived by some cognitive linguists (cf. Hampe 2005), seems to be to project our experience of the body (even if experience per se sometimes seems to be dispensed with) to the interpretation of the world, thus accounting for pervasive linguistic phenomena such as metaphor, metonymy and polysemy. In the same vein, Gardner (1970: 360ff), elaborating on an idea of the psychoanalyst Erik
Erikson, claims that certain holistic properties are given a particular import from being first experienced in the relationship between one's own body and the field of objects outside the body, sometimes in relation to the keeping of portions of the environment inside the body, and sometimes in relation to the release of what was once part of the body. Since each bodily zone has a characteristic mode, and since each mode possesses several vectorial properties, the modal/vectorial properties can be seen to form a system: to the oral-sensory zone (mouth and tongue), there corresponds passive and active incorporation; in relation to the anal-secretory zone, retention is experienced; in respect to the sphincter, there is expulsion; and, finally, to the genital zone (penis/vagina), there corresponds intrusion and inclusion. The result of this Freudian parti pris is not only an insistence on the primacy of sexual interpretations, but the neglect of some essential bodily relationships. Just as, according to Piaget, conceptual schemas are abstracted from actions through the many stages of intellectual development, the modal/vectorial properties, as Gardner presents them, may also be conceived to take their origin in the actions of one's own body, but rather than being abstracted, they are immediately seen as global characteristics, and while they may be transposed to other objects than the body, and other relationships than that of the body to the world, as is the case in "symbols", they somehow remain bound up with the body in all their further applications as being the deeper source of their sense (cf. Sonesson 1989). However, Arnheim (1966: 215ff) is right in arguing against Freudian pansexualism that a piece of pottery and a womb have the common class meaning of being containers, rather than the first signifying the second, and that the predominance of the sexual interpretation is due to cultural factors. 
It seems more probable that bodily experience of a more general kind, including that of enclosing an apple in one's hand and sticking the hand into a hollow in the ground, is the primary basis of modal/vectorial properties. When Arnheim suggests that, going up the tree of Porphyry, both the womb and the piece of pottery will be found to be containers, he is certainly not making the kind of analysis that Porphyry or his followers (as for instance Eco 1984: 46ff) would accept, since the womb does not meet the necessary and sufficient conditions for being a container ordinarily conceived; nor is it referable to the container prototype in a strict sense. Rather it is a member of the extended class of containers acceptable in "symbolism". On this interpretation, of course, the womb would be a deviant piece of pottery, rather than the reverse.
In making this kind of argument (cf. Sonesson 1989: 1.4.5), I found myself in a terrain very close to Cognitive Linguistics without knowing it. The first step was to identify the modal-vectorial properties as being topological. In the Piagetean conception, the geometry of the child's first space is topological, that is, it contains the kind of relations that would be preserved in a figure drawn on a piece of rubber (cf. Vurpillot 1967: 104ff). Properties of this kind are neighbourhoodness or proximity, separation, succession, inclusion or interiority/exteriority, and continuity. If we now merely introduce a distinction between two instances, the ego and the world, or the other, it will be possible to derive all of Gardner's "modes" from the topological property of inclusion, to which another topological property, that of succession, is applied. Clearly, intrusion and inclusion are opposites, as are incorporation and expulsion, but retention seems to call for some corresponding term: this must be resistance (postulated in Sonesson 1989, in complete ignorance of force-dynamical theories, for which cf. Talmy 1988; Gärdenfors this volume), of which there are two variants, the resistance of the world to us, and of us to the world. Actually, there is nothing very new about resistance as a fundamental concept: it has been the basis of the definition of reality in philosophical epistemology, from Berkeley over the ideologues Destutt de Tracy and Maine de Biran to Sartre. Indeed, "this sense of being acted upon, which is our sense of the reality of things" is the definition of Secondness in the conception of Peirce (1998: 4):
A door is slightly ajar. You try to open it. Something prevents. You put your shoulder against it, and experience a sense of effort and a sense of resistance. These are not two forms of consciousness; they are two aspects of one two-sided consciousness.
It is inconceivable that there should be any effort without resistance, or any resistance without a contrary effort. This double-sided consciousness is Secondness. (Peirce 1998: 268)
Since Peirce goes on immediately to note that "all consciousness, all being awake, consists in a sense of reactions between ego and non-ego", it is curious that he should not recognise the difference between the case in which the ego takes the active part, and the case in which the non-ego has the initiative and the ego is reduced to resistance. However, it is clear that from a Peircean point of view, the Freudian interpretations of incorporation, retention, expulsion, and so on, are only special cases of more general bodily processes. For they no doubt continue to be bodily based: it is the body of the ego which first enters into a clash with the non-ego.
These operations obviously serve more humble purposes than proving the existence of the outside world. Just like such image schemas as PATH and CONTAINER, they are generalizations of "a recurrent pattern, shape, and regularity in, or of, [...] ongoing ordering activities" as actions, perceptions, and conceptions (Johnson 1987: 29, original italics). In spite of what is suggested by this definition, Lakoff and Johnson often do not seem to have any use for the body as an experienced meaning, as opposed to the way it appears to the biological sciences. What is at issue is the exact role played by the body. Whether the actions which sediment to form "image schemas" are accomplished in relation to the inside of the body, or to something outside of it, a minimum requirement for their schematisation must be the existence of the bodily envelope as a relevant level of analysis. It is difficult to understand how such schemas may even come into existence if human beings are as tightly embedded in their Umwelt as the tick, entirely merged with their environment. Yet, in introducing the theory of image schemas, this is precisely the view propounded by Johnson (2005). In defining image schemas, Johnson (1987, 2005, Johnson and Rohrer this volume) uses expressions that are clearly reminiscent of Piaget, although the latter is never quoted. However, in order to acquire sensorimotor schemas, let alone "symbols" (Piaget 1945) or "mimetic schemas" (Zlatev 2005), some sense of a distinction between the acting subject and the world resisting him or her is clearly required. In order to arrive at schemas of a higher order (including the sign function), the subject must somehow take cognizance of the more basic schemas themselves.
Johnson and Rohrer (this volume), who do not believe in "the supposedly unique ability of humans to engage in symbolic representation", consider it only an illusion resulting from our seeing the interaction of organism and environment "from our standpoint as observers and theorists". But this leaves unexplained the fact that we are ever able to reach such a standpoint. No doubt there must be a number of progressive stages leading from sensorimotor experience to our capacity for engaging in theory.
6. The body as portable memory
It has been suggested by Donald (1991, 2001) that there are several discontinuities in hominid evolution, all involving the acquisition of a distinct kind of memory, considered as a strategy for representing knowledge.
Although Donald's model concerns phylogeny, parallels in ontogeny are readily suggested (cf. Zlatev 2002, 2003, 2005, this volume; Ikegami and Zlatev this volume). Without necessarily taking Donald's model at face value, I am going to make use of it here since it permits a productive integration with semiotic theory. Indeed, the Tartu school characterizes signs as memory devices and defines culture as collective memory. According to Lotman et al. (1975), material objects and information are similar to each other, and at the same time differ from other phenomena, in two ways: they can be accumulated, whereas, for example, sleep and breathing cannot be accumulated, and they are not absorbed completely into the organism, unlike food, but remain separate objects after the reception. The interesting thing, however, not discussed by the Tartu school, is how material artefacts and signs come to work together. According to Donald's conception, many mammals are already capable of episodic memory, which amounts to the representation of events in terms of their time and place of occurrence. The first transition, which antedates language and remains intact at its loss (and which Donald identifies with Homo erectus), brings about mimetic memory, which is required for such abilities as the construction of tools, miming, imitation, coordinated hunting, a more complex social structure and simple rituals. This stage thus in part seems to correspond to what we have called the attainment of the sign function (though Donald only notes this obliquely, in talking about the use of intentional systems of communication and the distinction of the referent). Yet, it should be noted already at this point that while all abilities subsumed in this stage seem to depend on iconic relations (perceptions of similarity), only some of them are signs, because the others do not involve any asymmetric relation between an expression and the content for which it stands.
Only the second transition, occurring with Homo sapiens, brings about language with its semantic memory, that is, a repertory of units that can be combined. This kind of memory permits the creation of narratives, that is, mythologies, and thus a completely new way of representing reality. Interestingly, however, Donald does not think semiotic development stops there, although further stages are no longer based on any biological changes. However, the third transition obviously would not have been possible without the attainment of the three earlier stages. What Donald calls theoretical culture presupposes the existence of external memory, that is, devices permitting the conservation and communication of knowledge independently of face-to-face interaction between human beings. The first
appearance of theoretical culture coincides with the invention of drawing. For the first time, knowledge may be stored externally to the organism. The bias having been shifted to the visual modality, language is next transferred to writing. It is this possibility of conserving information externally to the organism that later gives rise to science (cf. Figure 2).
Figure 2. Donald's model of evolution related to the notion of sign function. (Diagram labels: Sign function; Toolmaking, Imitation, Gesture, Language; Pictures, Writing, Theory; iconicity, indexicality, symbolicity.)
There are two remarkable features in Donald's analysis. The stage preceding the attainment of the language capacity requires memory to be located in the subject's own body. But, clearly, it can only function as memory to the extent that it is somehow separable from the body as such. The movement of the other must be seen as distinct from the body of the other in its specificity, so that it can be repeated by the self. This supposes a distinction between token and type (that is, relevance) preceding that of the sign function. The stage following upon language supposes the sign to acquire a "body" of its own, that is, the ability to persist independently of human beings. Language only seems to require the presence of at least two human beings to exist: they somehow maintain it between themselves. But it is not enough for two persons to know about a picture for it to exist: there must be some kind of organism-independent artefact on which it is inscribed.
The picture must be divorced from the bodies (and minds) of those making use of it. 13 Writing is of course, by definition, the transposition of language to independent artefacts. The case of "theory" may be less obvious: why should not two persons be able to entertain a theory between them? As Husserl (1962a) noted well before Donald, complex sign systems, such as mathematics and logic, only seem to function as such when given an existence independent of human organisms. In the case of pictures, Ivins (1953) has observed that it is their reproducibility (as in Floras, for instance) that makes them into scientific instruments. In their capacity of being permanent records, pictures are not, as art historians are wont to say, unavoidably unique, but, on the contrary, are destined for reproduction. Indeed, they permit repeated acts of perception, as do no earlier memory records. Students of prehistoric pictures such as White (2000) often suggest that creators of such works must have been capable of language. In fact, not much can be concluded on the basis of the depictions having come down to us: even though pictures, by their nature, must have been made on material which conserves the markings on the surface, they might at first have been created on surfaces (such as sand) which only preserve them for a short time. And it is not easy to establish any clear-cut relation between linguistic capacity and the sophistication of the depictions (whatever that is). There are, however, more fundamental reasons for supposing pictures to be later in phylogeny than language: they suppose a record which is independent of the human body; and they require us to see a similarity within an over-arching dissimilarity. Posner (1989) distinguishes two types of artefacts: the transitory ones (as the sound of a woman's high heeled shoes against the pavement) and enduring ones (as the prints that the woman's shoes may leave in clay, in particular if the latter is later dried). 
The transitory artefacts, in this sense, also have a material aspect, just like the lasting ones; they only have the particularity of developing in time, which is why they cannot be accumulated without first being converted. Strictly speaking, the sound sequence produced by high heels against the pavement, and other transitory artefacts, can of course be accumulated (as opposed to being converted into an enduring artefact, which is the case of the sound tape), in the form of the
13. This is of course what is known, mainly in Marxist literature, as the process of reification. As shown by Cassirer (1942: 113ff.), this process, far from being only a "tragedy of culture", is the prerequisite for human culture.
Göran Sonesson
(typical) leg movements producing this sound, that is, as a mimetic record, accumulated in the body, but still distinct from it, since the movements can be learnt and imitated, and even intentionally produced as signs of (traditional) femininity. Posner's example of an enduring artefact is interesting in another way: the cast of prints left by the woman's high heels is of course an organism-independent record, just as the marks of a Roman soldier's sandals found in prehistoric caves, and the hand-prints on cave walls. Another case in point may very well be the so-called Berekhat Ram figure (250,000-280,000 BP), which, if it is not the likeness of a woman, as has been claimed with very little justification, could be the result of abrasion produced by regular movements indicating the intervention of a human agent (that is, "anthropogenic" movements). This suggests that the first organism-independent records may be indexical, rather than iconic, in character. However, even if objects like these were independent objects already in prehistory, there is nothing to prove they were perceived as signs, that is, as expressions differentiated from contents, before pictures were so perceived. Episodic memory, in Donald's sense (which should not be confused with earlier uses of the term), is most clearly "disembodied" memory: it only goes as far as the attention span does. It may refer to a bodily act, such as going in or out of a container-type object, but it is unable to generalise this movement beyond a particular moment and place, and thus it does not give rise to any kind of independent embodiment (cf. Table 2).
Mimetic memory still accumulates in the subject's own body, but it only becomes such to the extent that what is recorded in the body also exists elsewhere, in at least one other body (or perhaps, in some cases, in other moving artefacts), which supposes generalisation or, more exactly, typification: the creation of a type referring to different tokens instantiated in different bodies. As tokens, then, they are in the body; as types they are shared by different uses. Typification, in this sense, does not require the sign function, but is no doubt a prerequisite for it: indeed, it is during this stage, most likely, that the sign function emerges. Mythic memory (which I would prefer to call linguistic memory or perhaps, as Donald sometimes does, semantic memory) is different again: it has a separate existence, but, like some kind of real-world ectoplasm, it requires the collaborative effort of at least two consciousnesses (which no doubt have to be embodied) for this existence to be sustained. Transitory artefacts, such as spoken language or (as Posner would have it) the sound of high-heeled shoes on the pavement, acquire a body only to the extent that a
sender and a receiver agree roughly on what they are. Only theoretic memory has a distinct "body" of its own: it subsists independently of the presence of any embodied consciousness, because it is itself embodied. Of course, without anybody able to perceive it, organism-independent records are not of any use. Without any human beings present, they are really worse off than the famous acorn falling from a tree without anybody around to hear its sound.
Table 2. Donald's memory types analysed in relation to the nature of accumulation (in the sense of Lotman et al. 1975)
Type of memory | Type of accumulation | Type of embodiment
Episodic | Attention span (event in time/space) | -
Mimetic | Action sequence co-owned by Ego and Alter | Own body
Mythic | Transient artefact co-produced by Ego and Alter | In the interaction between Ego and Alter
Theoretic | Enduring artefact co-externalised by Ego and Alter | External in relation to Ego and Alter
7. Conclusions
In this chapter I have tried to relate different notions of embodiment stemming from the phenomenological tradition, and contemporary conceptions of embodiment, taking their origin in Cognitive Linguistics, biology, and cognitive science. My aim has been to show that the various forms of embodiment in both traditions are very different, but that once they are properly analysed, they may be connected with each other, and placed on something like an evolutionary scale similar to the one proposed by Donald
(1991). Indeed, the whole point of making these distinctions has been to show the complexity of the "ladder" from (non-human) animals to human beings, a ladder that requires a series of very different steps, only one of which is the capacity for language. Another goal of the essay has been to suggest the way in which the sign function, the general faculty for conceiving signs, emerges out of one kind of embodiment and constitutes a requirement for attaining another one. In the process, I have suggested that we must distinguish meaning in a very general sense, akin to organisation and/or selection, from the sign function, which requires the peculiar property of differentiation. I have claimed that this distinction is observed neither in biosemiotics nor in some parts of Cognitive Linguistics. Moreover, I have argued that what cognitive linguists call "image schemas" are basically a kind of bodily meaning, resulting from the position of the human body at the centre of the common sense world, known in phenomenology as the Lifeworld.
Acknowledgements
The author wishes to thank Jordan Zlatev and Tom Ziemke for their detailed comments on an earlier draft of this chapter.
References
Arnheim, Rudolf
1966 Towards a Psychology of Art. London: Faber and Faber.
Bartlett, Frederic C.
1932 Remembering. A Study in Experimental and Social Psychology. Cambridge: Cambridge University Press. Reprint 1967.
Benveniste, Émile
1966 Problèmes de linguistique générale. Paris: Gallimard.
Brentano, Franz
1885 Psychologie vom empirischen Standpunkt. Leipzig: Meiner. Reprint 1924.
Brier, Søren
2001 Cybersemiotics and Umweltlehre. On the cybersemiotic integration of Umweltlehre, ethology, autopoiesis theory, second order cybernetics and Peircean biosemiotics. Semiotica. Special issue on Jakob von Uexküll, 134 (1/4): 779-814.
Bühler, Karl
1934 Sprachtheorie. Fischer, Frankfurt/M: Ullstein. Reprint 1978.
Cassirer, Ernst
1942 Zur Logik der Kulturwissenschaften. Göteborg: Elanders.
1945 An Essay on Man. New Haven: Yale University Press.
1957 The Philosophy of Symbolic Forms. III. The Phenomenology of Knowledge. Translated by Ralph Manheim. New Haven: Yale University Press.
Costall, Alan
this vol. Bringing the body back to life: James Gibson's ecology of embodied agency.
Deacon, Terry
1997 The Symbolic Species: The Co-Evolution of Language and the Brain. New York: Norton.
2003 Universal grammar and semiotic constraints. In: Christiansen, Morten H. and Simon Kirby (eds.), Language evolution, 111-139. Oxford: Oxford University Press.
Deely, John
1982 Introducing Semiotic. Its History and Doctrine. Bloomington: Indiana University Press.
1994 New Beginnings. Early Modern Philosophy and Postmodern Thought. Toronto: University of Toronto Press.
Dennett, Daniel
1991 Consciousness Explained. Boston: Little, Brown, and Co.
Donald, Merlin
1991 Origins of the Modern Mind. Three Stages in the Evolution of Culture and Cognition. Cambridge, Mass.: Harvard University Press.
2001 A Mind So Rare. The Evolution of Human Consciousness. New York: Norton.
Eco, Umberto
1982 Semiotics and the Philosophy of Language. Bloomington: Indiana University Press.
Edelman, Gerald
1992 Bright Air, Brilliant Fire. On the Matter of the Mind. London: Penguin Books.
Edelman, Gerald and Giulio Tononi
2000 A Universe of Consciousness. New York: Basic Books.
Emmeche, Claus
2002 The chicken and the Orphean egg. On the function of meaning and the meaning of function. Sign Systems Studies 30 (1): 15-32.
this vol. On the biosemiotics of embodiment and our human cyborg nature.
Emmeche, Claus and Jesper Hoffmeyer
1991 From language to nature - the semiotic metaphor in biology. Semiotica 84 (1/2): 1-42.
Gallagher, Shaun
this vol. Phenomenological and experimental contributions to understanding embodied experience.
Gärdenfors, Peter
this vol. Representing actions and functional properties in embodied conceptual spaces.
Gardner, Howard
1970 From mode to symbol. The British Journal of Aesthetics 10 (3): 357-375.
Gibson, James
1978 The ecological approach to visual perception of pictures. Leonardo 4 (2): 227-235.
1980 A prefatory essay on the perception of surfaces versus the perception of markings on a surface. In: Margaret Hagen (ed.), The Perception of Pictures. Volume I: Alberti's Window, xi-xvii. New York: Academic Press.
1982 Reasons for Realism. Selected Essays of James J. Gibson. Edward Reed and Rebecca Jones (eds.). Hillsdale, New Jersey: Lawrence Erlbaum.
Greimas, Algirdas J.
1970 Du sens. Paris: Seuil.
Gurwitsch, Aron
1957 Théorie du champ de la conscience. Bruges: Desclée de Brouwer.
1964 The Field of Consciousness. Pittsburgh: Duquesne University Press.
1966 Studies in Phenomenology and Psychology. Evanston: Northwestern University Press.
1974 Phenomenology and the Theory of Science. Evanston: Northwestern University Press.
1979 Human Encounters in the Social World. Pittsburgh: Duquesne University Press.
1985 Marginal Consciousness. Athens, Ohio: Ohio University Press.
Halbwachs, Maurice
1925 Les cadres sociaux de la mémoire. Paris: PUF. Reprint 1952.
1950 La mémoire collective. Paris: PUF. Reprint 1968.
Hall, Edward
1966 The Hidden Dimension. London: Bodley Head Ltd.
Hampe, Beate (ed.)
2005 From Perception to Meaning: Image Schemas in Cognitive Linguistics. Berlin: Mouton de Gruyter.
Hoffman, Donald
1998 Visual Intelligence. How We Create What We See. New York: Norton.
Hockett, Charles
1977 The View from Language. Athens: University of Georgia Press.
Husserl, Edmund
1913 Logische Untersuchungen. Tübingen: Niemeyer. Reprint 1968.
1939 Erfahrung und Urteil. Prag: Academia Verlagsbuchhandlung.
1962a Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie. Husserliana VI. The Hague: Nijhoff.
1962b Phänomenologische Psychologie. Husserliana IX. The Hague: Nijhoff.
1973a Cartesianische Meditationen. Husserliana I. The Hague: Nijhoff.
1973b Ding und Raum. Husserliana XVI. The Hague: Nijhoff.
1976 Ideen zu einer reinen Phänomenologie und phänomenologische Psychologie. Husserliana III. The Hague: Nijhoff.
1980 Phantasie, Bildbewusstsein, Erinnerung. Husserliana XXIII. The Hague: Nijhoff.
Ikegami, Takashi and Jordan Zlatev
this vol. From pre-representational cognition to language.
Innis, Harold
1950 Empire and Communication. Toronto: University of Toronto Press. Reprint 1972.
Ivins, William M.
1953 Prints and Visual Communication. Cambridge, Mass.: Harvard University Press.
Jakobson, Roman
1975 Coup d'œil sur le développement de la sémiotique. Bloomington: Indiana University Press.
James, William
1890 The Principles of Psychology. New York: Dover Publications.
Janet, Pierre
1928 L'évolution de la mémoire et la notion du temps. Paris: Chahine.
Johnson, Mark
1987 The Body in the Mind. Chicago and London: University of Chicago Press.
2005 The philosophical significance of image schemas. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 15-33. Berlin: Mouton de Gruyter.
Johnson, Mark and Tim Rohrer
this vol. We are live creatures: Embodiment, American Pragmatism, and the cognitive organism.
Küng, Guido
1973 Husserl on pictures and intentional objects. Review of Metaphysics (June) XXXVI (4): 670-680.
Lakoff, George and Mark Johnson
1999 Philosophy in the Flesh. New York: Basic Books.
Lotman, Jurij M., Boris A. Uspenskij, Vjaceslav V. Ivanov, V. N. Toporov and A. M. Pjatigorski
1975 Thesis on the Semiotic Study of Culture. Lisse: The Peter de Ridder Press.
Merleau-Ponty, Maurice
1942 La structure du comportement. Paris: PUF.
1945 La phénoménologie de la perception. Paris: PUF.
Parmentier, Richard J.
1985 Signs' place in medias res: Peirce's concept of semiotic mediation. In: Elizabeth Mertz and Richard J. Parmentier (eds.), Semiotic Mediation. Sociocultural and Psychological Perspectives, 23-48. Orlando: Academic Press.
Piaget, Jean
1945 La formation du symbole chez l'enfant. Neuchâtel: Delachaux and Niestlé. Reprint 1967.
1967 La psychologie de l'intelligence. Paris: Armand Colin.
1970 Épistémologie des sciences de l'homme. Paris: Gallimard.
Peirce, Charles Sanders
1931-58 Collected Papers I-VIII. C. Hartshorne, P. Weiss and A. Burks (eds.). Cambridge, MA: Harvard University Press.
1998 The Essential Peirce, Volume II. Ed. by the Peirce Edition Project. Bloomington and Indianapolis: Indiana University Press.
Posner, Roland
1989 What is culture? Toward a semiotic explication of anthropological concepts. In: Walter Koch (ed.), The Nature of Culture, 240-295. Bochum: Brockmeyer.
Prieto, Luis J.
1975a Pertinence et pratique. Paris: Minuit.
1975b Essai de linguistique et sémiologie générales. Genève: Droz.
Reed, Edward S.
1996 The Necessity of Experience. New Haven: Yale University Press.
Schütz, Alfred
1932 Der sinnhafte Aufbau der sozialen Welt. Vienna: Springer. Reprint 1974.
1967 Collected Papers I: The Problem of Social Reality. The Hague: Nijhoff.
Smith, Barry and Achille Varzi
1999 The niche. Noûs 33 (2): 198-222.
Sonesson, Göran
1988 Methods and models in pictorial semiotics. Semiotics Project: Lund University.
1989 Pictorial Concepts. Inquiries into the Semiotic Heritage and its Relevance for the Analysis of the Visual World. Lund: Aris/Lund University Press.
1992a Bildbetydelser. Inledning till bildsemiotiken som vetenskap. Lund: Studentlitteratur.
1992b The semiotic function and the genesis of pictorial meaning. In: Eero Tarasti (ed.), Center/Periphery in representations and institutions. Proceedings from the Conference of The International Semiotics Institute, Imatra, Finland, July 16-21, 1990, 211-156. Imatra: Acta Semiotica Fennica.
1993a Pictorial semiotics, Gestalt psychology, and the ecology of perception. Semiotica 99 (3-4): 319-399.
1993b The multiple bodies of man. Project for a semiotics of the body. Degrés 74: d-d42.
1994 Prolegomena to the semiotic analysis of prehistoric visual displays. Semiotica 100 (3): 267-332.
1995 On pictorality. The impact of the perceptual model in the development of visual semiotics. In: Thomas Sebeok and Jean Umiker-Sebeok (eds.), The Semiotic Web 1992/93: Advances in Visual Semiotics, 67-108. Berlin and New York: Mouton de Gruyter.
1996 An essay concerning images. From rhetoric to semiotics by way of ecological physics. Semiotica 109 (1/2): 41-140.
1998a /entries/. In: Paul Bouissac (ed.), Encyclopaedia of Semiotics. New York and London: Oxford University Press.
1998b That there are many kinds of pictorial signs. VISIO 3 (1): 33-54.
2000 Iconicity in the ecology of semiosis. In: Troels Degn Johansson, Martin Skov and Berit Brogaard (eds.), Iconicity - A Fundamental Problem in Semiotics, 59-80. Aarhus: NSU Press.
2001 From Semiosis to Ecology. On the theory of iconicity and its consequences for the ontology of the Lifeworld. In: Andrew W. Quinn (ed.), Cultural Cognition and Space Cognition/Cognition culturelle et cognition spatiale. VISIO 6 (2/3): 85-110.
2003a The Symbolic Species revisited. Considerations on the semiotic turn in cognitive science and biology, SGBWP3. Lund: Lund University.
2003b Why the mirror is a sign - and why the television picture is no mirror. Two episodes in the critique of the iconicity critique. In S. European Journal for Semiotic Studies 15 (2-4): 217-232.
2006 The meaning of meaning in biology and cognitive science. A semiotic reconstruction. In Trudy po znakovym sistemam - Sign Systems Studies 34.1, 135-214.
forthc. From iconicity to pictorality. A view from ecological semiotics. To appear in VISIO: thematic issue: Iconicity revisited/Iconicité revisitée, Sonesson, Göran (ed.).
Spiegel, John and Pavel Machotka
1974 Messages of the Body. New York: The Free Press.
Talmy, Leonard
1988 Force dynamics in language and cognition. Cognitive Science 12: 49-100.
Tomasello, Michael
1999 The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
Tomasello, Michael, Malinda Carpenter, Joseph Call, Tanya Behne, and Henrike Moll
2005 Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675-735.
Varela, Francisco, Evan Thompson and Eleanor Rosch
1991 The Embodied Mind. Cognitive Science and Human Experience. Cambridge, Mass.: MIT Press.
Von Uexküll, Jakob
1909 Umwelt und Innenwelt der Tiere. Berlin: Springer.
1928 Theoretische Biologie. Frankfurt/M.: Suhrkamp. Reprint 1973.
1956 Streifzüge durch die Umwelten von Tieren und Menschen. Bedeutungslehre. Hamburg: Rowohlt.
Vurpillot, Éliane
1967 La perception de l'espace. In: Paul Fraisse and Jean Piaget (eds.), Traité de psychologie expérimentale VI: La perception, 101-186. Paris: PUF.
Vygotsky, Lev
1978 Mind in Society. Cambridge, Mass.: Harvard University Press.
Wittgenstein, Ludwig
1971 Philosophische Untersuchungen. Frankfurt/M.: Suhrkamp.
White, Randall
2000 Prehistoric Art. New York: Harry N. Abrams, Inc.
Ziemke, Tom and Noel E. Sharkey
2001 A stroll through the worlds of robots and animals: Applying Jakob von Uexküll's theory of meaning to adaptive robots and artificial life. Semiotica 134 (1-4): 701-746.
Zlatev, Jordan
2002 Mimesis: The "missing link" between signals and symbols in phylogeny and ontogeny. In: Anneli Pajunen (ed.), Mimesis, Sign and the Evolution of Language, 93-122. Turku: University of Turku Press.
2003 Meaning = Life (+ Culture). An outline of a unified biocultural theory of meaning. Evolution of Communication 4 (2): 253-296.
2005 What's in a schema? Bodily mimesis and the grounding of language. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 313-342. Berlin: Mouton de Gruyter.
this vol. Embodiment, language, and mimesis.
Embodiment and social interaction: A cognitive science perspective
Jessica Lindblom and Tom Ziemke
We respond to gestures with an extreme alertness and, one might almost say, in accordance with an elaborate and secret code that is written nowhere, known to no one and understood by all.
Sapir 1928.
Abstract
Much recent work in cognitive science has focused on the embodied, situated and distributed nature of mind. This is a radical shift away from the computer metaphor for mind that characterized traditional cognitive science. However, although much attention is paid nowadays to the biological and bodily basis of cognitive processes as well as their sociocultural embedding, the role of embodiment in social interaction is still relatively little understood. In this chapter we trace the role of biological and sociocultural factors in explanations of cognitive processes from Darwin to modern cognitive science. We discuss different degrees of commitment to embodiment and point out further steps and conceptual clarifications that will be required in the further development of a science of embodied cognition and social interaction.
Keywords: embodied cognition, history of cognitive science, interactive technology, radical embodiment, social interaction.
1. Introduction
Embodiment has become a much discussed concept in cognitive science in recent years, and many take it, together with situatedness, to be the defining feature of a new approach to the study of mind commonly referred to as embodied cognitive science or embodied cognition (e.g., Clark 1997, 1999; Varela, Thompson and Rosch 1991). Embodied cognition offers a radical shift in explanations of the human mind - a Copernican revolution in cognitive science, you might say - which emphasizes the way cognition is shaped by the body and its sensorimotor interaction with the world. This is a reaction against the computer metaphor of traditional cognitive science, which views cognition as symbol manipulation, centralized and taking place inside the skull, while the body serves only as some kind of input and output device, i.e., a physical interface between internal program (cognitive processes) and external world.¹ That is, embodiment played only a marginal role, if any, in traditional cognitive science, which instead focused on mental representations and computational processes. Gardner (1987: 6), for example, stated the major assumption of the then "new science of mind" as follows:
First of all, there is the belief that, in talking about human cognitive activities, it is necessary to speak about mental representations and to posit a level of analysis wholly separate from the biological or neurological, on the one hand, and the sociological or cultural, on the other.
Consequently, the computer and its syntax-driven way of manipulating symbols became the general model of how the human mind functions. Accordingly, the traditional view of social interaction has been that agents relate to each other much in the same way they relate to other parts of the external world, i.e., by having more or less explicit internal representations of each other. That is, the "secret code" that Sapir referred to in the introductory quote above was quite literally taken to be an actual symbolic/representational code. For example, one agent might encode her mental states into some form suitable for transmission via some communication channel, such as language or gestures, and another would receive and decode the transmitted message, and thus come to an understanding of the first agent's mental states and actions. Traditional cognitive-scientific explanations of both individual and social cognition thus focused very much on internal, individual processes and representations, whereas the interaction with the environment - physical, social or cultural - was treated as fairly peripheral. Gardner (1987), for example, argued that such "murky concepts" as context, history and culture could only cause problems for efforts to find the "essence" of individual cognition, and therefore should not be addressed and integrated until later on.
1. In cognitive science the traditional view is referred to as "cognitivism", which, however, should not be confused with the cognitivist approach in linguistics.
Today's proponents of embodied, situated and/or distributed cognition, on the other hand, paint a much more complex picture of the mind, its relation to the body, and its interaction with the physical and sociocultural environment. Hutchins (1995), for example, argued that there are unnoticed costs involved when we disregard culture, context and history, which he considered important factors in the development of individual intelligence. The body's role in cognitive processes has also received much attention in recent discussions in cognitive science (cf. e.g., Clark 1997, 1999; Varela, Thompson and Rosch 1991; Ziemke 2002) and a large variety of notions of embodiment and embodied cognition have been developed (cf. e.g., Chrisley and Ziemke 2003; Clark 1999; Rohrer this volume; Svensson, Lindblom and Ziemke this volume; Wilson 2002; Ziemke 2001a, 2003; Ziemke and Frank this volume). From a psychological perspective the interaction between individual agent and environment has been addressed, while from the perspectives of philosophy and linguistics the role of body and world in the constitution of meaning has received much attention. Moreover, from the AI perspective it has been discussed what kind of body an artificial system might need to count as an embodied cognizer. However, despite the fact that embodied cognitive science pays much attention to both the sociocultural embedding of cognitive processes and their bodily basis, there still is surprisingly little work on the role of embodiment in social interaction, as recently recognized by several researchers, who have argued that current theories of embodiment need to be developed further to include the social dimension (e.g., Lindblom and Ziemke 2005; Riegler 2002; Semin and Smith 2002; Sinha and Jensen 2000; Ziemke 2002, 2003). Traditionally, the study of the body in social interaction in cognitive science has been limited to non-verbal communication, commonly viewed as a trivial form of "body language" or an "appendage" to the real intellectual language and mind.
However, it has been estimated that nearly two thirds of the meaning in a social situation is "received" from these nonverbal signs (Burgoon, Buller and Woodall 1996). Farnell (1999) pointed out that the widespread neglect of the body in the social sciences is a consequence of the Platonic-Cartesian heritage that has resulted in the view of the mind as an internal locus of rationality, thought, language and knowledge (cf., Damasio 1995, 1999; Johnson and Rohrer, this volume). The body, on the other hand, is still widely regarded as the sensate locus of irrationality and feeling, a view that has been supported by the Christian
disregard² of the flesh as the locus of sinful desire and irrationality. Theories of cultural and social cognition therefore mainly tend to overlook the bodily aspects of social interaction (cf. Tomasello 1999). Varela (1992), for example, argued that "[s]ocial scientists are body-dead because they are conceptually brain-dead to signifying acts within the semiotics of body movements". Trevarthen (1977; cf., Hendriks-Jansen 1996) pointed out that one reason for the neglect of the body in psychological research was that the bodily movement patterns of humans were difficult to observe with the technology available at the time, and cognitive science therefore became more of a static science of perception, cognition and action than a science of embodied dynamic interactions. On the other hand, when researchers actually paid attention to embodied movement, it often appeared that, as Farnell (1995) formulated it, the moving body has lost its mind. Another reason why researchers commonly overlook the role of the body is perhaps the widespread fear of slipping into biological reductionism, which is also why most social scientists tend to, or prefer to, view mind as superior to and independent of the body (Farnell 1999). Hence, it should be noted that investigating the role of embodiment in social interaction is not the same as relapsing into (socio-) biological reductionism. In fact, the underlying supposed dichotomy between nature and culture, i.e., biological and sociocultural factors, is highly misleading in the first place, despite the fact that it has a long tradition in scientific discussions. Ingold (2000) used the example of learning to walk as an illustration of the traditional conception of the relation between "nature" and "culture". It is commonly argued that walking is an innate human capability, but Ingold does not categorize human walking as either innate or acquired.
A child learns to walk according to the standard manner of its social and cultural environment, and there is no one "natural" or "pure" biological way of walking, as one might assume. That means, the human skill of walking can be viewed as "biological" in the sense of being a part of the functions of the human organism, but it is also a result of the child's involvement in a social and physical world during normal development. Ingold (2000) therefore pointed out that instead of speaking of embodiment, the term "enmindment" could be used as well, since body and mind are not separate, but in some sense two sides of one coin, i.e., two ways of
2. See also Barbour (1999) who points out that the dichotomous concept of man in Christianity is a result of the Greek dualism of body and soul and actually not supported by the biblical view.
describing the same process, namely the activity of the (human) organism in its social and physical environment. Currently, the increased interest in both bodily/biological and sociocultural processes in cognitive science has led to a partial rediscovery of psychological, philosophical and biological theories that predated cognitive science. Yet these theories received very little attention during the first decades of cognitive science, which at the time was dominated by the computer metaphor. This applies, for example, to the works of the Russian scholar Vygotsky, who already in the 1930s suggested that social interaction is not an add-on to, but in fact a necessary requirement for individual cognitive development, an idea revived in recent humanoid robotics research (cf., Lindblom and Ziemke 2003), and the French philosopher Merleau-Ponty, who already in the 1940s argued that what he called intercorporeality constituted the basis of intersubjectivity and social interaction, an idea that has been rediscovered or confirmed in recent neuroscientific findings involving so-called mirror neurons (cf., Gallagher this volume). These examples illustrate that modern cognitive science still might have much to gain from re-evaluating and possibly incorporating some of these "old" ideas. The next section therefore provides some historical perspectives, ending with the "cognitive revolution" and the criticisms thereof. Section 3 then further discusses the role of embodiment and social interaction in cognitive science today. Next, Section 4 discusses the positions of "simple" vs. "radical" embodiment in current theories of embodiment, and finally, Section 5 discusses implications for embodied cognitive science and its possible contributions to the development of interactive technology.
2. Historical perspectives
Cognitive science, which as a discipline only has been around for a few decades, can be said to have a short history but a long past, given that its philosophical roots can be traced back to ancient Greece (Gardner 1987). Even the view of the human being as a reasoning device, calculating according to rules, has its origin in the work of Plato, who argued that phenomena that could not be formalized explicitly, such as bodily skills and feelings, should not count as knowledge. Consequently, he distinguished between the rational mind and the body with its emotions and skills. This was the starting point of the Western philosophical tradition, assuming that reckoning is the "language of the mind" (Dreyfus 1979; cf., Johnson and
Rohrer this volume; Lakoff and Johnson 1999). This conceptual separation of mind and body was further developed by Aristotle, who divorced the practical from the theoretical and defined the human being as the "rational animal". Descartes then confirmed this view, rather than adding anything radically new, and formulated the dualist viewpoint of mind and body as two different substances (Cisek 1999; Dreyfus 1979; see also Costall this volume). That is, while other animals, according to Descartes, are mere mechanisms, or bodies without minds, humans alone possess a soul, which he also described as the "pilot of the corporeal boat" (Dreyfus 1979; Freeman and Núñez 1999). Vitalists, on the other hand, disagreed with this mechanistic view of life and argued that all living creatures had some sort of "vital energy" comparable to a spirit or soul. The issue never really got resolved, but new life was brought into the debate through the work of Darwin (cf., Ziemke 2001b). After Charles Darwin's 1859 book, The Origin of Species, many researchers began to search for unified theories that would explain the behavior of humans as well as other organisms. Interestingly, this resulted in two almost opposite lines of research, which nevertheless can be traced back, at least partly, to Darwin himself. Darwin tried to explain how different species had evolved by assuming that patterns of behavior were the product of evolution and that there was a mental linkage between animals and humans (Cosmides et al. 1997; McFarland 1993). In modern terms, Darwin certainly viewed the mind as embodied and did not believe in a mind separate from the body. For example, he wrote in his personal notebook (1836-1844) that "experience shows the problem of the mind cannot be solved by attacking the citadel itself - the mind is function of [the] body - we must bring some stable foundation to argue from" (Sheets-Johnstone 1999: 435).
However, instead of studying mind as "function of the body", many later researchers reduced the body and its interactions with the environment to only include the brain itself (cf. Sheets-Johnstone 1999). Darwin's 1872 book, The Expression of the Emotions in Man and Animals, can be considered the first modern book on behavior, although he focused primarily on facial expressions and did not pay much attention to gestures and body language. Darwin himself combined psychology and biology, but he could also be said to have caused the split between the two. Although his work was a serious attack on dualism in the sense that he tried to bridge the presumed gulf between body and soul, his fear of public opinion resulted in two parallel lines of explanation for the development
of the human mind, namely a phylogenetic line that stressed descent, and an adaptationist line that emphasized selection (Cosmides et al. 1997). The adaptationist perspective has generally been ignored within psychology, while the phylogenetic branch has been further explored, partly following Darwin, who mainly tried to explain the human mind from a phylogenetic perspective (Cosmides et al. 1997). Given that Darwin believed in a single universal line of development, in order to make this claim work, he had to run counter to his earlier ideas which stressed the role of environmental conditions and adaptations, resulting in several lines of evolutionary development. Consequently, he had to argue for some kind of supremacy of reason in reference to some general inherited endowment. The adaptationist perspective, however, would not fully support this view, since it focused on the differences between separate species. Darwin was worried that this perspective could be interpreted as supporting the dualist view, in the sense that different qualities in separate species actually could lead to a uniquely human soul (Cosmides et al. 1997).
Darwin's work had great general impact and brought about increased interest in the study of behavior (Allen and Bekoff 1997; Cosmides et al. 1997; McFarland 1993). Subsequently, two contrary lines of behavioral research emerged. Firstly, there were those who attributed human-like subjective mental qualities to other vertebrates and even invertebrates. Darwin (1872) himself, for example, described dogs as feeling pleasure when doing what they considered their duty, and ants as feeling despair at having lost their home to a spade (cf. Sparks 1982). He was, nevertheless, well aware that many of these behaviors were purely instinctive.
His protégé Romanes (1882) was less careful and provided anthropomorphic explanations for a large range of animal behaviors, attributing to animals human-like degrees of intelligence, consciousness and emotions, based solely on his own observations (cf. Ziemke 2001b). Secondly, there were the mechanists who also believed in common behavioral mechanisms in humans and other organisms, but considered the attribution of subjective mental qualities as unscientific. Instead they viewed animals - unlike Descartes, often including humans - as Cartesian puppets, i.e., mere objects guided by their environment. Darwin himself in his 1880 book, The Power of Movement in Plants, compared the downward digging behavior of earthworms and moles to the downward growth of roots in response to gravity. Von Sachs (1882) further elaborated the view of plants as guided by "forced movements", which he termed tropisms, and demonstrated the orientation of various plant organs towards or away from
stimuli such as gravity, light or moisture. Loeb (1890, 1918) extended von Sachs' theory of tropisms to animal behavior to stress what he saw as the fundamental identity of the curvature movements of plants and the locomotion of animals. Loeb's own experiments revealed many different kinds of tropisms in caterpillars, moths, shrimps, etc., directed towards or away from light, gravity, chemicals, etc. This led him to apply his theory to human intelligence and to claim even humans to be mere mechanisms "compelled to react to physical conditions like corn waving in the wind" (Sparks 1982). Thus, his theory became the opposite of Romanes' view of feeling, conscious animals.
At about the same time, Sherrington identified the reflex as the elementary unit of behavior and described the reflex arc as consisting of three elements: an effector organ, a conducting nervous pathway leading to that organ, and a receptor to initiate the reaction. As a consequence, experimental psychology made the supposedly distinct elements of stimulus and response its units of analysis and treated them as largely separate (cf. Sherrington 1906; Ziemke 2001b). Sherrington's own initial description of the reflex arc, nonetheless, implied an integrated neural circuitry of "coordinated action". Dewey (1896) was also very critical of the more restricted interpretation of the reflex arc and argued that this psychological model was a legacy of dualism. He rejected the linear model of stimulus and response, proposing instead the concept of circular sensorimotor coordination (Dewey 1896; cf. Clancey 1997). Hence, he put forward an alternative explanation of how the mind works, without an intervening "consciousness" controller as a kind of soul, while at the same time stressing the embodied nature of mind, with sensorimotor coordination as the key underlying mechanism (cf. Johnson and Rohrer this volume).
Cisek (1999) has pointed out two important issues in the development of behavioral research at the beginning of the 20th century. Firstly, there was a split in the study of living organisms, resulting in one line that concentrated on studying behavior, whereas the other focused on physiology. Secondly, the behavioral sciences as a whole suffered from a certain envy of formal approaches. Many researchers therefore tried to explain behaviors and mental phenomena in a mathematical fashion in order to validate the study of behavior "scientifically", worried that otherwise their research might not be considered serious science.
Behaviorism, which became the dominant approach in psychology at the beginning of the 20th century, in particular in the US, argued that only observations of overt behavior should be the object of study (Allen and
Bekoff 1997). The approach had its roots in mechanistic theories and the animal learning work of Pavlov, who viewed the biological body as a machine (McFarland 1993). Behaviorists aimed to explain behavior as the result of learning, in terms of classical and operant conditioning. Anti-mentalism and equipotentiality, the assumption that there was no inherent bias in organisms towards some stimulus-response associations over others, became the two pillars of behaviorism (Cosmides et al. 1997). This was the first rigorous attack against the dualistic assumption in the sense that it totally ignored mental content. Accordingly, little, if any, interest was directed towards embodiment other than in a mechanistic sense.
All along, while psychology was dominated by mechanistic and behaviorist ideas, there were researchers working on alternative, constructivist and phenomenological conceptions of behavior and mind. Some of these can be viewed as the forerunners or roots of modern embodied and situated theories of cognition. Researchers like von Uexküll, Heidegger and Merleau-Ponty, for example, like behaviorists, emphasized, in different ways, the situated and embodied nature of mind and the crucial role of agent-environment interaction. However, completely unlike behaviorists, they focused on the subjective, phenomenal dimensions of "being-in-the-world", as Heidegger (1962) called it. At about the same time, Vygotsky and Piaget developed their highly influential theories of cognitive development which, in different ways, emphasized the embodied, (socially) situated and constructive nature of cognition. Due to their relevance to modern theories of embodiment and/or its role in social interaction, the contributions of these researchers will be discussed in more detail in the following.
Jakob von Uexküll (1864-1944), a German biologist strongly inspired by the Kantian insight that all knowledge is determined by the knower's subjective ways of perceiving and conceiving, considered it the task of biology to expand Kant's research by investigating the role of the body and the relationship between subjects and their objects (von Uexküll 1928; cf. Ziemke 2001b). According to von Uexküll, the main problem with mechanist and behaviorist ideas was that they overlooked the organism's subjective nature. In his own words:
The mechanists have pieced together the sensory and motor organs of animals, like so many parts of a machine, ignoring their real functions of perceiving and acting, and have gone on to mechanize man himself. According to the behaviorists, man's own sensations and will are mere appearance, to be considered, if at all, only as disturbing static. But we who still hold that
our sense organs serve our perceptions, and our motor organs our actions, see in animals as well not only the mechanical structure, but also the operator, who is built into their organs as we are into our bodies. We no longer regard animals as mere machines, but as subjects whose essential activity consists of perceiving and acting. (von Uexküll 1957: 6)
It should be noted that von Uexküll's view is not vitalistic (cf. Emmeche 2001; Ziemke and Sharkey 2001). He described animal behavior as guided by "successive reflexes" each of which is "elicited by objectively demonstrable physical or chemical stimuli"; at the same time he also emphasized that the organism's components are forged together to form a coherent whole, i.e., a subject that acts as a behavioral entity which forms a "systematic whole" with its Umwelt, namely, its subjective world of perception and action.
The French philosopher Maurice Merleau-Ponty (1908-1961), strongly inspired by Gestalt psychology and also partly by Heidegger and von Uexküll, argued that the mind was essentially embodied and interacting with the surrounding world (e.g., Merleau-Ponty 1963, 1962; cf. Dreyfus 1979). More specifically, he argued that it is actually the body which provides meaning or intentionality for the mind. His core concept was the idea that "I am my body" as a kind of "embodied cogito", which means that it is not only the brain that does the thinking, but the whole body (Priest 1998; cf. Roth this volume). He argued that it is the body that has the necessary "knowledge" to perform tasks present at hand, since it "knows how to act" and "how to perceive" through the history of its phylogenetic and ontogenetic interactions with the environment. Merleau-Ponty (1969) tried to find the fundamental primitives of the mutual interdependence between the organism and its world, and he concluded that both are different aspects of the same primary whole of "brute being", which he referred to as the flesh (cf. Mingers 2001). That means, the lived body, according to Merleau-Ponty, has a twofold character, being able to see or touch on the one hand, and being seen or touched on the other. In his own words, the flesh is "not matter, is not mind, is not substance. To designate it, we should need the old term "element" [...] in the sense of a general thing, midway between the spatio-temporal individual and the idea" (Merleau-Ponty 1969, quoted in Mingers 2001: 116). In addition to questions concerning low-level functions and bodily intentions, Merleau-Ponty tried to situate his theory in a cultural context, arguing that bodily behavior constitutes the most basic form of communication and cooperation among organisms (cf. Gallagher this volume). Even
human language is an extension of these bodily acts, and when humans have acquired a language, it will control human cognition and communication (Loren and Dietrich 1997; Neuman 2001). Merleau-Ponty (1969) treated the relation between language and thought as an intertwined process, arguing that the use of language actually is the process of thinking; either we speak to ourselves or to other persons (Mingers 2001). To summarize, the main characteristics of Merleau-Ponty's work are his nondualistic and anti-behavioristic view of embodied experience, his argument that bodies in fact are intentional in themselves, and the view of cognition as a biological phenomenon, rather than a mental one.
The Swiss biologist/psychologist Jean Piaget (1896-1980) similarly stressed the importance of sensorimotor activity for the emergence of intelligent behavior (e.g., Piaget 1952). He first and foremost viewed "cognition as an instrument of adaptation, as a tool for fitting ourselves into the world of our experiences" (von Glasersfeld 1995: 14). That means, cognition in Piaget's sense is about the organization of an agent's sensorimotor experiences and interactions with the environment - a view very similar to that of von Uexküll (cf. Ziemke 2001b). Hence, the relationship between action and knowledge is central in Piaget's genetic epistemology, which posited knowledge as constructed by the child in interaction with its environment. The basic organizational force of intellectual development, according to Piaget, is logic, and this is highlighted in his theory which characterizes different forms of logical thinking. Consequently he claimed that his theory of cognitive development was universal (cf. Sinha and Jensen 2000; Wadsworth 1996). It has been argued that Piaget neglected the social dimension, but Sinha and Jensen (2000) have pointed out that this is a common misunderstanding (cf. Cole and Wertsch 1996).
For example, Piaget stated that "[h]uman knowledge is essentially collective and social life constitutes an essential factor in the creation and growth of knowledge, both pre-scientific and scientific" (Piaget 1995: 30). He actually stressed the social dimension as an essential factor for the development of cognition, supposing that sociocultural factors could either accelerate or retard the developmental process, but perhaps not change its direction towards the terminal point of (logical) cognitive development (Sinha and Jensen 2000). Piaget's most significant contribution probably is the claim that cognitive changes are the outcome of a constructive developmental process that occurs in sensorimotor interaction with the environment. However, it should also be mentioned that Piaget's theory has been criticized heavily
(cf. e.g., Miller 1983; Wadsworth 1996). Boden (1994), for example, noted that Piaget actually underestimated children's innate capabilities, both initially and later on during development. Furthermore, Piaget's theory was based on intellectual ideals of the Western tradition, and did not pay much attention to cultural differences in cognitive development, as illustrated by his claim that his theory was universal (cf. Cole and Wertsch 1996). The role of culture was strongly emphasized, on the other hand, in the development theory of Russian scholar Lev Vygotsky (1896-1934). He rejected mechanistic and behaviorist theories because they disregarded what he considered essential differences between human and animal intelligence (cf. Lindblom and Ziemke 2003). He wanted a psychological theory that would describe the development of what he considered exclusively human abilities, and, as a result, proposed that individual cognitive development requires a sociocultural embedding through certain transformation processes. These transformations, from elementary to higher mental functions, take place via signification, i.e., the shift of control from the environment in elementary mental functions, to the individual's voluntary regulation of her behavior in higher mental functions. This voluntary regulation of individual behavior, according to Vygotsky (1978), is accomplished through the use of mediating artificial or self-generated stimuli, functioning as a link between external stimulus and response. He argued that the incorporation of such psychological tools was the key difference between animal and human behavior, and summarized this position in the statement that "the central fact about our psychology is the fact of mediation" (Vygotsky 1933, quoted from Wertsch 1985:15). 
Psychological tools, according to Vygotsky, also function as regulators of human social behavior, and, among these tools, language is an especially important "organizer", both in the form of speech and written text. The other transformation process of cultural cognitive development is internalization, as expressed in Vygotsky's (1978) general law of cultural development, which states that every function in the child's cognitive development appears twice, first between humans (at a social interaction level) and then in the child's mind (becomes internalized at the individual level). Hence, the cognitive abilities of an "enculturated" adult are the product of developmental processes, in which "primitive" and "immature" humans are transformed into cultural ones through social interactions, leaving room for different forms of intelligence.
A common criticism (and misunderstanding) of Vygotsky's work is that he focused almost exclusively on the sociocultural aspects, but neglected
the biological line of development (e.g., Davydov and Radzikhovskii 1985). However, as Wertsch (1985) pointed out, Vygotsky considered biological factors as necessary, but not sufficient conditions, stating for example that "culture creates nothing; it only alters natural data in conformity with human goals" (Vygotsky 1960, cited in Wertsch 1985:47). Elsewhere he stated: Both planes of development - the natural and the cultural - coincide and mingle with each other. The two lines of change interpenetrate each other and essentially form a single line of sociobiological formation of the child's personality.3 (Vygotsky 1960, quoted from Wertsch 1985:41)
3. It should be noted that the translation "sociobiological" should not be confused with the later use of the term "sociobiology" by Wilson (1975).
In summary, despite the fact that Vygotsky's and Piaget's theories are commonly contrasted as focused on the "child in society" and the "child as a solitary thinker" (cf. Cole and Wertsch 1996), respectively, they are in fact largely compatible and agree in viewing knowledge as constructed through the interaction of biological and sociocultural factors in the course of cognitive development.
Despite serious criticisms and the development of the above competing theories, behaviorism remained the dominant approach in psychology, particularly in the US, throughout the first half of the 20th century. Its impact started to decrease in the mid-1950s, mainly for two reasons: firstly, ethologists challenged the equipotentiality assumption (Breland and Breland 1961; Garcia and Koelling 1966); secondly, the advent of the computer seemed to challenge anti-mentalistic assumptions and paved the way for the so-called "cognitive revolution" (Gardner 1987). Cognitive science in general, from its beginnings in the mid-1950s, and its focus on internal representations in particular, were largely conceived as reactions against the anti-mentalistic behaviorist stance. Little attention, if any, was therefore directed to studies in biology and animal behavior. This was partly because cognitive science aimed for supposedly higher mental processes which animals were assumed to lack, and partly because studies of animal behavior were associated with the "horror" of behaviorism (Gardner 1987). As a result, other theories that stressed agent-environment interaction, such as the work of von Uexküll, Heidegger, Vygotsky, Merleau-Ponty and Piaget, were largely ignored during the era of computationalist cognitive science. In a nutshell, while behaviorists had treated mind as an opaque box in a transparent world, computationalist cognitive science treated it as a transparent box in an opaque world (Lloyd 1989; cf. Sharkey and Ziemke 1998).
According to the computer metaphor in cognitive science, cognition takes place inside the skull in the form of abstract symbol manipulation, while the body only serves as an input and output device, i.e., a physical interface between internal program (cognitive processes) and external world (cf. e.g., Pfeifer and Scheier 1999). According to this view, the essence of cognition is that minds or brains, just like computers, "accept information, manipulate symbols, store items in "memory" and retrieve them again, classify inputs, recognize patterns and so on" (Neisser 1976: 5). Representations, viewed as based on a correspondence or mapping between elements of the external world and their internal correspondents, played a crucial role in this conception of mind, since in some sense they constituted the only link between agent and environment. Agents, and their environments, were of course also still considered to be physical, but the relation between body and mind was considered as similar to the relation between hardware and software in a computer. That means, the body was viewed as a mere physical implementation of the mind, which obviously needs some physical instantiation (cf. Chrisley and Ziemke 2003), but apart from that is largely implementation-independent. This functionalist view (e.g., Putnam 1975), often summarized by the slogan that "[m]inds are simply what brains do" (Minsky 1975), might sound similar to Darwin's statement that "mind is function of the body". Nonetheless, it left out the body and was interpreted entirely differently. Pfeifer and Scheier summarized the functionalist view as follows:
Functionalism [...] means that thinking and other intelligent functions need not be carried out by means of the same machinery in order to reflect the same kinds of processes; in fact, the machinery could be made of Emmental cheese, so long as it can perform the functions required.
In other words, intelligence or cognition can be studied at the level of algorithms or computational processes without having to consider the underlying structure of the device on which the algorithm is performed. From the functionalist position it follows that there is a distinction between hardware and software: What we are interested in is the software or the program. (Pfeifer and Scheier 1999:43)
The reason for the success of the computer metaphor for mind was that it offered convincing answers to major questions that confronted cognitive science during its initial decades (Cisek 1999). Computationalism offered non-dualistic explanations of internal states, such as "memory". Furthermore it offered the exciting metaphor of mental states and processes acting as the software running on the brain's hardware. This seemed to be an elegant solution to the mind-body problem, bridging the gulf between body/biology (hardware) and mind/psychology (software). Finally, computationalist psychology also provided a longed-for mathematical formalism for the mechanisms underlying behavior. It is therefore hardly surprising that the computer metaphor became the dominant model of how the mind works (Cisek 1999).
However, in the late 1970s several criticisms of computationalism, in particular the resulting research in artificial intelligence (AI), emerged (Dreyfus 1979; Searle 1980; cf. Ziemke 2001b). The overall concern in these criticisms was in fact the lack of embodiment and situatedness. Dreyfus (1979), strongly inspired by the Heideggerian notion of being-in-the-world, questioned traditional AI's use of explicit, symbolic representations, arguing that the resulting computer programs lacked situatedness since they represented descriptions of isolated bits of human knowledge "from the outside". Dreyfus' (1979) explanation of human situatedness and a computer program's lack thereof is still worth quoting at length:
Humans [...] are, as Heidegger puts it, always already in a situation, which they constantly revise. If we look at it genetically, this is no mystery. We can see that human beings are gradually trained into their cultural situation on the basis of their embodied pre-cultural situation [...] But for this very reason a program [...] is not always-already-in-a-situation. Even if it represents all human knowledge in its stereotypes, including all possible types of human situations, it represents them from the outside [...]. It isn't situated in any one of them, and it may be impossible to program it to behave as if it were. [...] It seems that our sense of our situation is determined by our changing moods, by our current concerns and projects, by our long-range self-interpretation and probably also by our sensory-motor skills for coping with objects and people - skills we develop by practice without ever having to represent to ourselves our body as an object, our culture as a set of beliefs, or our propensities as situation-action rules. All these uniquely human capacities provide a "richness" or a "thickness" to our way of being-in-the-world and thus seem to play an essential role in situatedness, which in turn underlies all intelligent behavior. (Dreyfus 1979: 52-53)
From this very brief overview of the historical roots, predecessors and critiques of traditional cognitive science, it should be clear that the idea of an embodied mind is by no means new. Although theorists have sought to
understand the mind from different perspectives, the rational and formalized view was the dominating approach in cognitive science for a long time, and consequently the role of the body and the environment - physical as well as social - was largely disregarded. As Dreyfus (1979) pointed out, the computer, as well as the computer metaphor for mind, is the product of over 2,500 years of traditional thinking in Plato's footsteps. In that sense, as Cisek (1999) noted, the cognitive revolution was just "old wine in new bottles". Since the late 1980s, however, cognitive science has re-discovered or re-invented many, although not all, of the "pre-revolutionary" ideas and revived them in theories that acknowledge the embodied, situated, distributed and sociocultural nature of the human mind, a topic which will be discussed in more detail in the following section.
3. Embodiment and social interaction: Current perspectives
Today, there is a growing interest in embodiment in many areas of cognitive science. Indeed, many researchers consider embodiment a necessary requirement for intelligence and mind. However, among proponents of embodied cognition there are nowadays a large number of notions and conceptions of embodiment and embodied cognition (e.g., Chrisley and Ziemke 2003; Ziemke 2003; cf. Ziemke and Frank this volume), a fact also amply illustrated by the diversity of perspectives collected in this volume. This diversity is also somewhat characteristic of the current state of cognitive science, which during the last twenty years has developed from a fairly unanimous agreement on the centrality of computation and (symbolic) representation to a pluralistic inter-discipline that constantly challenges and re-evaluates its own foundations and conceptions. As already mentioned in the introduction, in recent years both the neuroscientific and the anthropological perspective have steadily gained ground in cognitive science. This has led to new sub-disciplines such as social cognitive neuroscience (e.g., Blakemore, Winston and Frith 2004) which, also for technical reasons, would have been inconceivable only ten years ago. Thus, it is becoming increasingly accepted that cognition does not, as Gardner (1987) described it, constitute or require "a level of analysis wholly separate from the biological or neurological, on the one hand, and the sociological or cultural, on the other" (cf. above). In fact, the opposite is true: many cognitive scientists today are actively researching the dependence of cognitive
processes on their neurobiological realization, their sensorimotor embodiment as well as their sociocultural situatedness.
Somewhat surprisingly, however, despite the rich discussion of different aspects and forms of embodied cognition, the social dimension of embodiment has not yet received much attention. However, recent work in cognitive science and related disciplines indicates that the body has several important roles in social interactions and cognition. Barsalou et al. (2003), for example, have identified a number of different empirical findings, mostly from social psychology experiments, which illustrate the role of embodiment and social cognition. More specifically, they identified four types of social embodiment effects which are discussed in more detail in the following.
Firstly, perceived social stimuli do not only produce cognitive states, but also bodily states. For example, it has been reported that high school students receiving good grades in an exam adopted a more erect posture than students that received poor grades (cf. Barsalou et al. 2003). As Barsalou et al. pointed out, it is unclear whether the bodily reaction is triggered by the social event/stimulus directly, or by mediating mechanisms. In this example it could be the case that an emotional state was triggered first which in turn resulted in a bodily state. Another example is that subjects primed with concepts related to elderly people (e.g., "gray", "bingo", "wrinkles") exhibited embodiment effects such as slower movement when leaving the experimental lab, as compared to a control group primed with neutral words. Several other studies also show similar effects (cf. Barsalou et al. 2003).
Secondly, the observation of bodily states in others often results in bodily mimicry in the observer.
Studies show, for instance, that experimental subjects often mimic an experimenter's actual behavior, e.g., rubbing the nose or shaking a foot, and that subjects also tend to mimic observed facial expressions, a fact documented widely in the literature (cf. Barsalou et al. 2003).
Thirdly, bodily states produce affective states, which means that embodiment not only facilitates a response to social stimuli but also produces tentative stimuli. For example, evidence of facial elicitation or facial feedback is well documented in the literature. According to Barsalou et al., many studies demonstrate that the shaping of the face to an emotional expression tends to produce the corresponding affective state. For example, it has been demonstrated that subjects rated cartoons differently when holding a pen between their lips than when holding it between their teeth. The
latter triggered the same musculature as smiling, which made the subjects rate the cartoons as funnier, whereas holding the pen between the lips activates the same muscles as frowning and consequently has the opposite effect. That means, the expressions associated with the musculature affected the evaluation of the cartoon although the subjects were not even aware of the fact that their musculature had been manipulated into a facial expression. Similarly, evidence of bodily elicitation has been provided in studies showing that bodily positions or postures actually influenced the subjects' affective state, for example, that subjects in an upright position experienced more pride than subjects in a slumped position (cf. Barsalou et al. 2003).
Finally, compatibility between bodily and cognitive states enhances performance. For instance, several motor performance compatibility effects have been reported, and it has been demonstrated that subjects actually responded faster to "positive" words (e.g., "love") than "negative" words (e.g., "hate") when asked to pull a lever towards them. Other studies have shown that embodiment-cognition compatibility reduces the required processing resources in secondary task performance.
Taken together, these examples and other studies demonstrate that there is a strong relation between embodied and cognitive states. Barsalou et al. concluded that embodiment is fundamental in human cognitive processing, and that a common mechanism seems to produce such effects across various domains. Barsalou et al. (2003) offered a theoretical framework of social embodiment to explain these phenomena based on internal simulations (cf. Svensson, Lindblom and Ziemke this volume), which can be considered as a mix composed of the traditional view of symbolic information processing, mental imagery and embodiment.
Even though their framework is an important step towards highlighting the importance of sensorimotor aspects in cognitive processing, it still constitutes an example of the position that Clark (1999) referred to as "simple embodiment". According to this view, traditional cognitive science can roughly remain the same; or, in Clark's words, "facts about embodiment and environmental embedding" can be treated as mere "constraints upon a theory of inner organization and processing". In Barsalou et al.'s case, for example, it could be said that their theory adds an embodied icing to the traditional information-processing cake in the sense that it acknowledges embodiment by taking as its starting point perceptual/modal representations. Yet it continues to explain cognition largely in terms of internal representations and the computational processes manipulating them.
The position of radical embodiment, on the other hand, is, as Clark (1999: 348) formulated it, "radically altering the subject matter and theoretical framework of cognitive science" by challenging traditional conceptions of representation and computation and questioning the conceptual separations between perception and action as well as between brain, body and world. Work on so-called mirror neurons as the neurobiological mechanism of "intercorporeality" as the basis of social interaction (cf. Gallagher this volume) as well as simulation theories of mind-reading (cf. Gallese and Goldman 1998) are good examples of more radically embodied views of social cognition. Dautenhahn (1997), for example, hypothesized that a phenomenological dimension of social understanding might be founded in embodied mechanisms that allow humans to read social signs and other agents' minds. She suggested that an agent's own body can be used as the point of reference for "simulating" another agent's emotional stance. Such a mechanism might be found in a particular type of visuo-motor neurons in premotor cortex, so-called mirror neurons, which are active in both the execution of goal-related motor actions (e.g. grasping movements) and their observation in conspecifics (e.g. Gallese and Goldman 1998). Although the existence of mirror neurons has so far been experimentally confirmed only in monkeys (see footnote 4), there is strong empirical evidence that such a system is present in human beings as well (e.g., Arbib 2006; Gallese et al. 2002). This suggests that the function of this matching system might be a part of, or a precursor to, a general mind-reading capability that allows you to adopt the point of view of other conspecifics by matching or simulating their mental states with a resonant state of your own, i.e., putting yourself in another's "shoes" in order to understand or predict mental states and behavior. Empirical findings support the general idea (cf. 
Svensson, Lindblom and Ziemke this volume). For example, it has been found that observers show motor facilitation in the same muscles as those used by the observed individual. That is, even while only observing the actions of another individual, a neural "triggering" event in fact takes place in the observer. Gallese et al. (2002: 459) therefore suggested "that the capacity to empathize with others [...] may rely on a series of matching mechanisms that we just have started to uncover". Although Gallese and Goldman (1998) have stressed that imitation behavior has not been observed in connection to mirror neuron activity, Rizzolatti et al. (2002) have hypothesized that various low- and high-level resonance mechanisms may be used under the banner of "imitation", ranging from response facilitation to emulation and "true" imitation. Currently the role of mirror neurons in embodied simulation processes as the basis of social interaction in general (e.g., Gallese, Keysers and Rizzolatti 2004), and language in particular (cf. Arbib 2005; Rizzolatti and Arbib 1998), is receiving a great deal of attention. Nevertheless, these mechanisms are still far from being well understood, and there is a need to further integrate them into embodied/situated cognitive theories to develop a more thorough understanding of the mechanisms that facilitate the human sensitivity to social cues from the perspective of radical (social) embodiment.
Footnote 4. Single neuron recordings of the type used in these experiments eventually destroy the neurons recorded from, and thus for ethical reasons cannot be used in humans.
4. Discussion: Simple vs. radical (social) embodiment
As discussed in the introduction, as well as in several other contributions to this volume, the role of the body in cognitive processes has received an increasing amount of attention in recent years from different perspectives and disciplines. While this certainly is a step, or in fact several steps, in the right direction, i.e., towards an interdisciplinary understanding of the embodied and socioculturally situated mind, it might be worth keeping in mind that there is also a substantial risk of premature superficial agreement. As several authors have pointed out, there currently is a wealth of diverse notions, definitions and conceptions of embodiment (cf. e.g., Chrisley and Ziemke 2003; Clark 1999; Rohrer this volume; Svensson, Lindblom and Ziemke this volume; Wilson 2002; Ziemke 2001a, 2003; Ziemke and Frank this volume). On the one hand, this can be interpreted positively since it shows that the embodiment of mind is studied from a number of perspectives which eventually might converge to form a fuller understanding than any individual discipline or approach could have produced on its own. On the other hand, after two decades of research on embodied cognition, the apparent lack of coherence could be interpreted as confirmation of the recurrent criticism that in fact all that embodied cognitive theories have in common is their rejection of traditional cognitive science (cf. Chrisley and Ziemke 2003).
That not everybody means the same thing by "embodiment" should be clear by now. As discussed in the previous section, one distinction particularly relevant to cognitive science is that made by Clark (1999) between the positions of "simple embodiment" and "radical embodiment". A typical example of the former is the position of robotic functionalism (Harnad 1989), according to which cognition is computational after all, but mental representations need to be "grounded" in sensorimotor interaction with the environment (Harnad 1990). The body's role, according to this view, is still very much that of the (software) mind's physical (hardware) interface to the world it represents. That is, although the view of grounded mental representations rejects behaviorism, the mechanistic view of the body as such is largely maintained (cf. Ziemke 2001b). While this view of embodiment and representation grounding is often not formulated explicitly, it nevertheless implicitly underlies many current conceptions of embodied cognition, in particular in artificial intelligence (cf. Ziemke 2004). This fact is also reflected in the way that some of the theories discussed in Section 2 have been incorporated into today's research. Von Uexküll's notion of Umwelt, for example, i.e., the idea that each creature necessarily has its own subjective, perception- and action-dependent view of the world, has received much attention in embodied artificial intelligence and robotics (e.g., Brooks 1986; Clark 1997; Emmeche 2001; Prem 1997; Sharkey and Ziemke 1998; Ziemke and Sharkey 2001). However, the organismic roots (cf. Damasio 1995, 1999) and the phenomenal nature of subjective experience, which were strongly emphasized by von Uexküll and Merleau-Ponty, are still largely ignored. Similarly, the developmental psychology of Vygotsky and Piaget has received much attention in humanoid robotics. 
Nonetheless, in these cases it is also the functional perspective on agent-environment interaction that has been adopted, while the biological dimension has been largely ignored or abstracted in the form of computational learning mechanisms (cf. Lindblom and Ziemke 2003). It should be noted that, while the examples from artificial intelligence are perhaps the most obvious cases (cf. Ziemke 2004), instantiations of the simple embodiment position are of course equally common in other areas of cognitive science. A prime example of "radical embodiment", on the other hand, is the work of Maturana and Varela (1980, 1987) on autopoiesis and the biology of cognition (cf. Johnson and Rohrer this volume). As we have discussed in detail elsewhere (Ziemke and Sharkey 2001; Ziemke 2001b), the notion of organisms as autopoietic, i.e., self-creating and maintaining, is very similar
in spirit to von Uexküll's distinction between organisms as autonomous subjects and mechanisms as heteronomous physical objects. Particularly noteworthy in the context of radical embodiment is also Varela, Thompson and Rosch's (1991) work towards an enactive cognitive science (cf. also Thompson and Varela 2001), which has drawn much inspiration from the works of Merleau-Ponty and Piaget, among others, and in its turn inspired much current work seeking to further integrate phenomenology with embodied cognitive science and neuroscience (e.g., Dreyfus 2002; Gallagher this volume; Thompson and Varela 2001). Thus, many aspects of pre-cognitive-science theoretical biology, psychology and philosophy have found their way into current theories of embodied cognition, although their interpretation can vary significantly depending on the interpreter's own commitment to simple or radical embodiment. It should also be noted that the distinction between simple and radical embodiment should neither be overrated nor underestimated. On the one hand, it really describes a continuum of positions rather than a binary distinction. Here, for example, we chose to categorize the position of Barsalou and colleagues as simple social embodiment because it maintains elements of computationalism and representationalism in the traditional sense, whereas we consider much work on mirror neurons to fall into the radical embodiment category because it pays more attention to the phenomenal nature of subjective and intersubjective experience (cf. Gallagher this volume). But these distinctions are not as clear-cut as they might seem; for example, there are also purely computational interpretations of mirror neurons. 
On the other hand, it is useful, or in fact essential for developing an embodied cognitive science, to keep in mind that there are distinctions to be made and that there might be much theoretical tension hidden under a possibly premature, superficial agreement on "embodiment" and the rejection of traditional, computationalist ideas. It should be noted that traditional cognitive science "missed out" on much by simply rejecting behaviorism while ignoring the competing theories discussed here that indicated the importance of embodiment long before cognitive science picked up on it. Likewise, current embodied cognitive science would be ill advised to simply frame itself in opposition to the computationalism of traditional, supposedly disembodied cognitive science.
5. Implications for cognitive science and interactive technology
We would like to close this chapter by pointing out some issues that need to be addressed in order for current theories of embodied cognition, radical embodiment in particular, to develop into a "mature science of the mind" (Clark 1999: 349). Firstly, although embodiment has been much discussed, the role of the body in social interaction and cognition has been oversimplified. Much of the discussion has focused on what you might call the "static" body. For example, in AI there has been discussion of what kind of physical realization, or implementation, of the "body" is necessary for cognitive processes, ranging from very simple agents with a minimum of sensors and motors to very complex organism-like, e.g., humanoid, robot bodies (cf. Ziemke 2001a, 2003). However, the crucial aspect of the body in motion has received relatively little attention, although research in child development and anthropology has shown the relevance of locomotion experience for human cognition (cf. Campos et al. 2000; Farnell 1995, 1999; Sheets-Johnstone 1999). We have elsewhere discussed in more detail (Lindblom and Ziemke 2005, 2006) the example of the so-called "nine month revolution" (Tomasello 1999), which is the time when Euro-American infants begin to develop triadic joint attention abilities and a basic understanding of a "self". How and why this transition occurs is still little understood, but we have argued that it is no coincidence that this "revolution" occurs at around the same point in time as the onset of self-produced locomotion behavior, i.e., when infants start to creep or crawl (Lindblom and Ziemke 2005, 2006). In other words, when children begin to locomote by themselves, they acquire an individual experience of the surrounding world and how it is affected through their own actions and perceptions. It should be stressed that locomotion itself might not be the crucial factor. 
Instead, the child's cognitive and emotional development emerges from the dynamics of the bodily experience that result from its own locomotion behavior. As a result, the child begins to distinguish between itself and its surrounding world, i.e., the child quite literally experiences different perspectives on the world, and how these perspectives depend on its own actions, and thus a primitive "self" emerges. This emerging understanding is bootstrapped through socially scaffolded bodily experience, which gives the child access to the actual meaning of the social-communicative situation. Consequently, that understanding of perspective-taking might be used during embodied simulations of the type discussed in previous sections, allowing the child to
simulate "off-line" what it would be like to be in the other person's situation, i.e., seeing the world from the other's physical and social perspective. This example illustrates once more how firmly "culture" and "nature", as well as "body" and "mind", are seamlessly intertwined and grounded in the dynamics of socially situated embodied experience. Secondly, side-stepping the dualist traps of "mind" vs. "body" as well as "nature" vs. "culture" (cf. Farnell 1995; Rogoff 2003; Varela 1995) has certain methodological consequences. A promising starting-point is to take a dialectical stance, following Vygotsky, and stress that the human mind is shaped through a course of "natural cultural development" (Rogoff 2003). Moreover, alternative analytic tools for studying the body and its movements are needed, which go beyond distal observers' "objective" descriptions of behavior. The latter are flawed from the perspective of embodied social interaction, since they do not consider the non-observable social situation at hand, which in many cases actually provides the meaning to the visible embodied actions. For example, it is necessary to investigate so-called "body language" or "non-verbal behavior" as one dynamically embodied system that unites mind and body in action (Farnell 1995). By using the term "action" instead of "behavior", it is emphasized that socially embodied actions are a set of movements that have agency, meaning or intentions for the actual person or agent, since "bodies do not move and minds do not think - people just do" (Farnell 1995). That is, the "enactment" of the body is a social act, and in order to direct yourself, you have to consider how others will act and react in response to your own actions. In that sense, our biological embodiment constrains, while cultural customs affect, but do not determine, the organization of socially embodied interactions (Farnell 1999). 
Consequently, there are some obvious methodological problems when observing, analyzing and illustrating socially embodied actions. For example, if you suppose that there is no split between verbal and non-verbal communication, that implies that there are two ways of expressing the same thought, verbal and gestural (non-verbal), that might be appropriate in different contexts (cf. Goldin-Meadow 2003). Hence, the analysis will have to consider both the spoken words and gestural movements. The dynamic nature of socially embodied actions causes problems when it comes to representation and illustration. Common media such as verbal descriptions, pictures or photographs are quite "static" and therefore more dynamic representational forms might need to be developed.
Thirdly, and closely related to the previous point, there is a need for an integrated framework of radical social embodiment that addresses the role and relevance of the body and its sensorimotor processes in social interaction. In doing so, one should not intend to bridge the 'gap' between, e.g., verbal vs. nonverbal interaction. Many crucial questions have been addressed only partly and/or only from one of the above perspectives. For example, what role(s) do bodily movement and gesture play in cognitive processes? How do the body and its sensorimotor processes affect social interactions and, more specifically, which social cognitive processes are affected, and what functional roles does the body serve? Such a framework of the fundamental roles of embodiment in social interaction is developed by Lindblom (2006, 2007). The developed framework explains and illustrates how embodiment is part and parcel of social interaction and cognition in the most general and specific ways, in which dynamically embodied actions themselves provide meaning and agency. Furthermore, as the multitude of perspectives briefly described here indicates, there is also a need to investigate and analyze the different notions, perspectives and aspects of social embodiment addressed in the area, and disentangle their different roles in social interaction. For the moment, different perspectives on social interaction range from environment-mediated cooperation in social insects over the "expressiveness" of different bodily communication abilities in animals, humans and robots to theory of mind in humans (and to some degree in other primates and robots). But, from a comparative perspective, it still remains unclear to what degree different kinds and levels of embodied social interactions are grounded in bodily and/or cognitive differences (but see Lindblom 2007). 
Last, but not least, it should be noted that investigating the role of embodiment in social interaction and cognition is not just yet another variation of the old philosophical mind-body problem. It also is highly relevant to and has strong implications for the development of interactive technology for human-computer interaction (HCI) and computer-supported cooperative work (CSCW) (cf. also Emmeche this volume). This is due to the fact that with the increasing use of technology in practically all areas of human life, social aspects have become more and more relevant. This is equally true for technology-supported social interaction between humans, i.e., the increasing use of advanced communication technology in social interaction or collaborative work. Moreover, human users interact with technology that increasingly tends to adopt (partly) social modes of interaction, e.g., between humans and humanoid robots (cf. Fong, Nourbakhsh
and Dautenhahn 2003) or so-called "embodied" conversational computer interfaces (e.g., Cassell et al. 2000). Much of this technology is developed to better meet the strong human sensitivity to social stimuli and interactions. However, it has been noted, for example in the area of computer-supported cooperative work, that there is a gap, the so-called social-technical gap, between what technology ideally should support socially and what it currently can support (cf. e.g., Ackerman 2002; Erickson and Kellogg 2002). That is, there still is a stark contrast between our embodied social interactions in the lived human world and the level of support that is offered by contemporary technology. At least partly, this is due to the fact that there is no sufficient understanding of the mechanisms underlying human social interaction which would allow a more seamless use or integration of technology. In particular the role of embodiment in human social interaction is still far too little understood, and current designs of information and communication technology are still dominated by outdated information processing models of human communication. In conclusion, we suggest that further re-evaluation and integration of some of the pre-cognitive-science theories discussed in this chapter, as well as other contributions to this volume, into the framework of modern theories of embodied cognition, along the lines pointed out above, can significantly contribute to both the further theoretical development of embodied cognitive science as such and its capacity to aid the development of future interactive technology.
References
Ackerman, Mark S.
    2002 The intellectual challenge of CSCW: The gap between social requirements and technical feasibility. In: John M. Carroll (ed.), Human-Computer Interaction in the New Millennium, 303-324. New York: Addison Wesley.
Allen, Colin and Marc Bekoff
    1997 Species of Mind: The Philosophy and Biology of Cognitive Ethology. Cambridge, MA: MIT Press.
Arbib, Michael A.
    2005 From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics. Behavioral and Brain Sciences 28: 105-167.
Barbour, Ian G.
    1999 Neuroscience, artificial intelligence and human nature: Theological and philosophical reflections. Zygon 34 (3): 361-398.
Barsalou, Lawrence W., Paula M. Niedenthal, Aron K. Barbey and Jennifer A. Ruppert
    2003 Social embodiment. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 43: 43-92. San Diego, CA: Academic Press.
Blakemore, Sarah-Jayne, Joel Winston and Uta Frith
    2004 Social cognitive neuroscience: Where are we heading? Trends in Cognitive Sciences 8 (5): 216-222.
Boden, Margaret A.
    1994 Piaget. London: Fontana Press.
Breland, Keller and Marian Breland
    1961 The misbehavior of organisms. American Psychologist 16: 661-664.
Brooks, Rodney A.
    1986 Achieving artificial intelligence through building robots. MIT AI Lab Memo 899.
Burgoon, Judee K., David B. Buller and William G. Woodall
    1996 Nonverbal Communication: The Unspoken Dialogue (2nd ed.). New York: McGraw-Hill.
Campos, Joseph J., David I. Anderson, Marianne A. Barbu-Roth, Edward M. Hubbard, Matthew J. Hertenstein and David Witherington
    2000 Travel broadens the mind. Infancy 1 (2): 149-219.
Cassell, Justine, Joseph Sullivan, Scott Prevost and Elizabeth Churchill (eds.)
    2000 Embodied Conversational Agents. Cambridge, MA: MIT Press.
Chrisley, Ron and Tom Ziemke
    2003 Embodiment. In: Encyclopedia of Cognitive Science, 1102-1108. London: Macmillan.
Cisek, Paul
    1999 Beyond the computer metaphor. In: Rafael Núñez and Walter J. Freeman (eds.), Reclaiming Cognition - the Primacy of Action, Intention and Emotion, 125-142. Exeter: Imprint Academic.
Clancey, William J.
    1997 Situated Cognition: On Human Knowledge and Computer Representations. New York: Cambridge University Press.
Clark, Andy
    1997 Being There - Putting Brain, Body and World Together Again. Cambridge, MA: MIT Press.
    1999 An embodied cognitive science? Trends in Cognitive Sciences 3 (9): 345-351.
Cole, Michael and James V. Wertsch
    1996 Beyond the individual-social antimony in discussions of Piaget and Vygotsky. Human Development 39 (5): 250-256.
Cosmides, Leda, John Tooby, Jonathan H. Turner and Boris M. Velichkovsky
    1997 Looking back: Historical context of present practice - biology and psychology. In: Peter Weingart, Sandra D. Mitchell, Peter J. Richerson and Sabine Maasen (eds.), Human by Nature - between Biology and the Social Sciences, 52-64. Mahwah, NJ: Lawrence Erlbaum.
Costall, Alan
    this vol. Bringing the body back to life: James Gibson's ecology of embodied agency.
Damasio, Antonio R.
    1995 Descartes' Error: Emotion, Reason and the Human Brain. New York: Avon Books.
    1999 The Feeling of What Happens: Body and Emotion in the Making of Consciousness. New York: Harcourt Inc.
Darwin, Charles
    1859 The Origin of Species. London: Murray.
    1872 The Expression of Emotions in Man and Animals. London: Murray.
    1880 The Power of Movements in Plants. London: Murray.
Dautenhahn, Kerstin
    1997 I could be you: The phenomenological dimension of social understanding. Cybernetics and Systems 25 (8): 417-453.
Davydov, V. V. and L. A. Radzikhovskii
    1985 Vygotsky's theory and the activity-oriented approach to psychology. In: James V. Wertsch (ed.), Culture, Communication and Cognition, 35-65. New York: Cambridge University Press.
Dewey, John
    1896 The reflex arc concept in psychology. Psychological Review 3: 357-370.
Dreyfus, Hubert L.
    1979 What Computers Can't Do - a Critique of Artificial Reason. New York: Harper and Row.
    2002 Intelligence without representation - Merleau-Ponty's critique of mental representation: The relevance of phenomenology to scientific explanation. Phenomenology and the Cognitive Sciences 1 (4): 367-383.
Emmeche, Claus
    2001 Does a robot have an Umwelt? Reflections of the qualitative biosemiotics of Jakob von Uexküll. Semiotica 134 (1/4): 653-693.
    this vol. On the biosemiotics of embodiment and our human cyborg nature.
Erickson, Thomas and Wendy A. Kellogg
    2002 Social translucence: Designing systems that support social processes. In: John M. Carroll (ed.), Human-Computer Interaction in the New Millennium, 325-330. New York: Addison Wesley.
Farnell, Brenda
    1995 Do You See What I Mean? Plains Indian Sign Talk and the Embodiment of Action. Austin: University of Texas Press.
    1999 Moving bodies, acting selves. Annual Review of Anthropology 28: 341-373.
Freeman, Walter J. and Rafael Núñez
    1999 Restoring to cognition the forgotten primacy of action, intention and emotion. In: Rafael Núñez and Walter J. Freeman (eds.), Reclaiming Cognition - the Primacy of Action, Intention and Emotion, ix-xix. Exeter: Imprint Academic.
Fong, Terrence, Illah Nourbakhsh and Kerstin Dautenhahn
    2003 A survey of socially interactive robots. Robotics and Autonomous Systems 42: 143-166.
Gallagher, Shaun
    this vol. Phenomenological and experimental contributions to understanding embodied experience.
Gallese, Vittorio and Alvin Goldman
    1998 Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences 2 (12): 493-501.
Gallese, Vittorio, Christian Keysers and Giacomo Rizzolatti
    2004 A unifying view of the basis of social cognition. Trends in Cognitive Sciences 8: 396-403.
Gallese, Vittorio, Pier F. Ferrari, Evelyn Kohler and Leonardo Fogassi
    2002 The eyes, the hand and the mind: Behavioral and neurophysical aspects of social cognition. In: Marc Bekoff, Colin Allen and Gordon M. Burghardt (eds.), The Cognitive Animal - Empirical and Theoretical Perspectives on Animal Cognition, 451-461. Cambridge, MA: MIT Press.
Gardner, Howard
    1987 The Mind's New Science. New York: Basic Books.
Garcia, J. and R. A. Koelling
    1966 Relation of cue to consequence in avoidance learning. Psychonomic Science 4: 123-124.
Goldin-Meadow, Susan
    2003 Hearing Gesture: How Our Hands Help Us Think. Cambridge, MA: The Belknap Press of Harvard University Press.
Harnad, Stevan
    1989 Minds, Machines and Searle. Journal of Experimental and Theoretical Artificial Intelligence 1: 5-25.
    1990 The symbol grounding problem. Physica D 42: 335-346.
Heidegger, Martin
    1962 Being and Time. New York: Harper and Row. Originally published in German in 1927.
Hendriks-Jansen, Horst
    1996 Catching Ourselves in the Act: Situated Activity, Interactive Emergence, Evolution and Human Thought. Cambridge, MA: MIT Press.
Hutchins, Edwin
    1995 Cognition in the Wild. Cambridge, MA: MIT Press.
Ingold, Tim
    2000 Evolving skills. In: Hilary Rose and Steven Rose (eds.), Alas, Poor Darwin: Arguments against Evolutionary Psychology, 273-297. New York: Harmony Books.
Johnson, Mark and Tim Rohrer
    this vol. We are live creatures: Embodiment, American pragmatism and the cognitive organism.
Lakoff, George and Mark Johnson
    1999 Philosophy in the Flesh: The Embodied Mind and its Challenges to Western Thought. New York: Basic Books.
Lindblom, Jessica
    2006 Embodied action as a 'helping' hand in social interaction. In: Ron Sun and Naomi Miyake (eds.), Proceedings of the 28th Annual Conference of the Cognitive Science Society, 477-482. Mahwah, NJ: Lawrence Erlbaum.
    2007 Minding the Body: Interacting Socially through Embodied Actions. Doctoral dissertation, University of Linköping, University of Skövde, Sweden.
Lindblom, Jessica and Tom Ziemke
    2003 Social situatedness of natural and artificial intelligence: Vygotsky and beyond. Adaptive Behavior 11 (2): 79-96.
    2005 Embodiment-in-motion: Broadening the social mind. In: Bruno G. Bara, Lawrence Barsalou and Monica Bucciarelli (eds.), Proceedings of the XXVII Annual Conference of the Cognitive Science Society, 1284-1289. Mahwah, NJ: Lawrence Erlbaum.
    2006 The social body in motion: Cognitive development in infants and androids. Connection Science 18 (4): 333-346.
Lloyd, Dan E.
    1989 Simple Minds. Cambridge, MA: MIT Press.
Loeb, Jacques
    1890 Der Heliotropismus der Thiere und seine Übereinstimmung mit dem Heliotropismus der Pflanzen. Würzburg, Germany: Georg Hertz.
    1918 Forced Movements, Tropisms and Animal Conduct. Philadelphia: Lippincott.
Loren, Lewis A. and Eric Dietrich
    1997 Merleau-Ponty, embodied cognition and the problem of intentionality. Cybernetics and Systems 28 (5): 345-358.
Maturana, Humberto and Francisco J. Varela
    1980 Autopoiesis and Cognition: The Realization of the Living. Dordrecht, The Netherlands: D. Reidel Publishing.
    1987 The Tree of Knowledge - the Biological Roots of Human Understanding. Boston: Shambhala.
McFarland, David J.
    1993 Animal Behavior: Psychology, Ethology and Evolution. Harlow: Longman.
Merleau-Ponty, Maurice
    1962 Phenomenology of Perception. London: Routledge and Kegan Paul. Originally published in French in 1945.
    1963 The Structure of Behavior. Boston, MA: Beacon Press. Originally published in French in 1942.
    1969 The Visible and the Invisible. Evanston: Northwestern University Press. Originally published in French in 1964.
Miller, Patricia H.
    1983 Theories of Developmental Psychology. New York: W. H. Freeman and Company.
Mingers, John
    2001 Embodying information systems: The contribution of phenomenology. Information and Organization 11: 103-128.
Minsky, Marvin
    1975 A framework for representing knowledge. In: Patrick Winston (ed.), The Psychology of Computer Vision, 211-277. McGraw-Hill.
Neisser, Ulric
    1976 Cognition and Reality: Principles and Implications of Cognitive Psychology. San Francisco: W. H. Freeman.
Neuman, Yair
    2001 On Turing's carnal error: Some guidelines for a contextual inquiry into the embodied mind. Systems Research and Behavioural Sciences 18: 557-564.
Pfeifer, Rolf and Christian Scheier
    1999 Understanding Intelligence. Cambridge, MA: MIT Press.
Piaget, Jean
    1952 The Origin of Intelligence in the Child. New York: Basic Books. Originally published in French in 1936.
    1995 Sociological Studies. London: Routledge.
Prem, Erich
    1997 Epistemic autonomy in models of living systems. In: Phil Husbands and Inman Harvey (eds.), Fourth European Conference on Artificial Life, 48-73. Cambridge, MA: MIT Press.
Priest, Stephen
    1998 Merleau-Ponty. London: Routledge.
Putnam, Hilary
    1975 Philosophy and our mental life. In: Hilary Putnam (ed.), Mind, Language and Rationality, 48-73. Cambridge: Cambridge University Press.
Riegler, Alexander
    2002 When is a cognitive system embodied? Cognitive Systems Research 3 (3): 339-348.
Rizzolatti, Giacomo and Michael A. Arbib
    1998 Language within our grasp. Trends in Neurosciences 21: 188-194.
Rizzolatti, Giacomo, Luciano Fadiga, Leonardo Fogassi and Vittorio Gallese
    2002 From mirror neurons to imitation: Facts and speculations. In: Andrew Meltzoff and Wolfgang Prinz (eds.), The Imitative Mind: Development, Evolution and Brain Bases, 247-266. Cambridge, MA: Cambridge University Press.
Rogoff, Barbara
    2003 The Cultural Nature of Human Development. New York: Oxford University Press.
Rohrer, Tim
    this vol. The body in space: Dimensions of embodiment.
Romanes, George
    1882 Animal Intelligence. London: Kegan Paul.
Roth, Wolff-Michael
    this vol. Communication as situated, embodied practice.
Sapir, Edward
    1928 The unconscious patterning of behavior in society. In: E. S. Dummer (ed.), The Unconscious, 114-142. New York: Knopf.
Searle, John
    1980 Minds, brains and programs. Behavioral and Brain Sciences 3: 417-457.
Semin, Gün R. and Eliot R. Smith
    2002 Interfaces of social psychology and situated and embodied cognition. Cognitive Systems Research 3 (3): 385-396.
Embodiment and social interaction: A cognitive science perspective
161
Sharkey, Noel E. and Tom Ziemke 1998 A consideration of the biological and psychological foundations of autonomous robotics. Connection Science 10 (3-4): 361-391. Sheets-Johnstone, Maxine 1999 The Primacy ofMovements. Amsterdam: John Benjamins. Sherrington, Charles S. 1906 The Integrative Action of the Nervous System. New York: C. Scribner's Sons. Sinha, Chris and Kristine Jensen de Lopez 2000 Language, culture and the embodiment of spatial cognition. Cognitive Linguistics 11: 17-41. Sparks, John 1982 The Discovery ofAnimal Behaviour. London: Collins. Svensson, Hemik, Jessica Lindblom and Tom Ziemke this vol. Making sense of embodied cognition: simulation theories of shared neural mechanisms for sensorimotor and cognitive processes. Thompson, Evan and Francisco J. Varela 2001 Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences 5: 418-425. Tomasello, Michael 1999 The Cultural Origins ofHuman Cognition. Cambridge, MA: Harvard University Press. Trevarthen, Colwyn 1977 Descriptive analysis of infant communicative behaviour. In: H. R. Schaffer (ed.), Studies in Mother-Infant Interaction Proceedings of Loch Lomond Symposium, 227-270. New York: Academic Press. Varela, Charles R. 1992 Harre and Merleau-Ponty: beyond the absent moving body in embodied social theory. Journal for the Theory ofSocial Behaviour 24 (2): 167-185. 1995 Cartesianism revisited: The ghost in the moving machine or the lived body. In: Brenda Famell (ed.), Human Action Sign in Cultural Contexts: the Visible and Invisible in Movement and Dance, 216-293. New Jersey: Scarecrow Press. Varela, Francisco J., Evan Thompson and Eleanor Rosch 1991 The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press. von Glasersfeld, Emst 1995 Radical Constructivism - a Way ofLearning and Knowing. London: Falmer Press.
162
Jessica Lindblom and Tom Ziemke
von Sachs, Julius 1882 Vorlesungen aber Pflanzenphysiologie. Leipzig, Germany: W. Engelmann. von Uexkiill, Jakob 1928 Theoretische Biologie. Berlin: Springer. 1957 A stroll through the worlds of animal and men a picture book of invisible worlds. In: Claire H. Schiller (ed.), Instinctive BehaviourThe Development of a Modern Concept, 5-80. New York: International Universities Press. Also appeared in 1992 in Semiotica 89(4): 319-391. Originally published in German in 1934. Vygotsky, Lev S. 1978 The Mind in Society: The Development ofHigher Mental Processes. Cambridge, MA: MIT Press. Original work published in Russian in 1934. Wadsworth, Barry J. 1996 Piaget's Theory of Cognitive and Affective Development (5 th ed.). New York: Longman Publishers. Wertsch, James V. (ed.) 1985 Vygotsky and the Social Formation of Mind. Cambridge, MA: Harvard University Press. Wilson, Edward O. 1975 Sociobiology - the New Synthesis. Cambridge, MA: Harvard University Press. Wilson, Margaret 2002 Six views of embodied cognition. Psychonomic Bulletin and Review 9 (4): 625-636. Ziemke, Tom 2001 a Are robots embodied? In: Christian Balkenius et al. (eds), Proceedings of the First International Workshop on Epigenetic Robotics: Modelling Cognitive Development in Robotic Systems, 75-83. Lund, Sweden: Lund University Cognitive Series, vol, 85. 2001 b The construction of 'reality' in the robot. Foundations of Science 6 (1): 163-233. 2002 Introduction to the special issue on situated and embodied cognition. Cognitive Systems Research 3 (3): 271-274. What's that thing called embodiment? In: Richard Alterman and 2003 David Kirsh (eds), Proceedings of the 25 th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum. 2004 Embodied AI as science: Models of embodied cognition, embodied models of cognition, or both? In: Fumiya Iida et al. (eds), Embodied AI, 27-36. Heidelberg: Springer Verlag.
Embodiment and social interaction: A cognitive science perspective
163
Ziemke, Tom and Noel E. Sharkey 2000 A stroll through the worlds of robots and animals: Applying Jakob von Uexkiill's theory of meaning to adaptive robots and artificial life. Semiotica 134 (1-4): 701-746. Ziemke, Tom and Roslyn M. Frank this vol. Introduction: The body eclectic.
Section B
Body and mind
Representing actions and functional properties in conceptual spaces
Peter Gärdenfors
Abstract
The book Conceptual Spaces (Gärdenfors 2000) presents a theory of concepts based on geometrical and topological structures in spaces that are built up from "quality dimensions". Most of the examples in the book deal with perceptual concepts based on dimensions such as colour, size, shape and sound. However, many of our everyday concepts are based on actions and functional properties. For instance, most artefacts, such as chairs, clocks and telephones, are categorized on the basis of their functional properties. After giving a general presentation of conceptual spaces, I suggest how the analysis in terms of conceptual spaces can be extended to actions and functional concepts. Firstly, I will argue that "action space" can, in principle, be analysed in the same way as e.g. colour space or shape space. One hypothesis is that our categorization of actions to a large extent depends on our embodied "perception" of forces. In line with this, an action will be described as a spatio-temporal pattern of forces. When it comes to functional properties, the key idea is that the function of an object can be analysed with the aid of the actions it affords. Functional concepts can then be described as convex regions in an appropriate action space. Within Cognitive Semantics, image schemas are mainly based on perceptual and spatial dimensions (e.g. Langacker 1987; Lakoff 1987). Two exceptions are Johnson's (1987) and Talmy's (1988) work on "force dynamics", which shows the importance of forces, and metaphorical uses of forces, for the semantics of many kinds of linguistic expression. I shall argue that a more developed understanding of "action space" would allow us to extend the semantic analyses pioneered by Johnson and Talmy. In particular, I shall make a distinction between first-person and third-person perspectives on "forces". My hypothesis is that we start out from an embodied notion of force or "power" that is then extended to forces that are exerted by other individuals and to forces that act on objects outside our control.
Keywords: conceptual spaces, force dynamics, image schemas.
1. The problem of modelling concepts
A central problem for cognitive science is how representations should be modelled. There are currently two dominating approaches to this problem. The symbolic approach starts from the assumption that cognitive systems can be described as Turing machines. On this view, cognition is seen as essentially being computation involving symbol manipulation (e.g. Fodor 1975; Pylyshyn 1984; Pinker 1997). The second approach is associationism, where associations between different kinds of information elements carry the main burden of representation. Connectionism is a special case of associationism that models associations using artificial neuron networks (e.g. Rumelhart and McClelland 1986; Quinlan 1991). Both the symbolic and the associationist approaches have their advantages and disadvantages. They are often presented as competing paradigms, but since they are used to analyse cognitive problems on different levels of granularity, they should rather be seen as complementary methodologies. However, there are several aspects of concept formation for which neither symbolic representation nor connectionism seems to offer appropriate modelling tools. In this chapter, I will advocate a third way to represent information that is based on using geometrical structures rather than symbols or connections between neurons. Using these structures, similarity relations can be modelled in a way that accords well with human (and animal) judgments. The notion of similarity is crucial for the understanding of many cognitive phenomena. I shall call this way of representing information the conceptual form since I believe that such representations can account for more of the essential aspects of human concept formation than symbolic or connectionist theories. Based on my recent book (Gärdenfors 2000), I shall first present a theory of conceptual spaces as a particular framework for representing information on the conceptual level.
A conceptual space is built up from geometrical representations based on a number of quality dimensions. Most of the examples I discussed in my book deal with perceptual concepts based on dimensions such as colour, size, shape and sound. However, there is strong evidence that many of our everyday concepts are based on actions and functional properties. For instance, most artefacts, such as chairs, clocks and telephones, are categorized on the basis of their functional properties (Nelson 1986; Mandler 2004). In this chapter, I shall outline how the analysis in terms of conceptual spaces can be extended to functional concepts. Firstly, I will argue that
"action space" can, in principle, be analysed in the same way as e.g. colour space or shape space. One hypothesis is that our categorization of actions to a large extent depends on our "perception" of forces. In line with this, an action will be described as a spatio-temporal pattern of forces. I shall also argue that the most cognitively fundamental forces are those that act upon or emanate from one's own body. In this sense my analysis will be based on an embodied perspective. When it comes to functional properties, the key idea is that the function of an object can be analysed with the aid of the actions it affords. Functional concepts can then be described as convex regions in an appropriate action space. I shall outline a research programme indicating that action space should be seen as a special case of a conceptual space.
2. Quality dimensions
As introductory examples of quality dimensions one can mention temperature, weight, brightness, pitch and the three ordinary spatial dimensions height, width and depth. I have chosen these examples because they are closely connected to what is produced by our sensory receptors (Schiffman 1982). The spatial dimensions of height, width and depth as well as brightness are perceived by the visual sensory system, pitch by the auditory system, temperature by thermal sensors and weight, finally, by the kinaesthetic sensors. However, there are also quality dimensions that are of an abstract, non-sensory character, and one aim of this chapter is to argue that force dimensions are important for the analysis of action concepts and functional categories. Quality dimensions correspond to the different ways stimuli are judged to be similar or different. In most cases, judgments of similarity and difference generate an ordering relation of stimuli (Clark 1993: 114). For example, one can judge tones by their pitch, which will generate an ordering of the perceptions. The general assumption is that the smaller the distance is between the representations of two objects, the more similar they are. In this way, the similarity of two objects can be defined via the distance between their representing points in the space. The dimensions form the "framework" used to assign properties to objects and to specify relations between them. The coordinates of a point within a conceptual space represent particular instances of each dimension, for example, a particular temperature, a particular weight, etc.
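The idea that similarity is defined via distance can be made concrete with a minimal Python sketch. The text only requires that similarity decreases as distance grows; the weighted Euclidean metric, the exponential decay and the example stimuli below are illustrative assumptions, not part of the theory as stated here.

```python
import math

def distance(x, y, weights=None):
    """Weighted Euclidean distance between two points of a conceptual space.

    The choice of a Euclidean metric is an assumption made for illustration.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    if weights is None:
        weights = [1.0] * len(x)
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def similarity(x, y, c=1.0, weights=None):
    """Similarity as a decreasing function of distance.

    Exponential decay with sensitivity c is one common modelling choice;
    any monotonically decreasing function would fit the general assumption.
    """
    return math.exp(-c * distance(x, y, weights))

# Hypothetical stimuli described on two quality dimensions:
# (temperature in degrees Celsius, weight in kilograms).
tea = (60.0, 0.3)
coffee = (65.0, 0.3)
ice_cream = (-5.0, 0.2)

# The closer pair comes out as the more similar pair.
print(similarity(tea, coffee) > similarity(tea, ice_cream))  # True
```

The optional weights stand in for the varying salience of dimensions in a given context; they are likewise a hypothetical parameter.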
The notion of a dimension should be understood literally. It is assumed that each of the quality dimensions is endowed with certain geometrical structures (in some cases topological structures or orderings). As a first example, Figure 1 illustrates such a structure: the dimension of "weight", which is one-dimensional with a zero point and thus isomorphic to the half-line of non-negative numbers. A basic constraint on this dimension, commonly made in science, is that there are no negative weights.
Figure 1. The weight dimension.
A psychologically interesting example of a domain involves colour perception. In brief, our cognitive representation of colour can be described by three dimensions. The first dimension is hue, which is represented by the familiar colour circle going from red via yellow to green and to blue and then back to red again. The topological structure of this dimension is thus different from the quality dimensions representing time or weight, which are isomorphic to the real line. The second psychological dimension of colour is saturation, which ranges from grey (zero colour intensity) to increasingly greater intensities. This dimension is isomorphic to an interval of the real line. The third dimension is brightness, which varies from white to black and is thus a linear dimension with end points. Together, these three dimensions, one with circular structure and two with linear, constitute the colour domain, which is a subspace of our perceptual conceptual space. This domain is often illustrated by the so-called colour spindle (see Figure 2). Brightness is shown on the vertical axis. Saturation is represented as the distance from the centre of the spindle. Hue, finally, is represented by the positions along the perimeter of the central circle. The circle at the centre of the spindle is tilted so that the distance between yellow and white is smaller than the distance between blue and white.
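The geometry of the colour domain can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: hue is coded as an angle in degrees, saturation as a radius, brightness as a height, and the tilt of the central circle is ignored. The function names are hypothetical.

```python
import math

def hue_distance(h1, h2):
    """Shortest angular distance (degrees) between two hues on the colour circle.

    Because hue is circular, 350 and 10 degrees are only 20 degrees apart.
    """
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)

def to_cartesian(hue_deg, saturation, brightness):
    """Place a colour in a cylinder-like solid.

    Hue is the angle, saturation the distance from the central axis and
    brightness the height. The tilt of the central circle is not modelled.
    """
    angle = math.radians(hue_deg)
    return (saturation * math.cos(angle),
            saturation * math.sin(angle),
            brightness)

def colour_distance(c1, c2):
    """Euclidean distance between two (hue, saturation, brightness) triples."""
    return math.dist(to_cartesian(*c1), to_cartesian(*c2))

# At zero saturation (grey) the hue value makes no difference.
print(colour_distance((0.0, 0.0, 0.5), (200.0, 0.0, 0.5)))  # 0.0
```

Treating hue as an angle is what gives this dimension a topology different from that of the linear saturation and brightness dimensions.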
Figure 2. The colour spindle.
A conceptual space can now be defined as a collection of quality dimensions. However, the dimensions of a conceptual space should not be seen as totally independent entities; rather, they are correlated in various ways since the properties of the objects modelled in the space co-vary. For example, in the domain of fruits the ripeness and the colour dimensions co-vary. It is impossible to provide a complete list of the quality dimensions involved in the conceptual spaces of humans. Some of the dimensions seem to be innate and to some extent hardwired in our nervous system, as, for example, colour, pitch, force and probably also ordinary space. Other dimensions are presumably learned. Learning new concepts often involves expanding one's conceptual space with new quality dimensions (Smith 1989). Two-year-olds can represent whole objects, but they cannot reason about the dimensions of the object. Goldstone and Barsalou (1998: 252) note: Evidence suggests that dimensions that are easily separated by adults, such as the brightness and size of a square, are treated as fused together for children [...]. For example, children have difficulty identifying whether two objects differ on their brightness or size even though they can easily see that
they differ in some way. Both differentiation and dimensionalization occur throughout one's lifetime.
Still other dimensions may be culturally dependent. Finally, some quality dimensions are introduced by science. Witness, for example, Newton's distinction between weight and mass, which is of pivotal importance for the development of his celestial mechanics, but which has hardly any correspondence in human perception. To the extent that we have mental representations of the masses of objects in distinction to their weights, these are not given by the senses but have to be learned by adopting the conceptual space of Newtonian mechanics in our representations. In order to separate different uses of quality dimensions it is important to introduce a distinction between a psychological and a scientific interpretation. The psychological interpretation concerns the cognitive structures (perceptions, memories, etc.) of human beings and other organisms. The scientific interpretation, on the other hand, treats dimensions as a part of a scientific theory. The distinction is relevant when the dimensions are seen as cognitive (psychological) entities, in which case their structure should not be determined by scientific theories which attempt to give a "realistic" description of the world, but by psychophysical measurements that determine how our concepts are represented. The conceptual space of Newtonian particle mechanics is, of course, based on scientific (theoretical) quality dimensions and not on psychological dimensions. The quality dimensions of this theory are ordinary space (3-D Euclidean), time (isomorphic to the real numbers), mass (isomorphic to the non-negative real numbers), and force (3-D Euclidean space). In this theory, an object is thus represented as a point in an 8-dimensional space. Once a particle has been assigned a value for these eight dimensions, it is fully described as far as Newtonian mechanics is concerned. I want to make it clear that the dimensions I consider in my analysis of concepts should be given the psychological interpretation.
This applies in particular to the dimension of "force" that will be analysed in the latter sections of this chapter (5-8). A problem for my distinction may be that in Western cultures, the psychological concept of "force" has been tainted by the Newtonian world-view. I will return to this topic in Section 7.
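As an illustration of the scientific interpretation, the eight Newtonian quality dimensions listed above can be written down as a small data structure. This is only a sketch; the class name, field names and units are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ParticleState:
    """A particle as a point in the 8-dimensional Newtonian conceptual space."""
    position: Tuple[float, float, float]  # ordinary space, 3-D Euclidean
    time: float                           # isomorphic to the real numbers
    mass: float                           # non-negative real numbers
    force: Tuple[float, float, float]     # 3-D Euclidean space

    def __post_init__(self):
        if len(self.position) != 3 or len(self.force) != 3:
            raise ValueError("position and force must each have three coordinates")
        if self.mass < 0:
            raise ValueError("mass must be non-negative")

    def as_point(self):
        """Flatten the state into a single 8-tuple (3 + 1 + 1 + 3 coordinates)."""
        return (*self.position, self.time, self.mass, *self.force)

# Hypothetical example values.
particle = ParticleState(position=(0.0, 1.0, 2.0), time=3.0, mass=4.0,
                         force=(5.0, 6.0, 7.0))
print(len(particle.as_point()))  # 8
```

The validation of the mass field mirrors the structural constraint on that dimension: it is isomorphic to the non-negative reals, so negative values are not points of the space.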
3. Concept formation described with the aid of conceptual spaces
The purpose of this section is to show how conceptual spaces can be used to model concepts. I will focus on concepts that are "natural" in the sense that they can, in principle, be learned without relying on linguistic descriptions and, when described, have simple expressions in most languages. A first rough idea is to describe a natural concept as a region of a conceptual space S, where "region" should be understood as a spatial notion determined by the topology and metric of S. For example, the point in the time dimension representing "now" divides this dimension, and thus the space of vectors, into two regions corresponding to the concepts "past" and "future". But the proposal suffers from a lack of precision as regards the notion of a "region". A more precise and powerful idea is the following criterion where the geometric characteristics of the quality dimensions are utilized to introduce a spatial structure on concepts:
Criterion C: A "natural concept" is a convex region of a conceptual space.
A convex region is characterized by the criterion that for every pair of points v1 and v2 in the region, all points in between v1 and v2 are also in the region. The motivation for the criterion is that if some objects which are located at v1 and v2 in relation to some quality dimension (or several dimensions) both are examples of the concept C, then any object that is located between v1 and v2 on the quality dimension(s) will also be an example of C. Criterion C presumes that the notion of betweenness is meaningful for the relevant quality dimensions. This is, however, a rather weak assumption which demands very little of the underlying dimensional structure. Most concepts expressed by basic words in natural languages are natural concepts in the sense specified here. For instance, I conjecture that all colour terms in natural languages express natural concepts with respect to the psychological representation of the three colour dimensions. In other words, the conjecture predicts that if some object o1 is described by the colour term C in a given language and another object o2 is also said to have colour C, then any object o3 with a colour that lies between the colour of o1 and that of o2 will also be described by the colour term C. It is well-known that different languages carve up the colour circle in different ways, but all carvings seem to be done in terms of convex sets. Strong support for this conjecture has been presented by Sivik and Taft (1994). Their study
can be seen as a follow-up of the investigations of basic colour terms by Berlin and Kay (1969), who compared and systematized colour terms from a wide variety of languages. Sivik and Taft (1994) focused on Swedish colour terms, while Taft and Sivik (1997) compared colour terms from Swedish, Polish, Spanish and American English. On the other hand, the reference of an artificial colour term like "grue" (Goodman 1955) will not be a convex region in the ordinary conceptual space and thus it is not a natural concept according to Criterion C.¹
Another illustration of how the convexity of regions determines concepts and categorizations is the phonetic identification of vowels in various languages. According to phonetic theory, what determines the quality of a vowel are the relations between the basic frequency of the sound and its formants (higher frequencies that are present at the same time). In general, the first two formants F1 and F2 are sufficient to identify a vowel. This means that the coordinates of the two-dimensional space spanned by F1 and F2 (in relation to a fixed fundamental frequency F0) can be used as a fairly accurate description of a vowel. Fairbanks and Grubb (1961) investigated how people produce and recognize vowels in "General American" speech. Figure 3 summarizes some of their findings. As can be seen from the diagram, the preferred, identified and self-approved examples of different vowels form convex sub-regions of the space determined by F1 and F2 with the given scales.² As in the case of colour terms, different languages carve up the phonetic space in different ways (the number of vowels identified in different languages varies considerably), but I conjecture again that each vowel in a language will correspond to a convex region of the formant space.
Criterion C provides an account of concepts that satisfies the desideratum, formulated by Stalnaker (1981: 347), that a concept "[...] must be not just a rule for grouping individuals, but a feature of individuals in virtue of which they may be grouped". However, it should be emphasized that I only view the criterion as a necessary but perhaps not sufficient condition on a natural concept. The criterion delimits the class of concepts that are useful for cognitive purposes, although it may not be sufficiently restrictive.
1. For an extended analysis of this example, see Gärdenfors (1990).
2. A self-approved vowel is one that was produced by the speaker and later approved of as an example of the intended kind. An identified sample of a vowel is one that was correctly identified by 75% of the observers. The preferred samples of a vowel are those which are "the most representative samples from among the most readily identified samples" (Fairbanks and Grubb 1961: 210).
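Criterion C can be illustrated with a small numerical check. The Python sketch below assumes a space with linear dimensions, so that "between" means "on the straight line segment joining two points"; a circular dimension such as hue would need a different notion of betweenness. The function names and the two example regions are hypothetical. Note that a finite check of this kind can refute convexity but cannot prove it.

```python
import itertools

def between(v1, v2, t):
    """Point at fraction t (0 < t < 1) of the way from v1 to v2."""
    return tuple(a + t * (b - a) for a, b in zip(v1, v2))

def find_convexity_violation(examples, contains, steps=10):
    """Search for a counterexample to convexity.

    examples: points known to fall under the concept.
    contains: predicate telling whether a point lies in the candidate region.
    Returns (v1, v2, v) where v lies between v1 and v2 but outside the region,
    or None if no violation is found among the sampled points.
    """
    for v1, v2 in itertools.combinations(examples, 2):
        for k in range(1, steps):
            v = between(v1, v2, k / steps)
            if not contains(v):
                return v1, v2, v
    return None

def in_disk(p):
    """A convex candidate region: the unit disk."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def in_ring(p):
    """A non-convex candidate region: a ring with a hole in the middle."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0

examples = [(-0.8, 0.0), (0.8, 0.0), (0.0, 0.8)]
print(find_convexity_violation(examples, in_disk) is None)  # True
print(find_convexity_violation(examples, in_ring) is None)  # False
```

The ring fails because points between two of its members fall into the hole, which is the kind of structure that Criterion C rules out for natural concepts.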
Figure 3. The vowel space of American English (from Fairbanks and Grubb 1961). The scales of the abscissa and ordinate are the logarithms of the frequencies of F1 and F2 (the basic frequency of the vowels was 130 cps).

4. Relations to prototype theory
Describing concepts as convex regions of conceptual spaces fits very well with the so-called prototype theory of categorization developed by Rosch and her collaborators (Rosch 1975, 1978; Mervis and Rosch 1981; see also Lakoff 1987). The main idea of prototype theory is that within a category of objects, like those instantiating a concept, certain members are judged to be more representative of the category than others. For example, robins are judged to be more representative of the category "bird" than are ravens, penguins and emus; and desk chairs are more typical instances of the category "chair" than rocking chairs, deck-chairs, and beanbag chairs. The most representative members of a category are called prototypical members. It is well-known that some concepts, like "red" and "bald", have no sharp boundaries and for these it is perhaps not surprising that one finds prototypical effects. However, these effects have been found for most concepts, including those with comparatively clear boundaries like "bird" and "chair". In traditional philosophical analyses of concepts based on truth-conditions or functions from possible worlds to extensions (Montague 1974), it is very difficult to explain such prototype effects (see Gärdenfors 2000, section 3.3).³ Either an object is a member of the class assigned to a concept (relative to a given possible world) or it is not, and all members of the class have equal status as category members. Rosch's research has been aimed at showing asymmetries among category members and asymmetric structures within categories. Since the traditional definition of a concept neither predicts nor explains such asymmetries, something else must be going on. In contrast, if concepts are described as convex regions of a conceptual space, prototype effects are indeed to be expected. In a convex region, one can describe positions as being more or less central. For example, if colour concepts are identified with convex subsets of the colour space, the central points of these regions would be the most prototypical examples of the colour. In a series of experiments, Rosch has been able to demonstrate the psychological reality of such "focal" colours. For another illustration, we can return to the categorization of vowels presented in the previous section. Here the subjects' different kinds of responses show clear prototype effects. For more complex categories like "bird" it is perhaps more difficult to describe the underlying conceptual space.
However, if something like Marr and Nishihara's (1978) analysis of shapes is adopted, we can begin to see how such a space would appear.⁴ Their scheme for describing biological forms uses hierarchies of cylinder-like modelling primitives. Each cylinder is described by two coordinates (length and width). Cylinders are combined by determining the angle between the dominating cylinder and the added one (two polar coordinates) and the position of the added cylinder in relation to the dominating one (two coordinates). The details of the representation are not important in the present context, but it is worth noting that on each level of the hierarchy an object is described by a comparatively small number of coordinates based on lengths and angles. Hence, the object can be identified as a hierarchically structured vector in a (higher order) conceptual space. Figure 4 provides an illustration of the hierarchical structure of their representations. It should be noticed that this representation of animal concepts is purely shape-based. Animal concepts depend on many other domains, some of which may be of the functional character that will be analysed in Section 6.
3. Indeed, the approach to semantics in truth-functional semantics is antipsychological in the sense that the goal is to provide an analysis of the meaning of words and sentences that is independent of human cognition.
4. This analysis is expanded in Marr (1982, Ch. 5). A related model, together with some psychological grounding, is presented by Biederman (1987).
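The cylinder scheme described above lends itself to a simple recursive data structure. The following Python sketch is a loose illustration rather than Marr and Nishihara's own formalism: the class name, the field names and all numerical values are hypothetical, and only the coordinate counts (two for the cylinder itself, two for the angle, two for the position) are taken from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Cylinder:
    """One cylinder-like primitive in a shape hierarchy."""
    name: str
    length: float
    width: float
    # Relation to the dominating (parent) cylinder; left at zero for the root.
    angle: Tuple[float, float] = (0.0, 0.0)     # two polar coordinates
    position: Tuple[float, float] = (0.0, 0.0)  # attachment point, two coordinates
    parts: List["Cylinder"] = field(default_factory=list)

    def coordinates(self):
        """The six coordinates describing this cylinder on its own level."""
        return (self.length, self.width, *self.angle, *self.position)

    def as_vector(self):
        """Hierarchically structured vector: own coordinates, then those of the parts."""
        return (self.coordinates(), [part.as_vector() for part in self.parts])

# A hypothetical, very coarse biped: a torso with two legs attached.
left_leg = Cylinder("left leg", 0.9, 0.15, angle=(170.0, 0.0), position=(0.0, -0.1))
right_leg = Cylinder("right leg", 0.9, 0.15, angle=(170.0, 0.0), position=(0.0, 0.1))
biped = Cylinder("torso", 1.0, 0.4, parts=[left_leg, right_leg])

print(len(biped.coordinates()))  # 6
```

Each level of the hierarchy contributes only a handful of numbers, which is the property the text singles out: the whole object is a nested vector built from lengths and angles.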
Figure 4. A first-order approximation of shape space (from Marr and Nishihara 1978). Labels in the figure: cylinder, limb, thick limb, thin limb, biped, human, ape, bird, ostrich, dove.
Even if different members of a category are judged to be more or less prototypical, it does not follow that some of the existing objects must represent "the prototype". If a concept is viewed as a convex region of a conceptual space, this is easily explained, since the central member of the region (if unique) is a possible individual in the sense discussed above (if all its dimensions are specified) although it need not be among the existing members of the category. Such a prototype point in the region need not be completely described, but is normally represented as a partial vector, where only the values of the dimensions that are relevant to the concept have been determined. For example, the general shape of the prototypical bird would be included in the vector, while its colour or age would presumably not. It is possible to argue in the converse direction too and show that if prototype theory is adopted, then the representation of concepts as convex regions is to be expected. Assume that some quality dimensions of a conceptual space are given, for example, the dimensions of colour space described above, and that we want to partition it into a number of categories, for example, colour categories. If we start from a set of prototypes p1, ..., pn of the categories, for example, the focal colours, then these should be the central points in the categories they represent. One way of using this information is to assume that for every point p in the space one can measure the distance from p to each of the pi's, that is, that the space is metric. If we now stipulate that p belongs to the same category as the closest prototype pi, it can be shown that this rule will generate a partitioning of the space that consists of convex areas (convexity is here defined in terms of an assumed distance measure). This is the so-called Voronoi tessellation, a two-dimensional example of which is illustrated in Figure 5.
Thus, assuming that a metric is defined on the subspace that is subject to categorization, by this method a set of prototypes will generate a unique partitioning of the subspace into convex regions. Hence there is an intimate link between prototype theory and the proposed analysis where concepts are described as convex regions in a conceptual space.
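The categorization rule behind the Voronoi tessellation (assign each point to the closest prototype) can be sketched in a few lines of Python. The Euclidean metric, the six prototype locations and the labels are assumptions chosen for the example.

```python
import math

# Six hypothetical prototypes in a two-dimensional space.
PROTOTYPES = {
    "A": (0.0, 0.0),
    "B": (4.0, 0.0),
    "C": (2.0, 3.0),
    "D": (0.0, 6.0),
    "E": (4.0, 6.0),
    "F": (2.0, 9.0),
}

def categorize(p, prototypes=PROTOTYPES):
    """Return the label of the prototype closest to p (Euclidean metric).

    Points exactly equidistant from two prototypes lie on a cell boundary;
    this sketch simply returns the first of the tied labels.
    """
    return min(prototypes, key=lambda label: math.dist(p, prototypes[label]))

def midpoint(p, q):
    """The point halfway between p and q."""
    return tuple((a + b) / 2.0 for a, b in zip(p, q))

a, b = (0.5, 0.5), (-1.0, 1.0)
# Both points fall in the cell of prototype A, and so does the point between them.
print(categorize(a), categorize(b), categorize(midpoint(a, b)))  # A A A
```

With a Euclidean metric each cell is an intersection of half-planes, which is why the regions produced by this rule are convex.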
Figure 5. Voronoi tessellation based on six prototypes.

5. Representing actions by forces
So far, the examples have all been of a static nature where the properties modelled are not dependent on the time dimension. However, it is obvious that a considerable part of our cognitive representations concerns dynamic properties (see, for example, van Gelder 1995; Port and van Gelder 1995).⁵ If we, for the moment, consider what is represented in natural languages, verbs normally express dynamic properties of objects, in particular actions. Such dynamic properties can also be judged with respect to similarities: "walking" is more similar to "running" than to "throwing". An important question is how the meaning of such verbs can be expressed with the aid of conceptual spaces. One idea comes from Marr and Vaina (1982), who extend Marr and Nishihara's (1978) cylinder models to an analysis of actions. In Marr and Vaina's model an action is described via differential equations for movements of the body parts of, say, a walking human (see Figure 6).⁶
5. To be accurate, van Gelder and his affiliates would avoid using the notion of representation since they associate this with the symbolic approach to cognition. See also the discussion in Johnson and Rohrer (this volume).
6. More precisely, Marr and Vaina (1982) only use differential inequalities, for example, expressing that the derivative of the position of the upper part of the right leg is positive in the forward direction during a particular phase of the walking cycle.
Figure 6. "Walking" represented by cylinder figures and differential equations (from Marr and Vaina 1982).
Applying Newtonian mechanics, it is clear that these equations can be derived from the forces that are applied to the legs, arms, and other moving parts of the body. Even though our cognition may not be built precisely for Newtonian mechanics, it appears that our brains have evolved the capacity for extracting the forces that lie behind different kinds of movements and action (see below). In accordance with this, I submit that the fundamental cognitive representation of an action consists of the pattern offorces that generates it. However, it should be emphasized that the "forces" represented by the brain are psychological constructs and not the scientific dimension introduced by Newton. The patterns of forces are thus embodied and they can be seen as a form of "mimetic schemas" as discussed by Zlatev (this volume). Such patterns can be represented in principally the same way as the patterns of shapes are described above. For example, the force pattern involved in movements when somebody runs is different from
Representing actions and functional properties in conceptual spaces
the pattern of a person walking; and the force pattern for saluting is different from that of throwing (Vaina and Bennour 1985). There is, so far, not very much direct empirical evidence for this representational hypothesis. However, one interesting example comes from phonetics.7 Fujisaki (1992) has developed a theory of how the fundamental frequency F0 in speech is generated. He treats the F0 contour as generated from a linear superposition of two force dimensions that are called phrase and accent commands. The phrase command acts over the intonation phrase, shaped as an initial rise followed by a long fall to an asymptote line. This is generated by a phrase control mechanism, activated by a pulse command with varying magnitude (see Figure 7). The accent command is a local peak on an accented syllable, generated by the accent control mechanism. The two force dimensions are implemented as muscular control of the larynx. On this approach, speech is analysed as a special form of action. In the left part of Figure 7, the two force dimensions are represented on a time scale, where the spurts on the phrase command and accent command dimensions result in the F0 curve represented in the right part of the figure.
Figure 7. Functional model based on two force dimensions for generating the F0 contour (from Fujisaki and Ohno 1996). [Figure labels: phrase commands; ln F0(t), fundamental frequency contour.]
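The superposition described above can be sketched numerically. The following is a minimal illustration, assuming the response functions commonly given for Fujisaki's model (a critically damped impulse response for phrase commands and a ceiling-limited step response for accent commands). The parameter values, the command timings and all function names are illustrative assumptions, not values taken from Fujisaki (1992).

```python
import math

def phrase_response(t, alpha):
    """Phrase control response to a pulse: a quick rise, then a long fall towards zero."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def accent_response(t, beta, gamma):
    """Accent control response to a step onset, capped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def f0(t, base_hz, phrase_cmds, accent_cmds, alpha=3.0, beta=20.0, gamma=0.9):
    """F0 in Hz at time t (seconds), built by linear superposition in the log domain.

    phrase_cmds: list of (onset, magnitude) pulses.
    accent_cmds: list of (onset, offset, amplitude) steps.
    alpha, beta, gamma are assumed, illustrative constants.
    """
    log_f0 = math.log(base_hz)
    for onset, magnitude in phrase_cmds:
        log_f0 += magnitude * phrase_response(t - onset, alpha)
    for onset, offset, amplitude in accent_cmds:
        log_f0 += amplitude * (accent_response(t - onset, beta, gamma)
                               - accent_response(t - offset, beta, gamma))
    return math.exp(log_f0)

# Hypothetical utterance: one phrase command at 0 s, one accent from 0.3 s to 0.6 s.
phrase_cmds = [(0.0, 0.5)]
accent_cmds = [(0.3, 0.6, 0.4)]
contour = [f0(i * 0.05, 120.0, phrase_cmds, accent_cmds) for i in range(41)]
```

The list `contour` samples the resulting curve every 50 ms. The accent adds a local peak on top of the slowly declining phrase component, which is the qualitative shape the text attributes to the model.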
Another indirect source of empirical support for the representational hypothesis comes from psychophysics. During the 1950s, the Uppsala psychologist Gunnar Johansson developed a patch-light technique for analysing biological motion without any direct shape information.8 He attached light bulbs to the joints of actors who were dressed in black and moved in a black room. The actors were filmed while performing various actions, such as walking, running or dancing. From the films, where only the light dots could be seen, subjects could recognize the action within tenths of a second. Furthermore, the movements of the dots were immediately interpreted as coming from a human being. Later experiments by Runesson and Frykholm (1981, 1983) have shown that subjects can extract subtle details of the action, such as the gender of walkers or the weight of lifted objects (where the objects were not seen in the films). One lesson that can be learned from the experiments by Johansson and his followers is that the kinematics of a movement contains sufficient information for identifying the underlying dynamic force patterns. Runesson (1994: 386-387) claims that we can directly perceive the forces that control different kinds of motion. He argues that one need not make any distinction between visible and hidden properties:
The fact is that we can see the weight of an object handled by a person. The fundamental reason we are able to do so is exactly the same as for seeing the size and shape of the person's nose or the colour of his shirt in normal illumination, namely that information about all these properties is available in the optic array.
7. I wish to thank Lauri Carlson for directing me to this theory.
8. For a survey of the research, see Johansson (1973).
According to his perspective, the information that our senses, primarily vision, receive about the movements of an object or an individual is sufficient for our brains to be able to extract, with great precision, the underlying forces. Furthermore, the process is automatic - we cannot help but see the forces. Of course, the perception of forces is not perfect - we are prone to illusions, just as we are in all types of perception. He formulates this as a principle of kinematic specification of dynamics (the KSD-principle) that says that the kinematics of a movement contains sufficient information to identify the underlying dynamic force patterns.9 It goes without saying that this principle accords well with the representation of actions that I have proposed here. One difference is that Runesson has a Gibsonian perspective on the perceptual information available, which means that he would find it methodologically unnecessary to consider mental constructions such as conceptual spaces. The Gibsonian perspective means that the world itself contains sufficient information about objects and events so that the brain can just "pick up" that information in order to categorise the entity. According to this perspective, mental representations are thus not needed. However, I will not here develop the contrasts between the representational and Gibsonian positions.
9. In contrast to humans, recent results on causal reasoning in apes and monkeys indicate that non-human primates often fail to understand the hidden causes, in particular forces, behind certain effects (Povinelli 2000). There seems to be a paucity of research on force perception and how forces affect how we categorize actions.
Another area where actions and objects show similarities in structure is in the graded structure of the action concepts. There are good reasons to believe that actions exhibit many of the prototype effects that Rosch (1975, 1978) has presented for object categories. For example, Hemeren (1997) demonstrated that there is a strong inverse correlation (r = -.81) between judgments of most typical actions and reaction time in a WORD-ACTION verification task. He has also shown that subjects in a free listing task of words or phrases for actions show clear effects concerning base level vs. subordinate level concepts (Hemeren 1996). For example, "running" was more frequent and occurred earlier in the lists than "jogging" and "sprinting" and the same applies to "talking" in relation to subordinates such as "whispering" and "arguing".
To identify the structure of the action space, similarities between actions should be investigated. This can be done with basically the same methods as for similarities between objects. Even though the empirical evidence is still very incomplete, my proposal is that by adding force dimensions to a conceptual space, we obtain the basic tools for analysing dynamic properties of actions and other movements. As we shall see below, the forces involved need not only be physical forces, but they can also be emotional or social forces.
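As a purely Newtonian toy illustration of how kinematic data can determine a force pattern, the sketch below estimates net force from sampled positions by finite differences. It is not a model of perception and makes no claim about how the brain or the KSD-principle operates; the function name, the one-dimensional trajectory and the mass value are hypothetical.

```python
def net_force_from_positions(positions, dt, mass):
    """Estimate the net force at each interior sample from positions alone.

    Acceleration is approximated by the central second difference
    (x[i+1] - 2*x[i] + x[i-1]) / dt**2, then scaled by an assumed mass.
    """
    forces = []
    for i in range(1, len(positions) - 1):
        accel = (positions[i + 1] - 2.0 * positions[i] + positions[i - 1]) / (dt * dt)
        forces.append(mass * accel)
    return forces

dt = 0.1
# Hypothetical 1-D trajectory: constant acceleration of 2.0 m/s^2 starting from rest.
positions = [0.5 * 2.0 * (i * dt) ** 2 for i in range(10)]
forces = net_force_from_positions(positions, dt, mass=3.0)
```

For this trajectory every recovered value is close to 6.0 (mass times acceleration), so the force dimension is fully determined by the movement once a mass is assumed.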
6. The cognitive neuroscience of action space
The distinction between perception and action spaces can to some extent be correlated with the findings from neuroscience on how visual information is handled in the brain. Giese and Poggio (2003) note that there is a ventral pathway from the visual cortex that handles form recognition and a corresponding dorsal pathway for motion recognition. These two pathways operate in parallel. Of special interest in relation to my hypothesis, Giese and Poggio speculate that in the dorsal motion pathway there exist neurons (located in the superior temporal sulcus) specialized for motion patterns: The representation of motion is based on a set of learned patterns. These patterns are encoded as sequences of "snapshots" of body shapes by neurons
in the form pathway, and by sequences of complex optic flow patterns in the motion pathway. (Giese and Poggio 2003: 181)
On the surface, Giese and Poggio's model does not concern dynamics, but kinematics since they describe a sequence of "snapshots" of a movement. Better evidence for dynamic representation of motion comes, for example, from the literature on representational momentum (Freyd and Finke 1984). In one of the first experiments on this phenomenon, Freyd and Finke showed subjects a rectangle at three positions along a possible path of rotation. Subjects were told to remember the third orientation and were then presented with a rectangle at a fourth position that was either rotated slightly less, or exactly the same, or slightly more than the remembered rectangle (see Figure 8 A). Subjects found it more difficult to detect differences in the direction of the implicit motion of the sequence of rectangles. This suggests that their mental representations of the rectangles induced a certain "momentum" that influenced their memory of the third rectangle. This effect disappeared when the ordering of the two first rectangles was reversed so that the subjects could no longer perceive a path of motion (see Figure 8 B).
Figure 8. Two experiments on representational momentum (from Freyd and Finke 1984: 128).
Along the same lines, Kourtzi and Kanwisher (2000) showed subjects photos of situations that contained dynamic information. In an fMRI study, they found greater activity in the middle temporal/medial superior temporal (MT/MST) region of cortex compared to when subjects were viewing photos with no implied motion. The middle temporal region is one of the major brain areas engaged in the analysis of visual motion. These glimpses from the cognitive neuroscience of action representations indicate how the brain projects forces, even when the stimuli do not contain any motion. This is a side of "embodiment" that merits further investigation. By combining experiments from cognitive psychology with different kinds of brain imaging, we may hope to acquire the empirical results needed for testing a more elaborate theory of the structure of action space.
7. Representing functional properties in action space
Another large class of properties that cannot be analysed in terms of perceptual dimensions in a conceptual space are the functional properties that are often used for characterizing artefacts. A nice description of the role of functional properties comes from Paul Auster's novel City of Glass (1992: 77): Not only is an umbrella a thing, it is a thing that performs a function - in other words, expresses the will of man. When you stop to think of it, every object is similar to the umbrella, in that it serves a function. A pencil is for writing, a shoe is for wearing, a car for driving. Now my question is this. What happens when a thing no longer performs its functions? Is it still the thing, or has it become something else? When you rip the cloth off the umbrella, is the umbrella still an umbrella? You open the spokes, put them over your head walk out into the rain, and you get drenched. Is it possible to go on calling this object an umbrella? In general, people do. At the very limit, they will say the umbrella is broken. To me this is a serious error, the source of all our troubles.
In agreement with Auster's intuition, Vaina (1983) notes that when deciding whether an object is a "chair", the perceptual dimensions of the object, like those of shape, colour, texture and weight, are largely irrelevant, or at least extremely variable. Since I have focused on such dimensions in my description of conceptual spaces, the analysis of functional properties seems to be an enigma for my theory.
I propose to analyse these properties by reducing them to the actions that the objects "afford". To continue with the example, a chair is prototypically an object that affords back-supported sitting for one person, that is, an object that contains a flat surface at a reasonable height from the ground and another flat surface that supports the back. In support of this analysis, Vaina (1983: 28) writes: "[T]he requirement for efficient use of objects in actions induces strong constraints on the form of representation. Each object must first be categorized in several ways, governed ultimately by the range of actions in which it can become involved." The notion of "affordance" is borrowed from Gibson's (1979) theory of perception.10 However, he interprets the notion realistically, i.e. as independent of the viewer, while for me the affordances are always identified in relation to a conceptual space, which means that I interpret "affordance" from a cognitivist representational perspective. In more general terms, I propose that function concepts be interpreted in terms of an action space. This is in contrast to the perceptual dimensions that I have presented in my earlier examples in this chapter. To be more precise, I put forward the following special case of Criterion C: Functional properties are convex regions in action space.
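What the convexity requirement amounts to can be illustrated with a small sketch: a region is convex if every point lying between two of its members is also a member. The code below checks this on sampled points only, so it can refute convexity but cannot prove it. The two-dimensional "action space", its axes and both candidate regions are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def is_convex_on_samples(is_member, samples, steps=10):
    """Return False if some point between two sampled members falls outside the region."""
    members = [p for p in samples if is_member(p)]
    for a, b in combinations(members, 2):
        for k in range(1, steps):
            lam = k / steps
            # Point on the straight line segment between a and b.
            between = tuple(lam * x + (1.0 - lam) * y for x, y in zip(a, b))
            if not is_member(between):
                return False
    return True

# Hypothetical 2-D action space; both axes stand for made-up force dimensions.
grid = [(x / 4.0, y / 4.0) for x in range(-8, 9) for y in range(-8, 9)]

def disc(p):
    """A convex candidate region."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def ring(p):
    """A non-convex candidate region: it has a hole in the middle."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0

disc_ok = is_convex_on_samples(disc, grid)
ring_ok = is_convex_on_samples(ring, grid)
```

Here `disc_ok` is `True` and `ring_ok` is `False`. On the proposal above, a region like the ring would not qualify as a functional property, since an action lying between two of its instances falls outside it.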
The actions involved in the analysis of a functional property may then, in turn, be reduced to force dynamic patterns as was explained above. This is accomplished by representing a functional property as a vector in a high-dimensional space where most dimensions are constituted of the force dimensions of the action space. In this sense, the functional space is supervenient on the action space. Functional properties are thus "higher order properties" in the sense of Gardenfors (2000, Section 3.10). The main problem with this proposal is that we know even less about the geometry and topology of how humans (and animals) structure action space than we know about how they structure shape space. This is an area where further research is badly needed.11
10. However, as Costall (this volume) notes, Gibson's characterization of "affordance" changed over the years.
11. Within robotics, Chella, Gaglio and Pirrone (2001) use Fourier transforms of motions to represent the movements of objects and of a robot. This solution makes sense from an implementational point of view, but it is uncertain whether the brain uses anything like this to represent actions.
The upshot of the proposal is that, even if this road of analysis is long and to a large extent unexplored, in principle, functional properties can be explained in terms of more basic dimensions such as forces.
8. The embodiment of forces
In the tradition of Cognitive Semantics, the meanings of expressions have been analysed in semi-geometrical constructs called image schemas. In earlier writings, I have shown how these image schemas can be given a more precise description in terms of conceptual spaces (Gardenfors 1996). For Cognitive Semantics too, the focus has been on the spatial structure of the image schema (the very term "image" schema indicates this). Lakoff (1987: 283) goes as far as putting forward what he calls the "spatialization of form hypothesis" which says that the meanings of linguistic expressions should be analysed in terms of spatial image schemas plus metaphorical mappings. However, there are exceptions to this emphasis on spatial structure. One researcher who at a very early stage brought forward the role of forces in cognitive semantics is Johnson (1987). He argues that forces form perceptual Gestalts that serve as image schemas (even though the word "image" may be misleading here). He writes:
Because force is everywhere, we tend to take it for granted and to overlook the nature of its operation. We easily forget that our bodies are clusters of forces and that every event of which we are a part consists, minimally, of forces in interaction. [...] We do notice such forces when they are extraordinarily strong, or when they are not balanced off by other forces. (Johnson 1987: 42)
Johnson presents a number of "preconceptual Gestalts" for force. These Gestalts function as the correspondences to image schemas but with forces as basic organizing features rather than spatial relations. The force Gestalts he presents are "compulsion", "blockage", "counterforce", "diversion", "removal of restraint", "enablement" and "attraction" (Johnson 1987: 45-48). Another early exception is Talmy (1988), who emphasizes the role of forces and dynamic patterns in image schemas in what he calls "force dynamics". He develops a schematic formalism that, for example, allows him to represent the difference in force patterns in expressions like "The ball
kept rolling because of the wind blowing on it" and "The ball kept rolling despite the stiff grass". Talmy's dynamic ontology consists of two directed forces of unequal strength, the focal called "Agonist" and the opposing element called "Antagonist", each force having an intrinsic tendency towards either action or rest, and a resultant of the force interaction, which is either action or rest.
All of the interrelated factors in any force-dynamic pattern are necessarily copresent wherever that pattern is involved. But a sentence expressing that pattern can pick out different subsets of the factors for explicit reference leaving the remainder unmentioned - and to these factors it can assign different syntactic roles within alternative constructions. (Talmy 1988: 61)
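The ontology just described can be made concrete in a small sketch. It assumes, as a simplification, that the resultant is simply the intrinsic tendency of the stronger of the two entities. The class, the function and the numerical strengths are hypothetical and are not part of Talmy's own formalism, which is diagrammatic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ForceEntity:
    name: str
    tendency: str    # intrinsic tendency: "action" or "rest"
    strength: float  # illustrative magnitude; only the ordering matters here

def resultant(agonist, antagonist):
    """Resultant of the interaction: the tendency of the stronger entity (assumed rule)."""
    if agonist.strength == antagonist.strength:
        raise ValueError("the two forces are assumed to be of unequal strength")
    stronger = agonist if agonist.strength > antagonist.strength else antagonist
    return stronger.tendency

# "The ball kept rolling because of the wind blowing on it":
# the Agonist (ball) tends towards rest, but the stronger Antagonist (wind) prevails.
because = resultant(ForceEntity("ball", "rest", 1.0), ForceEntity("wind", "action", 2.0))

# "The ball kept rolling despite the stiff grass":
# the Agonist (ball) tends towards action and is stronger than the Antagonist (grass).
despite = resultant(ForceEntity("ball", "action", 2.0), ForceEntity("grass", "rest", 1.0))
```

Both sentences yield the same resultant, "action", but from different force patterns, which is the contrast that "because of" and "despite" mark linguistically.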
Despite these exceptions, it appears that the role of forces has been underrated within Cognitive Semantics. In Piaget's theory of sensory-motor schemas, developed for modelling cognitive development and not semantics, motor patterns are central. These can be seen as a special case of the dynamic patterns that form our fundamental understanding of the world. I would suggest that many ideas from the schemas of developmental psychologists can fruitfully be incorporated in the constructions used by cognitive semanticists.
Analysing the use of forces in Cognitive Semantics has led me to an ambiguity in the very notion of "force". In academic circles, Newtonian physics has become a role model for science; and when we speak of "forces" it is natural to think of them as Newtonian forces and to represent them as force vectors in a conceptual space. But when it comes to everyday human thinking, it is important to distinguish between a first-person (phenomenological) and a third-person perspective of forces. From the first-person perspective, it is the forces that act directly on you that are considered. These "forces" are not just the physical Newtonian forces, but more importantly also the social or emotional forces that affect you. It is perhaps more appropriate to call forces seen from a first-person perspective "powers". First-person powers are experienced either as physical forces or as emotional or social pressures that make you move in a particular direction. From the third-person perspective, one sees forces acting upon an object from the outside, so in this case you don't experience the forces directly, but your perceptual mechanisms derive them. Therefore such forces are not embodied in the same way as in the first-person perspective. From the first-person perspective, powers act directly on you, while from the third-person perspective forces act at a distance (pace Newton).12
One reason why this distinction is seldom made is that we are extremely good at perceiving forces acting upon other objects.13 As we have seen in Section 5, the Uppsala school of psychology claims that we can directly perceive the forces that control different kinds of motion. According to their Gibsonian perspective, information about the movements of an object is sufficient for our brains to extract the underlying forces. The importance of this distinction is that our understanding of the third-person perspective presumably derives from the first-person perspective. (This is one reason why Newton had such problems in convincing his contemporaries about forces acting at a distance.) If this is the case, then the meanings of words such as push and pull that are based on first-person powers should be seen as cognitively more fundamental than meanings based on third-person forces. In other words, my hypothesis is that the meanings of the force elements of image schemas are grounded in the actual experience of forces on one's own body.
There is much in Johnson's (1987) book that implicitly points to the centrality of the first-person "power" perspective. For one thing, he focuses on the role of interaction: "[F]orce is always experienced through interaction. We become aware of a force as it affects us or some object in our perceptual field" (Johnson 1987: 43). Interaction is primarily seen from a first-person perspective, while forces are abstractions that are seen from a third-person view. Then, in his description of the "enablement" Gestalt or schema, he explicitly focuses on first-person "powers":
If you choose to focus on your acts of manipulation and movement, you can become aware of a felt sense of power (or lack of power) to perform some action.
You can sense that you have the power to pick up the baby, the groceries, and the broom but not to lift the front end of your car. While there is no actualized force vector here, it is legitimate to include this structure of possibility in our common gestalts for force, since there are potential force vectors present and there is a definite "directedness" (or potential path of motion) present. (Johnson 1987: 47)
12. There is also a second-person perspective where the subject can "put himself in the shoes of the other". This perspective is what is involved in empathy, joint attention and other aspects of a "theory of mind" (see Gardenfors 2003, ch. 4). Some researchers put forward "mirror neurons" as a possible mechanism behind this perspective (e.g. Rizolatti and Arbib 1998; Gallese 2000).
13. However, it seems that other animal species may not have this capacity to the same extent (Povinelli 2000).
In contrast to Johnson and Talmy, I view social power relations as semantically fundamental, and physical forces that act at a distance from the subject as derived. For example, Winter and Gardenfors (1995) and Gardenfors (1998) argue that the meanings of modal verbs are based on social power rather than physical force. Even Talmy (1988: 79) concedes that "[a] notable semantic characteristic of the modals in their basic usage is that they mostly refer to an Agonist that is sentient and to an interaction that is psychosocial, rather than physical, as a quick review can show". I completely agree, but see this as an argument for the primary meaning of the modals being determined by social power relations, while the (few) uses of modals in the context of physical forces are derived meanings. In a sense, the focus on social power relations makes the conceptual analysis more intricate, because Newtonian force vectors, viewed as natural representations of the third-person forces, may not be entirely appropriate to represent the emotional and social aspects of power. Again, more empirical investigations of how human subjects mentally conceive of these powers will be needed.
9. Conclusion
The main purpose of this chapter has been to outline an extension of the theory of conceptual spaces to actions and functional properties. In the first part, I have provided an analysis of concepts with the aid of the notion of conceptual spaces. A key notion is that of a natural concept which is defined in terms of convex regions of conceptual spaces - a definition that crucially involves the geometrical structure of the various domains. As a complement to the perceptual dimensions treated in Gardenfors (2000), I have in the latter part of the chapter focused on "action space". I submit that action space can, in principle, be analysed in the same way as e.g. colour space or shape space. Admittedly, this will take extensive psychological experimentation to establish. The core hypothesis is that our categorization of actions builds on our perception of forces (which, indeed, seem to be perceptions). The hypothesis is that the cognitive representation of an action can be described as a spatio-temporal pattern of forces. I have argued that functional properties "live on" action space. When it comes to functional properties, the key idea is that the function of
an object can be analysed with the aid of the actions it affords. An empirically testable prediction is that functional concepts can be described as convex regions in an appropriate action space. However, there is, so far, not much empirical support for the prediction. Consequently, it must be left as a research programme for the time being. I also believe that conceptual spaces in general and their application to force dimensions in particular can be a useful tool to sharpen Cognitive Semantics. With the aid of the topological and geometric structure of the various quality dimensions, one can obtain a more precise foundation for the concept of image schemas that form the core of the theories of e.g. Lakoff (1987), Johnson (1987) and Langacker (1987). I have emphasized the role of forces in image schemas and argued that the first-person perspective on forces is more fundamental than the third-person perspective. I believe that this distinction could also be fruitfully applied within other areas of cognitive semantics.
Acknowledgements
An early version of this chapter was written while the author was a fellow at the Swedish Collegium for Advanced Study in the Social Sciences (SCASSS) in Uppsala. I want to thank the Collegium for providing me with excellent working conditions. I also want to thank Paul Hemeren, Martin Raubal, Tom Ziemke, Jordan Zlatev and an anonymous referee for very helpful comments.
References
Auster, Paul 1992 The New York Trilogy: City of Glass, Ghosts, The Locked Room. London / Boston: Faber and Faber.
Berlin, Brent and Paul Kay 1969 Basic Color Terms: Their Universality and Evolution. Berkeley, CA: University of California Press.
Biederman, Irving 1987 Recognition-by-components: A theory of human image understanding. Psychological Review 94: 115-147.
Chella, Antonio, Salvatore Gaglio and Roberto Pirrone 2001 Conceptual representations of actions for autonomous robots. Robotics and Autonomous Systems 34: 251-263.
Clark, Austen 1993 Sensory Qualities. Oxford: Clarendon Press.
Costall, Alan this vol. Bringing the body back to life: James Gibson's ecology of embodied agency.
Fairbanks, Grant and Patti Grubb 1961 A psychophysical investigation of vowel formants. Journal of Speech and Hearing Research 4: 203-219.
Fodor, Jerry A. 1975 The Language of Thought. Cambridge, MA: Harvard University Press.
Freyd, Jennifer J. and Ronald Finke 1984 Representational momentum. Journal of Experimental Psychology: Learning, Memory and Cognition 10: 126-132.
Fujisaki, Hiroya 1992 Modeling the process of fundamental frequency contour generation. In: Y. Tohkura, E. Vatikiotis-Bateson and Y. Sagisaka (eds.), Speech Perception, Production and Linguistic Structure, 313-326. Amsterdam: IOS Press.
Fujisaki, Hiroya and Sumio Ohno 1996 Prosodic parameterization of spoken Japanese based on a model of the generation process of F0 contours. Proceedings of 1996 International Conference on Spoken Language Processing, 4: 2439-2442. Philadelphia, PA.
Gallese, Vittorio 2000 The inner sense of action: Agency and motor representations. Journal of Consciousness Studies 7 (10): 23-40.
Gardenfors, Peter 1990 Induction, conceptual spaces and AI. Philosophy of Science 57: 78-95.
Gardenfors, Peter 1996 Conceptual spaces as a basis for cognitive semantics. In: Andy Clark, Jesus Ezquerro and Jesus M. Larrazabal (eds.), Philosophy and Cognitive Science, 159-180. Dordrecht: Kluwer.
Gardenfors, Peter 1998 The pragmatic role of modality in natural language. In: Paul Weingartner, Georg Schurz and Georg Dorn (eds.), The Role of Pragmatics in Contemporary Philosophy, 78-91. Vienna: Holder-Pichler-Tempsky.
Gardenfors, Peter 2000 Conceptual Spaces: The Geometry of Thought. Cambridge, MA: MIT Press.
Gibson, James J. 1979 The Ecological Approach to Visual Perception. Hillsdale, NJ: Lawrence Erlbaum Associates.
Giese, Martin A. and Tomaso Poggio 2003 Neural mechanisms for the recognition of biological movements. Nature Reviews Neuroscience 4: 179-192.
Goldstone, Robert L. and Lawrence W. Barsalou 1998 Reuniting perception and conception. Cognition 65: 231-262.
Goodman, Nelson 1955 Fact, Fiction and Forecast. Cambridge, MA: Harvard University Press.
Hemeren, Paul E. 1996 Frequency, ordinal position and semantic distance as measures of cross-cultural stability and hierarchies for action verbs. Acta Psychologica 91: 39-66.
Hemeren, Paul E. 1997 Typicality and context effects in action categories. Proceedings of the 19th Annual Conference of the Cognitive Science Society, 949. Stanford, CA: Lawrence Erlbaum Associates.
Johansson, Gunnar 1973 Visual perception of biological motion and a model for its analysis. Perception and Psychophysics 14: 201-211.
Johnson, Mark 1987 The Body in the Mind: The Bodily Basis of Cognition. Chicago, IL: University of Chicago Press.
Kourtzi, Zoe and Nancy Kanwisher 2000 Activation in human MT/MST by static images with implied motion. Journal of Cognitive Neuroscience 12 (1): 48-55.
Lakoff, George 1987 Women, Fire and Dangerous Things. Chicago, IL: The University of Chicago Press.
Langacker, Ronald W. 1987 Foundations of Cognitive Grammar. Vol. I: Theoretical Prerequisites. Stanford: Stanford University Press.
Mandler, Jean 2004 Foundations of the Mind. Oxford: Oxford University Press.
Marr, David 1982 Vision. San Francisco, CA: Freeman.
Marr, David and H. Keith Nishihara 1978 Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society in London, B 200: 269-294.
Marr, David and Lucia Vaina 1982 Representation and recognition of the movements of shapes. Proceedings of the Royal Society in London, B 214: 501-524.
Mervis, Carolyn and Eleanor Rosch 1981 Categorization of natural objects. Annual Review of Psychology 32: 89-115.
Montague, Richard 1974 Formal Philosophy: Selected Papers of Richard Montague. Richmond H. Thomason (ed.). New Haven, CT: Yale University Press.
Nelson, Katherine 1986 Event Knowledge: Structure and Function in Development. Hillsdale, NJ: Lawrence Erlbaum Associates.
Pinker, Steven 1997 How the Mind Works. New York, NY: W. W. Norton and Co.
Port, Robert and Timothy van Gelder (eds.) 1995 Mind as Motion. Cambridge, MA: MIT Press.
Povinelli, Daniel 2000 Folk Physics for Apes: The Chimpanzee's Theory of how the World Works. Oxford: Oxford University Press.
Pylyshyn, Zenon 1984 Computation and Cognition. Cambridge, MA: MIT Press.
Quinlan, Philip 1991 Connectionism and Psychology: A Psychological Perspective on New Connectionist Research. New York, NY: Harvester Wheatsheaf.
Rizolatti, Giacomo and Michael Arbib 1998 Language within our grasp. Trends in Neuroscience 21: 188-194.
Rumelhart, David E. and James L. McClelland 1986 Parallel Distributed Processing, Vols. 1 and 2. Cambridge, MA: MIT Press.
Rosch, Eleanor 1975 Cognitive representations of semantic categories. Journal of Experimental Psychology: General 104: 192-233.
Rosch, Eleanor 1978 Prototype classification and logical classification: The two systems. In: Ellin Scholnik (ed.), New Trends in Cognitive Representation: Challenges to Piaget's Theory, 73-86. Hillsdale, NJ: Lawrence Erlbaum Associates.
Runesson, Sverker 1994 Perception of biological motion: The KSD-principle and the implications of a distal versus proximal approach. In: Gunnar Jansson, Sten-Sture Bergstrom, and William Epstein (eds.), Perceiving Events and Objects, 383-405. Hillsdale, NJ: Lawrence Erlbaum Associates.
Runesson, Sverker and Gunnar Frykholm 1981 Visual perception of lifted weights. Journal of Experimental Psychology: Human Perception and Performance 7: 733-740.
Runesson, Sverker and Gunnar Frykholm 1983 Kinematic specification of dynamics as an informational basis for person and action perception. Expectation, gender recognition, and deceptive intention. Journal of Experimental Psychology: General 112: 585-615.
Schiffman, Harvey Richard 1982 Sensation and Perception, 2nd edition. New York, NY: John Wiley and Sons.
Sivik, Lars and Charles Taft 1994 Color naming: A mapping in the NCS of common color terms. Scandinavian Journal of Psychology 35: 144-164.
Smith, Linda 1989 From global similarities to kinds of similarities - the construction of dimensions in development. In: Stella Vosniadou and Andrew Ortony (eds.), Similarity and Analogical Reasoning, 146-178. Cambridge: Cambridge University Press.
Stalnaker, Robert 1981 Antiessentialism. Midwest Studies of Philosophy 4: 343-355.
Talmy, Leonard 1988 Force dynamics in language and cognition. Cognitive Science 12: 49-100.
Vaina, Lucia 1983 From shapes and movements to objects and actions. Synthese 54: 3-36.
Vaina, Lucia and Yossuf Bennour 1985 A computational approach to visual recognition of arm movement. Perceptual and Motor Skills 60: 203-228.
Van Gelder, Timothy 1995 What might cognition be if not computation? Journal of Philosophy 92: 345-381.
Winter, Simon and Peter Gardenfors 1995 Linguistic modality as expressions of social power. Nordic Journal of Linguistics 18: 137-166.
From pre-representational cognition to language Takashi Ikegami and Jordan Zlatev
Abstract
In this chapter we argue for a number of qualitative differences in human cognition: interaction vs. representation, procedural vs. declarative knowledge, dynamical categories vs. concepts, synaesthesia vs. language, and the affecto-imagistic dimension, necessary for characterizing the meaning of Japanese mimetics, vs. the analytical dimension (Kita 1997, 2001). Such emphasis is necessary because there is a persistent tendency in embodiment theories to "resolve" such oppositions by ignoring the differences, and thus, in effect, reducing or eliminating the second and more "disembodied" side of the oppositions. At the same time, we explore how structures and processes of pre-representational cognition such as dynamical categories, internal meaning space, and synaesthesia can play a role in the "grounding" of mental representations (concepts) and language.
Keywords: active perception, complex systems, dynamic categorization, internal meaning space, Japanese mimetics, language, representation, synaesthesia.
1. Introduction
The issue of the nature and role of representations in cognitive science is a heavily contested one, with e.g. Johnson and Rohrer (this volume) arguing for a non-representational account of cognition, including language, while e.g. Gärdenfors (this volume) claims that his theory of "conceptual spaces" is fundamentally representational. While mental representation was the fundamental concept of "classical" cognitive science (e.g. Fodor 1981), along with that of computation, the 1990s witnessed the rise of "second generation" cognitive science (e.g. Varela, Thompson and Rosch 1991), which made heavy use of notions such as embodiment and interaction. This approach reacted against what was perceived to be an overextension of the term "representation" to cover just about any kind of cognitive process. For example, Johnson and Lakoff
(2002: 249-250) point out that: "As we said in Philosophy in the Flesh, the only workable theory of representations is one in which a representation is a flexible pattern of organism-environment interactions, and not some inner mental entity that somehow gets hooked up with parts of the external world by a strange relation called 'reference'." However, there is a serious problem for such interactionist accounts to the extent that they purport to provide an explanation of language, since sensorimotor interaction is an inherently non-representational notion (in any sensible use of the term "representation"), while language is representational in two different, though related, respects: (a) it has the expression-content structure of signs and (b) statements are about (real or imagined) states of affairs (see Sonesson this volume; Zlatev this volume).
In this chapter, we hope to clarify this problem, and perhaps even offer some ingredients to its resolution. The first step involves conceptual analysis, in which we explain how we understand the concept of representation, and in particular, mental representation. We will distinguish between pre-representational and representational cognition, and point out their respective properties. We will suggest that only true representations are properly regarded as concepts.
In sections 3 and 4, respectively, we will characterize two structures of pre-representational cognition: dynamical category and internal meaning space. A dynamical category is an emergent category resulting from sensorimotor interaction with the environment. Unlike explicit, "classical" concepts it cannot be characterized in terms of necessary and sufficient conditions. In these two respects it resembles two of the central concepts within Cognitive Linguistics: image schemas, as defined by Johnson (1987) and Johnson and Rohrer (this volume) and prototypes (Rosch 1973; Lakoff 1987).
However, the concept of dynamical category is most closely related to Gibson's (1979) ecological psychology (Costall this volume) and in particular to what is currently known as active perception. By internal meaning space we mean a cross-modal network of dynamical categories. Similar to its constituent notion of dynamical category, the notion emphasizes the first-person, or subjective nature of cognition. Sensory input is meaningful for the organism due to its internal (neural) dynamics (Freeman 2003), which depend on an intrinsic value system (von Uexküll 1940 [1982]; Zlatev 2003). At the same time, the notion of internal meaning space captures the fact that dynamical categories are not independent but linked in a "space" of different sensory modalities and dimensions. We will suggest that such a dynamic, internal, value-laden space
serves as an essential intermediary between simple, reactive sensory-motor cognition and representational cognitive functions, including language. It resembles the "conceptual spaces" of Gärdenfors (2000, this volume), but differs in being dynamic and value-based, and thereby affect-laden. We will also suggest that an internal meaning space is implied in the phenomenon of synaesthesia (Cytowic 2002).
In section 5, we show how such a pre-representational meaning space can help account for two aspects of language: synaesthetic metaphors such as sweet smile, and Japanese mimetics such as suta-suta ('walking hurriedly'). We follow the analysis of Kita (1997, 2001), who has argued that mimetics involve an "affecto-imagistic dimension" that is distinct from the "analytic dimension" that dominates linguistic meaning. Thus, the meanings of mimetics can be related in more than name to the concept of mimetic schemas (Zlatev 2005, this volume), which constitute pre-linguistic mental representations deriving from bodily imitation. In brief, while we will argue that language cannot be reduced to structures of pre-representational cognition, we will also show that such structures may be necessary for a complete account of it.
The concepts of dynamical category and meaning space may be rather difficult to grasp, especially if they are only characterized through language, which after all is not optimal for talking about pre-representational, dynamical phenomena with fuzzy boundaries. That is why our approach will be to use complex systems (Port and van Gelder 1998; Kaneko and Ikegami 1998; Kaneko and Tsuda 2000) in order to elucidate the nature of structures of pre-representational cognition. In particular, we will describe a number of computational simulations involving "artificial creatures" which "live" in a simulated "environment" and interact with it, whereby dynamical categories and a pre-conceptual meaning space emerge.
To illustrate, we will show how even abstract phenomena such as triangles and rectangles can be categorized pre-representationally via blind touching, so that the criterion for their classification is defined by the creatures' styles of action patterns, rather than by their detached concepts about the figures. We hasten to note, however, that these experiments should be regarded as a form of "Weak AI" rather than "Strong AI": there is a radical difference between simulating cognition and life and duplicating it (Searle 1992; Ziemke 2001). The "creatures" that are described in this chapter are not truly animate: they do not have any intrinsic value system, and thus have no basis for intrinsic intentionality and phenomenal experience (Zlatev 2003). We ask the reader to please remember this, since it will be tedious
to have to point out again and again that when we refer to "creatures" in connection with the simulations described in Section 3, we are not succumbing to animism: the attribution of life properties to inanimate matter.1 Nevertheless, we believe that there are sufficient structural parallels between the mode of operation of models based on complex systems and real organisms to make the method justified as an analytical tool for studying "embodied" cognition.
2. Mental representation
As we understand the concept, a representation is a structure that consists of three parts: an expression that stands for a content for a given subject. It is thus identical with the classical definition of a sign (see Emmeche this volume; Sonesson this volume). A clear example of a representation is a picture: the depicted apple cannot be eaten, but it represents (in this case iconically) an apple that can. The painting itself is the expression, and it is different from, at the same time as it corresponds to, something else. Whether this "something else" is a real specific apple, a generic apple, or an imagined apple is not important here: what is important is the expression-content structure itself. What "connects" the expression and the content is a process of interpretation: representations do not exist by themselves, but only for someone. This much is fairly uncontroversial.
The real controversies begin when we ask whether there are mental representations, and if so what they are. "Classical", first-generation cognitive science found them everywhere: in thought, in language, in perception, in practical action (Gardner 1987). Some representatives of second generation, "embodied" cognitive science (Varela, Thompson and Rosch 1991; Johnson and Rohrer this volume) appear to take the opposite extreme and more or less abolish them. We believe that these extremes are equally mistaken, and a golden mean is the best answer, in line with phenomenology (Husserl 1962) and Piaget's (1945) account of the origin of "symbols" in childhood: to say that a subject has a mental representation is to say that he can (a) differentiate between the expression and content, and (b) see the first as corresponding to the second, pretty much as in the picture example mentioned above. Sonesson (1989, this volume) uses this as the major criterion to distinguish between (true) signs and simple indexicalities (based on contiguity and factorality, i.e. part-whole) and iconicities (based on resemblance), which are not signs but only a ground for indexical and iconic signs, respectively.
One may (and usually does) ask: Who is the "someone" doing the differentiation and finding the correspondence? There are three types of answers, only the third of which will do:
(a) An unconscious processor or a "homunculus" in the head; this leads to infinite regress: we need to account for the ability of the homunculus to "see" the expression and content, and "figure out" that the first stands for the second, and then we need to account for the mental representations in its head, ad infinitum (Edelman 1992).
(b) The expression is a "symbol" that is associated with a "meaning" for someone else than the system that is actually using the symbols, as in a computer. Yes, but then the representation is not intrinsic to the system, but to the programmer, or whoever else is doing the "interpreting" (Searle 1992).
(c) The subject himself (or herself), i.e. the conscious individual who "owns" a (possibly internalized) mental representation. As when you close your eyes and imagine an apple: you do not confuse your imagined apple with the one it represents, you differentiate between the two. On the other hand, when you see an apple, you do not think of your perception as a "representation" of an apple: you see the apple itself. So there are no mental representations in perception, but only in imagination (Piaget 1945; see also Zlatev this volume; Sonesson this volume).
1. We are grateful to Göran Sonesson for helping us emphasize this point and thus avoid a possible misinterpretation.
Mental representations, as here defined, make an irreducible reference to consciousness. The notion of consciousness is, of course, vexed with even more riddles than that of representation, but since it has again become a (scientifically) respectable topic over the last 20 years,2 considerable advances have been made in philosophical discussions, e.g. of the "hard problem" of the irreducibility of qualitative experience (Chalmers 1997), in distinguishing between different kinds of consciousness, e.g. affective from reflective consciousness, as well as in understanding the neural underpinnings of this admittedly rather elusive phenomenon (e.g. Edelman 1992; Damasio 2000).
2. As testified by a number of journals, including Journal of Consciousness Studies, Consciousness and Cognition, and PSYCHE: An Interdisciplinary Journal of Research on Consciousness, and annual conferences such as Towards a Science of Consciousness (TSC).
Mental representations involve both types of consciousness mentioned above: they are reflective, since they can be accessed and thought about independently of whatever they represent, but they are also affective since they have what phenomenologists call an "affective tone" (Thompson 2001), a particular "flavor" (to use a metaphor), due to the intrinsic value system of the subject as a living being (Zlatev 2003). The "flavor" of my imagined apple is different from that of an imagined rotten banana. Finally, since mental representations can be (in principle) accessed, unless they are "repressed" due to Freudian or other reasons (Searle 1992), they can be themselves represented, or expressed - in language, pictures, gestures or some other external medium (more or less accurately). That is why they can be said to constitute our declarative knowledge, as opposed to procedural skills such as bicycle-riding, which are not based on mental representations (contrary to the claims of first-generation cognitivists). The latter distinction is made clear by Mandler (2004), who consistently distinguishes between the two sorts of knowledge in her recent monograph, pointing out some of their respective characteristics:
Procedural knowledge, both perceptual and motor, is inaccessible to consciousness. [...] In spite of taking in lots of information at once [...] it is also relatively slow to learn, and learning is accomplished by associative strengthening, typically over a number of trials, as in operant conditioning or perceptual schema formation. It aggregates frequency information. [...] Declarative or conceptual knowledge, in contrast, is accessible to awareness and is either describable in language, or, with a little analytic training, by drawing.
It requires attention to be encoded into this format; this means that it is selective. [... ] The system can learn information in a single trial (in small quantities, of course) simply by being told. In comparison to procedural knowledge, it is relatively context-free. (Mandler 2004: 55)
Similarly to Mandler (2004), we consider only declarative knowledge, which consists of mental representations, to be conceptual. Pre-representational cognition is no less important, and probably even more important, for our survival, but it needs to be clearly distinguished from conceptual knowledge, and hence we will say that its major "building blocks" are sensorimotor schemas and categories, but not concepts. We can now explain the prefix "pre" in our title: Pre-representational (pre-conceptual) knowledge precedes representational cognition in (a)
phylogenesis: all animals have it, but it is only certain that human beings have mental representations - though it is possible that at least some animals such as the great apes have some forms as well, e.g. dyadic mimesis (Zlatev this volume), (b) ontogenesis: the sensorimotor cognition of the young infant is pre-conceptual in the sense of Piaget, though not for as long as Piaget assumed: at least 9-month-old babies can recall past experiences, and that implies representations (Mandler 2004) and (c) microgenesis: pre-representational schemas and categories typically operate faster than conscious representational thought, and continue to serve as a constant backdrop to representational cognition, even when we have progressed past the sensorimotor period. Table 1 summarizes some of the properties of representational cognition and mental representations, contrasting these with properties of pre-representational cognition.

Table 1. Comparison of mental representation and pre-representational cognition along a number of dimensions.

| Dimension | Mental representations | Pre-representational cognition |
|---|---|---|
| Expression-content differentiation | + | - |
| Consciousness | accessible | inaccessible |
| Major functions | thought, recall, planning | perception, recognition, self-motion |
| Learning | faster | slower |
| Evolution | later | earlier |
| Ontogeny | later | earlier |
| Guide to behavior | slower | faster |
| Type of memory | declarative | procedural |
| Status | conceptual | pre-conceptual |
| Example structures | mental images, mimetic schemas, symbols | interaction patterns, sensorimotor schemas |
As shown in Table 1, the two sorts of cognition have complementary properties, and an account of human cognition, and possibly even that of certain "higher" animals, involves both types and their coordination. While we have here emphasized their differences in order to make a clear conceptual distinction, we need to mention two caveats, lest we be accused of "dichotomizing". First, since mental representations are evolutionarily and
ontogenetically later than pre-representational cognition, they can be said to "emerge" from it, and hence will not be completely independent from their pre-representational, "bodily" roots. Second, and as implied above, any particular form of "higher cognition", such as language use, will inevitably involve pre-representational, as well as representational structures and processes. Nevertheless, we emphasize that mental representations cannot be reduced to pre-representational cognition: they constitute a qualitatively new ontological level, and in this sense, we are in disagreement with the claims of strong evolutionary continuity of e.g. Johnson and Rohrer (this volume). Language is based on mental representations, but involves yet one higher ontological level: that of consensual social reality, mutual knowledge (Itkonen 1978, 2003; Zlatev this volume). At the same time, it is not completely independent from the pre-representational, sensorimotor roots of all cognition. In the remainder of this chapter, we will show some evidence for this claim by first explicating two related forms of pre-representational cognition, dynamical categories and internal meaning space, and then show how they "map" onto certain aspects of language.
3. Dynamical categories
The concept of dynamic categorization originates from Gibson's (1962, 1979) "ecological" theory of perception (see also Costall this volume), and in particular his emphasis on perception as a form of activity, which is currently often referred to as active perception. Gibson's insights have been developed and re-interpreted in multiple ways. Sasaki (2000, 2002) applied a Gibsonian analysis to various situations such as a blind man's navigation patterns in a town, people's usage of the visual landscape and the action structure of breaking an egg. According to Costall (this volume), the most important characteristic of Gibson's concept of meaning is that it is neither equivalent to external sensory input, nor to "representations" generated in one's brain, but constitutes a dynamical, relational category that arises as an active perceiver interacts with an environment. The following aspects are particularly important in re-thinking the concepts of active perception and meaning for the present chapter (see Sasaki 2000, 2002; O'Regan and Noë 2001):
i) Perception can emerge via self-movement.
ii) Perceiving the environment means to explore it.
iii) Any action has inherent multiplicity.
An instance of the first aspect is active touch. Gibson (1962) reports on experiments with blind subjects touching different shapes of cookie cutters. If the cutter was placed on the subjects' palms, they could only determine the correct shape with 50% accuracy. When the cutter was pushed randomly on one's palm, the subjects could tell with 72% accuracy. Only by touching the cutter in a self-guided manner could they recognize the object in more than 95% of the cases. This study also illustrates point (ii) that perception is a form of exploration: As we will discuss in this section, exploration is not just a method to arrive at perception; rather perception is equivalent to the on-going exploratory process. The third aspect is especially important for simulations involving "artificial creatures". An issue that often comes up in this context is how to select the most appropriate set of actions. However, aspect (iii) implies that no discrete action set (a "plan") needs to be prepared in advance. Our body schema (see Gallagher 2005, this volume) has a huge number of degrees of freedom. Even a simple action pattern (e.g. sitting down on a chair) consists of multiple sub-level actions and the exact sequence will be afforded by the environment as the action proceeds. The following example could perhaps clarify the distinction between "abstract" representational knowledge, and the kind of pre-representational know-how that is gained through active perception. In Japan it is customary for children to make paper cranes. We can learn how to make a paper crane by just looking at a picture of the end state of the process and step-by-step instructions on how to get there (as in an IKEA manual). But this kind of knowledge is qualitatively different from that gained by actually making a paper crane. 
One's experience of making the paper crane, the way we fold the paper, how we feel touching origami and hear its rustling, etc.: all these complex perceptive experiences organize our "embodied" dynamical category of a paper crane. While folding origami, we experience trial-and-error everywhere giving rise to an exploratory process. There is no way to form this category just by seeing and memorizing an algorithmic instruction. Therefore not only can representational cognition not be reduced to pre-representational cognition (as argued in Section 2), bodily skills cannot be reduced to (mental) representations either (Dreyfus and Dreyfus 1986).
3.1. Experiments with "artificial creatures"
The notion of active perception is often evoked within the "embodied cognition" approach in the field of cognitive robotics (Pfeifer and Scheier 2001). In this context "embodiment" refers to the spatial/temporal dynamics and the physical constraints of the robot's body in interaction with a given environment. This is what Ziemke (2003a) apparently means by "organismoid embodiment", which is similar to, but nevertheless different from the kind of embodiment that characterizes living systems ("organismic embodiment"). Since phenomenology presupposes biology, i.e. only (some) living beings are capable of qualia (Zlatev 2003; Emmeche this volume), there is no reason to suppose that such "artificial creatures" have any kind of phenomenal experience. Nevertheless, they can illustrate how categorization can result from mastering sensory-motor coordination with certain aspects of the environment, and in the process creating dynamical categories of these without forming concepts.
Recently, there have been many studies of dynamic categorization involving "artificial creatures", some of which are summarized in Table 2 below. Some of these studies involve actual, physical robots (Scheier and Pfeifer 1995), while others involve computational simulations (Morimoto and Ikegami 2004). While some scholars such as Brooks (1999) and Steels (1994) insist on the importance of using real physical devices in "embodied AI", there are good reasons to rely on simulations as well. As Ziemke (2003b: 390) points out, "instead of focusing on one experiment or a few experiments with a real robot, many questions are more suitably addressed through a large number of experiments allowing for more variations on agent morphologies, control architectures, and/or environments."
Furthermore, it should be remembered that no matter the type of the experiment, such studies involve models which simulate rather than duplicate the phenomenon being studied (Searle 1992; Zlatev 2003), and this is more easily forgotten if the model is a "real" 3-dimensional, physical structure. Another thing that is often forgotten, or at least not explicitly emphasized, but which is central for our discussion, is the difference between the (models of the) categories of the artificial creatures, and the corresponding (human) concepts such as LARGE/SMALL. The two are quite different: the first are "sub-personal" and pre-representational, the second are personal, (potentially) shared and representational. What the models simulate is categories of the first type, and it is only we as external observers possessing the respective concepts who can see correspondences between the columns in Table 2.

Table 2. Studies of computer simulations of dynamical categorization. The first column shows the corresponding categories that the "creatures" formed on the basis of physical motion. The second column shows a specification of the differences in conceptual terms. It is only our interpretation that links the two.

| Dynamic categories | Concepts | References |
|---|---|---|
| GRASPING/NON-GRASPING | LARGE/SMALL | Scheier and Pfeifer (1995) |
| NAVIGATION DIFFERENCES | ROOM A/ROOM B | Tani and Nolfi (1999) |
| NAVIGATION DIFFERENCES | TRIANGLE/RECTANGLE | Morimoto and Ikegami (2004) |
| FINGER MOTION DIFFERENCES | SPHERE/CUBIC | Marocco and Floreano (2003); Nolfi and Marocco (2002) |
| APPROACHING/AVOIDING | FAST/SLOW BLINKING | Iizuka and Ikegami (2004) |
To understand this better, we offer a brief summary of how the "creatures" in the first two studies of Table 2 performed classification through dynamic categorization. It is worth having in mind the similarities with Gibson's (1962) psychological experiment involving blind touch mentioned earlier: in both cases perception is performed through self-guided motion and exploration.
Scheier and Pfeifer (1995) conducted studies of robots that learned to discriminate on the basis of the size of an object by their bodily movements. In the experiment, there were pegs of two different sizes, one with a large diameter and one with a smaller one, distributed in an arena. A robot used a gripping wire to pick up pegs after sensing these with its light sensors. The robot could pick up the smaller pegs to carry them over to its nest but not the larger ones. Using (a model of) an unconscious learning process, it came to discriminate small and large pegs. As learning proceeded, the robot neglected large pegs and only tried to pick up the smaller ones. On the basis of this, one could be tempted to say that the robot came to know the concepts of LARGE and SMALL (relative to its own "embodiment"), but that would be a mistake. Rather, by mastering a specific type of sensory-motor coordination, the robots learned the categories GRASPABLE/NON-GRASPABLE, as they apply in this particular context. For example, if we changed the surface texture of the pegs so that some "small" ones would be smooth, they would become difficult to grasp, and would therefore be categorized together with the large ones, showing again the difference between categories and concepts.
Another experiment demonstrated that robots could discriminate between two different rooms by their bodily movements (Tani and Nolfi 1999). The two rooms were distinguished due to their different spatial arrangements of corners and walls. A door that connected the two rooms randomly opened and closed. When the door opened, a robot could move into the other room by chance. After going back and forth between the two rooms, a robot came to "know" which room it was situated in: it came to possess two mutually exclusive neural modules, one for room A and one for room B, and depending on which one was activated it behaved differently. We can say the robot categorized some of the spatial characteristics of its environment, but can we say that it had concepts of the two rooms? The distinctions made in Section 2 lead to a negative answer. The "neural modules" corresponding to the two rooms are not mental representations of the two, since there was no way for the robot to differentiate between the module and (its perception of) the room. There was therefore no way for it to imagine room B, while it was in room A, and to "decide" to go there.3 On the other hand, having concepts of the two rooms implies being able to link the concept to specific perceptual details such as the color of the walls or the different light pattern around a corner. Concepts must be "grounded" in perception - or else they are "empty", as Kant famously pointed out. Our proposal is that dynamical categories play an important role in this grounding.
3.2. Dynamic categorization of object shapes: a case study
In order to gain a better understanding of how such models of dynamic categorization work, and to appreciate both the strengths and the weaknesses of dynamical categories, we here describe in some more detail the study of Morimoto and Ikegami (2004), in which simulated artificial creatures could learn to classify triangles and rectangles through self-guided motion. The goal of the experiment was to see if geometrically well-defined concepts such as TRIANGLE and RECTANGLE could be classified only by local "blind touch" and "exploration".
3. The classification of the two rooms was also dependent upon such contingencies as the dynamics of the door between the two rooms. Indeed it turned out that the open/shut dynamics of the door that connected room A and room B was essential: randomness was needed to achieve successful classification (Tani, private communication).
The "creatures" of Morimoto and Ikegami (2004) "live" in a 2-dimensional world, populated by objects of various shapes and colors, as shown in Figure 1. The objects are 2 different kinds of triangles and 4 different kinds of rectangles. As a creature "explores" the environment, it "touches" some of these objects, and then changes its style of motion. Since the creature's receptive field is highly limited, it cannot perceive the shape of a whole figure. The creatures were "evolved" using a genetic algorithm, which mimics biological evolution, and after thousands of evolutionary generations, using a "fitness function" that favors creatures that avoid triangles and explore rectangles, some creatures succeeded in spending more time touching rectangles and less time touching triangles. The point is that the creatures were not in any way explicitly trained to distinguish triangles from rectangles and the classification was established only as a result of the actual exploration.
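The selection pressure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `fitness` formula (time steps on rectangles minus time steps on triangles), the toy `simulate_contact` stand-in for the 2-dimensional world, and the truncation-selection loop in `evolve` are hypothetical and are not taken from Morimoto and Ikegami (2004). The sketch only shows that the score depends on where a creature spends its time, and never on a label for the shape.

```python
import random


def fitness(steps_on_rectangles, steps_on_triangles):
    """Hypothetical score: reward contact with rectangles, penalize contact with triangles."""
    return steps_on_rectangles - steps_on_triangles


def simulate_contact(genome, n_steps=100):
    """Toy stand-in for one run in the environment (an assumption, not the original simulator).

    genome[0] and genome[1] are clipped to [0, 1] and read as the fraction of
    time steps spent touching rectangles and triangles, respectively.
    """
    p_rect = min(max(genome[0], 0.0), 1.0)
    p_tri = min(max(genome[1], 0.0), 1.0)
    return round(p_rect * n_steps), round(p_tri * n_steps)


def evolve(pop_size=20, generations=50, sigma=0.1, seed=0):
    """Truncation selection plus Gaussian mutation over two-gene genomes."""
    rng = random.Random(seed)
    population = [[rng.random(), rng.random()] for _ in range(pop_size)]

    def score(genome):
        return fitness(*simulate_contact(genome))

    for _ in range(generations):
        # Keep the better half unchanged and refill with mutated copies.
        parents = sorted(population, key=score, reverse=True)[: pop_size // 2]
        children = [[gene + rng.gauss(0.0, sigma) for gene in p] for p in parents]
        population = parents + children
    return max(population, key=score)


best = evolve()
rect_steps, tri_steps = simulate_contact(best)
```

Because the better half of each generation is carried over unchanged, the best score in this toy loop can never decrease, so the surviving genome ends up spending more simulated time on rectangles than on triangles without any explicit shape label entering the computation.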
Figure 1. Part of a 2-dimensional environment showing (a) rectangles and triangles and (b) the spatial trail of a "creature", shown as a line. The black parts of the figures represent points of touch between creature and objects (for details, see Morimoto and Ikegami 2004)
Figure 1 shows part of the environment with (a) several triangles and rectangles randomly distributed and (b) narrow lines penetrating the objects. The lines represent a creature's spatial trails and we can see that it bypasses triangles but "explores" rectangles, the black part representing points of contact between creature and object. In particular, it can be seen that the creature has densely covered a rotated square in the top-right corner. How did the creatures achieve this implicit classification? Each creature was controlled by a simple artificial neural network, shown in Figure 2, of the type used in many other similar models (e.g. Ziemke and Thieme 2002). The network receives sensory input in a 3x3 matrix of "neurons", each one of which has a continuous value from 0 to 1.
[Network diagram; labels: Output Neuron, Context Neuron, Input Neuron, sensory stimuli]
Figure 2.
The design of the neural network that controlled the artificial creature's "behavior" (Adapted from Morimoto and Ikegami 2004)
A set of connections projects from the sensory input to the motor output, both directly, and via a number of "context neurons". Each input neuron has "synaptic" connections to the other input neurons, and similarly for the context neurons. Each neuron receives an integrated signal from the other neurons multiplied by the connection weight. The connection weights also take continuous values, and are initially set by the genetic algorithm. The integrated signal filtered by a sigmoid function gives a new neural state of that neuron. This updating schedule (neural states at time T → integration of weighted activities → neural states at time T + 1) is recursively iterated
From pre-representational cognition to language
while a creature is moving around the environment. The output signals from the input neurons are integrated to produce motor outputs. There are three output neurons (L, F and R). The most active output neuron determines the next action of the creature: turn right (R), go straight (F), or turn left (L). The weights of the connections were not fixed, but adapted by a process of Hebbian learning (Hebb 1949), so as to predict the sensory stimuli one step into the future. This plasticity enables dynamic switching from one navigation mode to another: "wandering", "exploring objects" and "filling in" (for details, see Morimoto and Ikegami 2004).
Two important observations concerning the classification and the way in which it was achieved need to be made. First, the creatures could not learn to discriminate between triangles and rectangles completely accurately, but rather formed prototype-like categories. Figure 3 shows an example of a creature's "behavior" towards both different instances of the same object and different objects, given the same initial neural state (i.e. connection weights and neuron activations). It can be noticed that a creature's behavior looks very different depending on where the creature first encounters the object:
a) The first 10 instances of Figure 3, starting from the top, are triangles. As can be seen, the creature enters and leaves these objects without spending much time exploring them.
b) The following 4 instances are squares, and the creature correctly "fills in" these figures more or less completely.
c) The following 6 instances are all parallelograms. The creature fills in the insides of the first three quite thoroughly, but not the other three.
d) The remaining 12 instances are trapezoids. The creature's behavior depends on the orientation of the trapezoid: the first 6 instances are explored but their reversed images are not, and are thus not differentiated from triangles.
Thus, we can conclude that the creature treated squares as the most prototypical type of rectangle, while other rectangular shapes were not consistently distinguished from triangles. The second observation is that the classification is highly context-sensitive. Different instances of the same object were treated differently depending on (a) the creature's internal state and (b) the spatial arrangement of the objects. For example, the order in which the objects were explored (rectangle to triangle or triangle to triangle) changes the internal
state of a creature so that the same triangle is "perceived" differently at different points in time.
Figure 3.
A creature's "imperfect" categorization of triangles (the first 10 figures) and rectangles (the remaining figures). (See the text for discussion)
The key to understanding these features of the model, as well as its relative success in the discrimination task, was the internal context neurons, which functioned as a sort of procedural memory. As shown in Figure 2, the creature's behavior was not simply a product of the coordination between bare sensory input and motor output, but also involved processed
sensory input that was "stored" in the context neurons and the weights of the connections to and from these units. Specific analyses showed that a creature could sometimes be driven by the sensory input (the left circuit) but sometimes by the context neural states (the right circuit), see Figure 2. The conclusion is that in order to perform (prototypical, context-sensitive) triangle/rectangle discrimination, the creature required the combination of raw input signals and the internal neural states. Without this combination, the creatures showed much less diverse behavior and lower performance in discrimination.
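The updating schedule described in this section can be illustrated with a minimal sketch. It is not the implementation of Morimoto and Ikegami (2004): the function names, the weight layout and the number of context neurons are assumptions made here for illustration, and the lateral connections among input neurons, the Hebbian plasticity and the genetic algorithm are all left out. The sketch only shows how sensory input and context states are integrated through weighted sums and a sigmoid, and how the most active of the three output neurons selects the action.

```python
import math

ACTIONS = ("L", "F", "R")  # turn left, go straight, turn right


def sigmoid(x):
    """Squash an integrated signal into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_sum(weights, activities):
    """Integrate incoming activities, each multiplied by its connection weight."""
    return sum(w * a for w, a in zip(weights, activities))


def step(sensors, context, w_ctx_in, w_ctx_ctx, w_out_in, w_out_ctx):
    """One update: states at time T -> weighted integration -> states at T + 1.

    sensors: 9 values in [0, 1] (the 3x3 receptive field, flattened).
    context: current context-neuron states (their number is an assumption).
    Each w_* argument is a list of weight rows, one row per receiving neuron.
    """
    # Context neurons integrate the sensory input and their own previous states.
    new_context = [
        sigmoid(weighted_sum(w_ctx_in[i], sensors) + weighted_sum(w_ctx_ctx[i], context))
        for i in range(len(context))
    ]
    # Output neurons are driven both directly by the input and via the context.
    outputs = [
        sigmoid(weighted_sum(w_out_in[k], sensors) + weighted_sum(w_out_ctx[k], new_context))
        for k in range(len(ACTIONS))
    ]
    # The most active output neuron determines the next action.
    action = ACTIONS[outputs.index(max(outputs))]
    return action, new_context, outputs
```

Because the context states enter the output computation, the same sensory input can yield different output activations at different times, which is the kind of context-sensitivity discussed above.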
3.3.
Further studies of dynamic categorization
In order to further explore the nature of dynamic categorization and specifically the role of internal dynamics, Iizuka and Ikegami (2004) conducted experiments with a new neural network architecture, where "creatures" were able to spontaneously turn on and turn off the signal from the sensory input. If the signal was "off" then behavior was guided solely by the internal dynamics (i.e. memory) of the system, which controlled whether environmental input should be taken in or not. This can be compared to a mechanism that differentially focuses attention on either the environment or on the memory of past experience. Since such differentiation is the first step towards representation, as argued in Section 2, the model of Iizuka and Ikegami (2004) could be seen as a model of a step towards mental representation, though only a relatively small one, since there is of course no awareness of any correspondence between the internal state and the environment. The model was applied to a different task: differentiating between different frequencies of blinking light by approaching or avoiding the light source, respectively. It turned out that the model learned the appropriate behaviors much more easily with the help of the "selective attention" mechanism than without it. The reason was that by being able to selectively direct its attention either to the blinking patterns or to its memory of past interactions, a creature's behavior was not determined so much by the external stimulus as by the style of its own motion in previous similar circumstances.
In sum, the experiments with "artificial creatures" described in this section show how dynamical categories can emerge as a result of sensorimotor interactions with various stimuli, without there being any mental representations (i.e. concepts), or even awareness of these stimuli. We showed that such categories display features such as prototype effects and context-sensitivity, which may be advantageous in some circumstances, but disadvantageous in others. In particular, we suggested that dynamical categorization can be enhanced through internal "context units" or other mechanisms implementing a sort of (procedural) memory, and even more so by the ability to selectively pay attention to the internal states rather than only to the external environment. We use this as a point of departure for a discussion of the role of these internal states, or what we call internal meaning space.
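The idea of a system that itself switches its sensory signal on and off can be illustrated with a deliberately tiny sketch. This is not the architecture of Iizuka and Ikegami (2004): the single "memory" unit, the threshold of 0.5 and all names and weights below are assumptions made here purely for illustration. The point shown is only that, when the gate is closed, the next internal state depends on the internal dynamics alone.

```python
import math


def sigmoid(x):
    """Squash an integrated signal into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_step(sensor, memory, w_gate, w_memory, w_sensor):
    """Return (new_memory, gate_open) for one update of a one-unit toy system."""
    # The internal state alone decides whether sensory input is taken in.
    gate_open = sigmoid(w_gate * memory) > 0.5
    effective_input = sensor if gate_open else 0.0
    # With the gate closed, only the internal dynamics shape the next state.
    new_memory = sigmoid(w_memory * memory + w_sensor * effective_input)
    return new_memory, gate_open
```

With a negative `w_gate` and a positive memory state the gate stays closed, so a blinking and a dark stimulus lead to the same next state; with a positive `w_gate` the stimulus makes a difference.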
4.
Internal meaning space
The experiments described in the previous section illustrated some Gibsonian features such as the active nature of perception and the "multiplicity of action", and how they could emerge from a combination of evolutionary history and a history of spatial exploration. At the same time, we pointed out that the dynamic internal state of the simulated creatures retains a form of long-term memory or personal history (which is something underplayed in Gibsonian theory), and this affects the selection of the current action pattern. This would imply that a living creature with such a memory would experience the world differently from creatures with different histories of structural coupling (Varela, Thompson and Rosch 1991; Ziemke and Sharkey 2001). In this sense, we can say that a creature with an internal state is not simply coordinating sensory input and motor output, but coordinates the meaning of the sensory input with self-motion: it "interprets" its environment, rather than just reacting to it.⁴
In Section 2 we argued that there is a gap between pre-representational cognition and (true) mental representations. The experiments with "artificial creatures" helped clarify this gap, but also showed that the internal dynamics of the control architectures that govern the behaviors of the "creatures" were essential for the formation of adequate dynamical categories. We will here argue that such internal dynamics constitute an internal meaning space. This concept does not "bridge the gap" between sensorimotor and representational cognition, but nevertheless constitutes an important (evolutionary) step in the direction of mental representations.
4. On the basis of his long-term animal experimental studies, Freeman (2003) insists that neural activity patterns in the sensory cortex reflect the meaning of the input rather than the actual stimulus. See also the discussion by Ziemke and Sharkey (2001) concerning von Uexküll's notion of the "historical basis of reaction".
4.1.
Quality dimensions, comparison with Gärdenfors (2000)
Let us explain the senses in which we are using the constituent terms of the notion internal meaning space. The fact that it is "internal" does not mean that it is representational, but only that it is subjective, an emergent categorization of the environment performed by the organism, i.e. its Umwelt (von Uexküll 1940 [1982], see Lindblom and Ziemke this volume). The "meaning" of the space lies in the fact that it consists of a network of dynamical categories which are value-laden. But in what way is it useful to regard it as a "space"? This metaphorical expression becomes justified if, similarly to Gärdenfors (2000, this volume), we regard the internal meaning space to be structured by a number of quality dimensions, such as HEIGHT, BRIGHTNESS, PITCH, VOLUME, WEIGHT and SHARPNESS. These are not just abstract dimensions along which stimuli can be classified, but dimensions that matter to living creatures, in the sense of supporting their self-preservation, and are hence intrinsically meaningful (Zlatev 2003). Also similarly to Gärdenfors, we assume that some of these dimensions have been selected for in evolution and are thus in one sense of the word "innate", while others are learned, or at least modulated, as a result of experience. A final similarity is that we find Gärdenfors's notion of "natural concepts" as convex regions in conceptual space useful and believe that it corresponds to our notion of dynamical categories: both allow for prototype effects, with the prototype corresponding to an attractor in the internal meaning space. The differences between our concepts and those of Gärdenfors are basically two.
First, the notion of internal meaning space, with dynamic categories as attractor states, is considerably more dynamic than what Gärdenfors's model allows (even with the extensions of his model involving forces presented in the present volume): dynamical categories are constructed through sensorimotor activity in the manner illustrated by the simulations in Section 3, and do not simply exist as static "convex regions". Second, we maintain that Gärdenfors conflates pre-representational internal meaning space and true representations by calling both "representations". This is shown clearly in the quotation that he refers to as well:
Evidence suggests that dimensions that are easily separated by adults such as brightness and size of a square are treated as fused together by children. ... For example, children have difficulty identifying whether two objects differ on their brightness or size, even though they can easily see that they differ in some way. Both differentiation and dimensionalization occur throughout one's lifetime. (Goldstone and Barsalou 1998: 252)
We interpret this as evidence that the dimensions of 2-year-old children are still pre-representational, and that only when they develop the capacity to differentiate between the dimensions consciously, or what Gärdenfors (this volume) calls "to reason about the dimensions", do they become truly representational, or conceptual. Similarly, experientially relevant "regions" in internal meaning space are for us natural categories, but not (yet) concepts. This may sound like terminological hair-splitting, but it is not, since we maintain a difference that Gärdenfors does not. This leads Gärdenfors (this volume) into difficulties when it comes to distinguishing between forces as "psychological constructs" and as "scientific dimensions". Most crucially for the purpose of the chapter, we emphasize the pre-representational nature of the internal meaning space since we are interested in clarifying, and to some extent bridging, the gap between non-representational cognition and language, and we suggest that the internal meaning space plays an important role in this respect.
4.2.
Cross-modal transfer and synaesthesia
An important property of the internal meaning space is that it is intrinsically cross-modal. Indeed, Edelman (1992) argues that at least two modalities ("a classification couple") are necessary for any kind of natural categories to emerge. While the experiments described in Section 3 were multi-modal in the sense that they involved simulations of coordination between haptic sensation and self-motion, perceptual experience is much richer than that. The quality dimensions mentioned earlier derive from different modalities: vision (e.g. BRIGHTNESS), auditory system (e.g. PITCH), kinesthetic sense (e.g. WEIGHT) and haptic sense (e.g. SHARPNESS), and the major function of the internal meaning space is to provide an integration of the different modalities. A good deal of this integration is due to experience, by coordinating the modalities through exploration, e.g. as proposed by Piaget (1945). But this cannot explain all the available data, since it appears that some forms of
coordination between modalities, often described in terms of "transfer" or "mappings", are pre-established by evolutionary processes (analogous to the "genetic algorithm" in the simulation described in Section 3). Meltzoff and Borton (1979) showed that infants as young as 1 month looked longer at pacifiers that they had previously explored only orally, displaying a transfer between haptic sense and vision. Perhaps even more famously (and at first controversially), a number of studies (Meltzoff and Moore 1977, 1995) have shown that neonates are able to copy a number of bodily actions such as tongue protrusion and mouth opening, practically from birth (see Gallagher this volume). More recently, such displays of "blind imitation" (the infant cannot see its own body part) have been shown for chimpanzees as well (Myowa-Yamakoshi et al. 2004). These results imply mappings between visual perception of body motion and kinesthetic perception of one's own body and have been linked to specific neural mechanisms such as "mirror neurons" (e.g. Rizzolatti and Arbib 1998). While not offering evidence of innateness, the perception and classification of types of body motion (Johansson 1973, see Gärdenfors this volume) through visually minimal information can also be explained as the result of mapping from vision to the body-schema (Gallagher 2005, this volume). All these mappings, we suggest, take place in the internal meaning space. They do not need to be thought about, or inferred, but are "perceived directly" in phenomenological terms. This is in line with our claims that internal meaning space is pre-representational, in contrast to Gärdenfors (this volume), as well as Meltzoff and Borton (1979) who interpreted their results as showing that infants "represent" objects in an amodal representational format. Rather, we view the internal meaning space as cross-modal, and initially inaccessible to reflective consciousness.
A phenomenon that can be interpreted in terms of a cross-modal internal meaning space is synaesthesia (Cytowic 1995, 2002; Baron-Cohen 1996; Ramachandran and Hubbard 2001). Despite controversies on the mechanisms responsible for it, there is now a consensus that "synaesthesia (Greek, syn = together + aesthesis = perception) is the involuntary physical experience of a cross-modal association" (Cytowic 1995). Clinically, synaesthesia is present in (at least) 1 in 20,000 individuals, with a higher rate for women than men (6:1) and is genetically inherited. Synaesthesia has a number of characteristics:
(a) It is involuntary and "insuppressible": the subject cannot help but, for example, see a certain color on hearing a particular tone;
(b) It can involve all modalities, but some are more common than others (see Section 5 below);
(c) It is usually unidirectional, e.g. different sounds evoke visions, but vision does not typically evoke sound;
(d) It is "projected", i.e. perceived externally as ordinary perception, rather than in "the mind's eye";
(e) It is emotional: either disturbing or rewarding, but never neutral.
The involuntary character of synaesthesia is sufficient to distinguish it from mental representations, and in particular language. While language may involve some similarities with synaesthesia, as we will point out below, we emphasize again that language, as a conventional-normative semiotic system for communication and thought, involves (intersubjective) representations, and is qualitatively different from synaesthesia. To quote Cytowic (1995) again: "Its phenomenology clearly distinguishes it from metaphor, literary tropes, sound symbolism, and deliberate artistic contrivances that sometimes employ the term 'synaesthesia' to describe their multi-sensory joinings." On the other hand, synaesthesia is obviously not "out there" in the objective environment. In our terms, it is clearly pre-representational. While we follow Cytowic in emphasizing its "distinctness" as a phenomenon, synaesthesia is possibly more continuous with ordinary perception than this suggests, and in a sense "we are all synaesthetics" to some degree. Maurer (1993) argues, on the basis of both animal and human evidence, that neonatal cross-modal transfer is essentially synaesthetic, but that it normally disappears after the first few months:
During early infancy - and only during early infancy - [...] evoked responses to spoken language (are recorded) not just over the temporal cortex, where one would expect to find them, but over the occipital cortex as well. There are similar reports of wide-spread cortical responses to visual stimuli during the first 2 months of life. [...] Results such as these suggest that primary sensory cortex is not so specialized in the young infant as in the adult. (ibid: 111, quoted in Baron-Cohen 1996)
Such "neonatal synaesthesia" could possibly explain the phenomenon of neonatal imitation, which we pointed out above is also a form of cross-modal transfer - as well as its "disappearance" after the first 3 months of life. However, cross-modal transfer "reappears" later in life, e.g. in the form of mental simulation (see Svensson, Lindblom and Ziemke this volume), but we would argue that these effects reflect imagination, rather than
perception, and are thus representational. There is thus room for both discontinuity and some continuity between internal meaning space and representational cognition, including language, as we suggest in the next section.
5.
Representation, quasi-synaesthesia and language
What kind of change in internal meaning space is necessary in order to give rise to mental representations? This is a difficult question to which we will here offer a simple and preliminary answer: Representation involves a bifurcation of the internal meaning space into (a) perceptual consciousness and (b) imagination (reflective consciousness). The model of Iizuka and Ikegami (2004), reviewed at the end of Section 3, can serve as an illustration. To the extent that the subject can attend differentially to aspects of the internal meaning space itself, separate from the way they mediate the perception of the external world, that subject will be capable of "mental images". What remains is to understand these images as actually standing for something other than themselves. With this the conditions for mental representations defined in Section 2 will be fulfilled.
How does this "bifurcation" come about? Piaget (1945) argued that imitation plays a central role in this during childhood. Zlatev (this volume) explains this in terms of bodily mimesis: initially the child does not differentiate between the perceived body motion of the other, and the motion of its own body, as in synaesthesia. But with deferred imitation comes differentiation. And with "representational imitation" comes the ability to (consciously) access the mimetic schema, and use it as a model to guide future action. However, an apparent problem for these accounts is posed by children who have been paralyzed, or severely motorically impaired, from birth (e.g. Jordan 1972). Hence, actual imitation cannot be a necessary condition for the emergence of mental representations. Zlatev (2005) attempts to deal with this in terms of covert imitation, but this does not explain where the ability to "covertly imitate" derives from.
At present there does not seem to be any good explanation, but we can here at least offer a description: to have a mental representation is to have one part of internal meaning space, imagination, standing in correspondence with another, perception, and being able to differentially focus attention on one or the other. This description, however, is reminiscent of the phenomenon of synaesthesia reviewed at the end of the last section. The
important differences are that (a) attention is under voluntary control and (b) the "synaesthetic experience" is not projected into the perceptual world, but understood as internal and "unreal" (in normal conditions). If this conjecture is correct, we would expect to find "quasi-synaesthetic" experiences: imagined projections between modalities in the arts, as well as in language. In this section, we will briefly describe two linguistic phenomena which seem to provide some support for the view that at least some forms of linguistic representation may involve what we here call quasi-synaesthesia in order to distinguish it from true synaesthesia as defined in Section 4.2.
5.1.
Quasi-synaesthetic metaphors
What we will refer to as quasi-synaesthetic metaphors are more commonly known as "synaesthetic metaphors" (e.g. Ullman 1964): expressions such as sweet smile, cold look, soft music, and loud color, where experience from one sense ("source domain") is projected to, or mapped onto, a phenomenon that is primarily perceived in terms of another sense. For the expressions given above, that would involve: Taste → Vision, Temperature → Vision, Touch → Hearing, and Hearing → Vision, respectively. While some of these expressions may be quite conventional, others may be more "novel", e.g. bitter chuckles (Thomas Pynchon, Gravity's Rainbow). The specific types of expressions are language- and culture-specific, but the phenomenon is apparently universal (Osgood 1959). Furthermore, many have observed that there appear to be some universal tendencies. Ullman (1964: 86) writes:
[T]he movement of synaesthetic metaphors is not haphazard but conforms to a basic pattern. I have collected data for the sources and destinations of such images in a dozen nineteenth-century poets, French, English and American, and found three tendencies which stood out very clearly: (1) transfers from the lower to the more differentiated senses were more frequent than those that map in the opposite direction: over 80 percent of a total of 2000 examples showed this "upward trend"; (2) touch was in each case the largest single source, and (3) sound the largest recipient.
If these tendencies are found across languages and cultures, then that would appear to imply some general cognitive motivation of semantic structure, of the type often evoked in Cognitive Linguistics. At the same time, "mappings" from less differentiated domains (Taste and Touch) to
more differentiated domains (Vision and Hearing) would appear to go against the grain of Conceptual Metaphor Theory (Lakoff and Johnson 1980, 1999), where rather the opposite is expected: "The greater inferential complexity of the sensory and motor domains gives the metaphors an asymmetric character, with inferences flowing in one direction only." (Lakoff and Johnson 1999: 57-58) Ullman does not specify his evidence in more detail, but Day (1996) reports a study in which he analyzed 1269 (quasi-)synaesthetic metaphors in English texts ranging from Chaucer to Pynchon, specifying the directionality of the mappings. Table 3, adapted from Day (1996, Table 7), classifies the 1269 expressions in terms of these mappings between the "six senses". It can be seen, for example, that Touch was by far the most common source domain, mapping to Vision 135 times (e.g. hard look), while Hearing was, also by far, the most common target domain, exactly as claimed by Ullman (1964).
Table 3.
Classification of the mappings in 1269 quasi-synaesthetic metaphors, based on Day (1996: Table 7), see text for discussion.
Target domain      Source domain
                   Hearing   Taste   Smell   Temp.   Touch   Vision   # Target domain
Hearing                  -     149       1      80     540       86               856
Taste                    0       -       0       0       6        1                 7
Smell                    7      60       -      14      34        3               118
Temp.                    0      19       0       4       8        0                31
Touch                    3      10       0       2       -        -                15
Vision                  26      38       1       -     135       42               242
# Source domain         36     276       2     100     723      132              1269
When Day (1996) subtracted the instances when a sense modality was used as source from those when it was used as target, he arrived at the "sensory ranking" given in (1):
(1) Touch > Taste > Temperature > Smell > Vision > Hearing
Analogously, Day classified the types of synaesthesia from the 25 subjects reported by Cytowic (2002), some of whom had multiple synaesthesias, giving a total of 35 projections. These are given in Table 4, adapted from Day
(1996, Table 6). A comparison with Table 3 shows that, similarly, Hearing was by far the most common target (i.e. when the subjects heard sounds they also experienced sensations from other modalities), but unlike with the metaphors, it was Vision that was the most typical source, which is often stated in descriptions of synaesthesia, typically involving projection of color (e.g. Ramachandran and Hubbard 2001).
Table 4.
Classification of 35 projections in 25 synaesthetic subjects, analyzed by Cytowic (2002), adapted from Day (1996: Table 5), see text for discussion.
Target modality      Source modality
                     Hearing   Taste   Smell   Temp.   Touch   Vision   # Target modality
Hearing                    -       2       1       0       2       21                  26
Taste                      0       -       0       1       1        1                   3
Smell                      0       0       -       1       1        0                   2
Temp.                      0       0       0       -       0        0                   0
Touch                      0       0       0       0       -        2                   2
Vision                     0       0       1       0       1        -                   2
# Source modality          0       2       2       2       5       24                  35
Using the same method of "subtraction" provided the ranking given in (2), which, however, given the small number of subjects and projections, should not be taken too seriously. Nevertheless, it clearly shows that there are both similarities between quasi-synaesthetic metaphors and actual synaesthesia, above all that Hearing is most often "interpreted" as something else, and differences, above all the ranking of Vision.
(2) Vision > Touch > Temperature > Smell > Taste > Hearing
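The "subtraction" behind ranking (2) can be retraced with a few lines of code. This is only an illustrative sketch: the totals are transcribed from the margins of Table 4 as best they can be read, the names are made up for the example, and the sign convention (uses as source minus uses as target, sorted from highest to lowest) is an assumption chosen because it reproduces the order given in (2).

```python
# Marginal totals as read from Table 4 (number of projections per modality).
AS_SOURCE = {"Hearing": 0, "Taste": 2, "Smell": 2, "Temperature": 2, "Touch": 5, "Vision": 24}
AS_TARGET = {"Hearing": 26, "Taste": 3, "Smell": 2, "Temperature": 0, "Touch": 2, "Vision": 2}


def sensory_ranking(as_source, as_target):
    """Rank modalities by (uses as source) minus (uses as target), highest first."""
    balance = {m: as_source[m] - as_target[m] for m in as_source}
    ranking = sorted(balance, key=balance.get, reverse=True)
    return ranking, balance


ranking, balance = sensory_ranking(AS_SOURCE, AS_TARGET)
print(" > ".join(ranking))
# Vision > Touch > Temperature > Smell > Taste > Hearing
```

The balances range from +22 for Vision to -26 for Hearing, and the middle four modalities differ by only a few projections, which illustrates why the ranking should not be over-interpreted.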
The conclusions that we draw from this are in part similar to those drawn by Day (1996): that while the similarities may reflect universal features of human consciousness, and its neural underpinnings, or in our terms, the organization of our internal meaning space, the differences require distinguishing between the mechanisms of synaesthesia and semantic processes: "The meanings for synaesthetic metaphors are not simply there, hard-wired and innate, but are generated through semantic processes and fashioned by time and cultural elements, much like other metaphors" (ibid: 20). We
further add that the differences point to the radical difference between pre-representational cognition and language. The former could possibly motivate some characteristics of the latter (e.g. the tendency to consciously relate sound-based concepts to concepts based on some other modality), but semantics cannot be reduced to pre-representational cognition, and even less so to neural structures (against the claims of e.g. Dodge and Lakoff 2005). Still, we find this line of research intriguing, and pointing to the possible synaesthetic roots of conscious, mental representations. Once the "source" and the "target" modalities can be differentiated, the first can be mapped onto the other, utilizing the quality dimensions of the internal meaning space. If the mapping between the perception of another's actions and those of oneself involves similar mechanisms to those involved in synaesthesia, its differentiation and focus on the "internal image" of the action would be equivalent to (dyadic) covert mimesis, possibly the original form of mental representation, as suggested by Piaget (1945) and Zlatev (2005, this volume).
5.2.
Japanese mimetics
Sound-symbolism is another universal phenomenon of language, but the degree to which particular languages employ it varies. It is comparatively marginalized in Indo-European languages, and unsurprisingly, it was not considered a central feature of language from the birth of modern linguistics in Europe (Saussure 1916). Another reason for its de-appreciation is that it goes against the Saussurian dictum of the "arbitrariness of the linguistic sign". However, while language is conventional, this need not imply arbitrariness, even though these two concepts are often conflated (see Zlatev this volume). Japanese mimetics are highly conventional, but "aspects of the form-meaning relationship are not arbitrary but are motivated by iconicity." (Kita 2001: 419-420). They are also a central feature of the language. Ivanova (2001: 2) provides the following informative characterization of their role in the language:
Japanese is one of the languages with vast sound-symbolic systems [...] with more than 2,000 onomatopoeic and mimetic words. These words overwhelm ordinary speech, literature and the media due to their expressiveness and load of information. Although they are never used in official documents, it is not exceptional to hear them in formal situations, too. People of all ages employ mimetic words in communication, believing that speech that abounds in
such words sounds much more natural and full of life than speech that tends to avoid them.
Classifications of Japanese mimetics differ, but at least four types can be distinguished (Martin 1975; Kita 1997; Ivanova 2001; Baba 2003):
(a) gi-sei-go, words imitative of sounds produced by living creatures, e.g. wan-wan ('bow-wow', the barking of a dog)
(b) gi-on-go, words imitative of sounds produced by the inanimate world, e.g. ban ('bang', a loud sound produced from an object hitting another)
(c) gi-tai-go, words imitative of physical actions, e.g. koro-koro ('light object rolling repeatedly')
(d) gi-jyoo-go, words imitative of psychological states, e.g. muka-tsuku ('irritating')
The first thing that needs to be explained is the sense in which these words can be said to be "imitative", i.e. their iconicity. In the case of (a) and (b), which are similar to the onomatopoetic words that we are all familiar with, this is relatively straightforward: the sound-expression resembles the sound that is produced in the referential scene. Hamano (1998), who is often credited with providing one of the most extensive analyses of Japanese mimetics, attempts to link particular phonemes (e.g. /p/ vs. /d/) and distinctive features (e.g. +/- voice) to specific meaning components. Thus, /p/ is for example claimed to be associated with "light, small, fine", /b/ with "heavy, large, coarse" and /m/ with "murkiness". However, there are problems with this analysis. As Ivanova (2001) points out, mimetic "words with initial /m/ are maza-maza (clearly, vividly), meki-meki (remarkably, fast), miQchiri (hard, severe), moya-moya (hazy, murky), muka-muka (retch, go mad). It is clear that "murkiness" is not their common semantic feature." Instead, she characterizes the expression-meaning correspondence of 199 mimetics of the (c) and (d) types in terms of more general "phonaesthematic patterns", where "phonaesthematic describes the presence of sequence of phonemes shared by words with some perceived common element in meaning" (Ivanova 2001). An example of such a pattern is given in (3).

(3) expression: g/k + V + chi + g/k + V + chi
    meaning: LACK OF FLUIDITY OR SPACE
    examples: gichi-gichi ('very tight'), kachi-kachi ('frozen hard', 'dried up'), kochi-kochi ('tense', 'stiff', 'frozen hard')
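The pattern in (3) can be read as a simple template over romanized forms. The following Python sketch is one hypothetical way to encode it, under two assumptions that are ours and not Ivanova's: that V ranges over the five vowels a, i, u, e, o, and that the second half of the word repeats the first half exactly (as in all three examples), which is stricter than the template itself. The name `matches_pattern_3` is purely illustrative.

```python
import re

# Pattern (3): g/k + V + chi + g/k + V + chi, written over romanized forms.
# Assumption: V is one of a, i, u, e, o.
# Assumption: the two halves are identical (full reduplication), hence the
# backreference \1; the template in (3) does not strictly require this.
PATTERN_3 = re.compile(r"([gk][aiueo]chi)-\1")


def matches_pattern_3(word: str) -> bool:
    """Return True if a hyphenated romanized mimetic fits pattern (3)."""
    return PATTERN_3.fullmatch(word) is not None


if __name__ == "__main__":
    for w in ["gichi-gichi", "kachi-kachi", "kochi-kochi", "koro-koro", "muka-muka"]:
        print(w, matches_pattern_3(w))
```

A matcher of this kind only checks the form side of the pattern; whether a matching word actually carries the meaning LACK OF FLUIDITY OR SPACE still has to be established separately.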
What this approach leaves out, however, are generalizations that apply within and across the patterns, such as that the contrast +/- voice may be used to distinguish WEIGHT, e.g. koro-koro vs. goro-goro 'heavy object rolling repeatedly', or VOLUME, e.g. chara-chara 'few coins rattling' vs. jara-jara 'many coins rattling', as when the coins "come out of a slot machine when one hits the jackpot" (Baba 2003: 1868). Finally, neither of these accounts explains in which way the expressions of the mimetics "imitate" or "resemble" their meanings, a problem that is even more pronounced with respect to gi-jyoo-go, sometimes also called "psychomimes". How can we make sense of the suggestion that their expression is mimetic with respect to psychological states? We believe that a key to this puzzle is offered by the presence of crossmodal mappings in internal meaning space. In their investigations of synaesthesia, Ramachandran and Hubbard (2001) describe experiments in which ordinary subjects were given contrasting pictures of objects of different shapes, for example one which was roundish and "soft" and another which was edgy and "sharp". Then they were given two "names", e.g. kiki and bouba, and asked to pair name and object. Just as expected, 95% of the subjects paired kiki with the sharp object and bouba with the roundish one.5 Why should this be the case? If we start from the shapes, the crossmodal mapping between vision and touch would allow them to be perceived as "soft" vs. "sharp", motivating the use of these quasi-synaesthetic metaphors as a natural way to describe these figures. From the side of the expressions, the production of the velar stop /k/, even more so combined with the front, unrounded vowel /i/, involves obstructions and narrowings in the vocal tract, which can similarly be perceived as "sharp" and "edgy".
On the other hand, the shape of the vocal tract and the lips in the production of /u/ in bouba are quite literally "roundish" and the passage of the air is "soft". The mappings between the senses Vision-Touch-Proprioception-Sound in internal meaning space thus provide for a correspondence between the shapes and the labels that would be impossible otherwise. A robot or a Martian with a very different kind of body (and possibly even a person lacking haptic sense and proprioception) would not be able to perceive the iconicity involved.6

5. The experiment was essentially a replication of a classical experiment performed by Köhler (1929), who called the figures takete and baluma.
6. Ramachandran and Hubbard (2001: 18) note that a patient with damage to the angular gyrus, a cortical structure situated between the temporal, parietal and occipital lobes, "showed no propensity for the bouba/kiki effect". This is intriguing since the angular gyrus is considered to play a role in cross-modal transfer, and may even be important for the comprehension of (novel) quasi-synaesthetic metaphors.

Returning to Japanese mimetics, we can suggest that they are quasi-synaesthetic in a similar way to kiki and bouba. This can explain some of the contrasts, such as +/- voice, since voicing involves a higher degree of energy, both in terms of perception and production. But notice that even for kiki and bouba it is not possible to figure out what they would mean solely on the basis of the iconicity of the expressions; one can only match them to the "correct" shape when provided with a few shapes to choose from. Sonesson (2001) makes an important distinction between primary iconicity, in which the similarity between A and B can be perceived even without knowing that A is a representation of B, and secondary iconicity, in which perceiving any similarity between A and B requires knowing that A is a representation (sign) of B. Realistic pictures, photographs and pantomime can be interpreted by virtue of primary iconicity. On the other hand, the iconicity of diagrams in which "up stands for more" or of onomatopoetic words like "bow-wow" can be appreciated only when one understands their representational expression-content structure. We would like to suggest that the distinction between primary and secondary iconicity is more of a cline, defined by the degree to which the "sign function" (i.e. knowing what an expression represents) is necessary for perceiving the similarity involved in iconicity. From this perspective, the bouba/kiki phenomenon is somewhat intermediary in the cline. Synchronically speaking, the iconicity of Japanese mimetics must be secondary rather than primary: once the child learns what concepts they express as part of the language acquisition process, some of the similarities can be perceived. It is much more difficult to answer the diachronic question: how did the particular set of mimetics emerge in the first place? But it is clear that it must have involved a social, collaborative process, and not just Japanese speakers spontaneously externalizing their specific dynamic categories in speech. Furthermore, since Japanese mimetics are conventional expressions, the motivation behind their meaning will be mediated by cultural norms and analogies to other expressions in the language, and hence often difficult to perceive. For example, while the gi-tai-go mimetic noro-noro ('drag oneself', 'walk slowly') can possibly be related to the "laxness" of the nasal /n/ and the round central vowel /o/, the
meaning of the gi-jyoo-go mimetic noko-noko ('nonchalantly') is less transparent, and possibly co-motivated by its analogy to noro-noro. We can conclude therefore that Japanese mimetics take a somewhat intermediary place between iconic representations such as pictures and pantomimes (which can be interpreted even by virtue of primary iconicity), and fully symbolic and propositional language. While they may bear traces of their pre-representational roots, in particular by relying on the crossmodal mappings of the internal meaning space, they are clearly conventionalized linguistic representations, consisting of socially shared expressions and contents. These contents, however, appear to be more subjective, somewhat difficult to define, and very difficult for second language learners of Japanese (Ivanova 2002). This conclusion is fully consistent with the analysis of Japanese mimetics presented by Kita (1997, 2001). Kita argues that the meaning of Japanese mimetics is (primarily) represented in an affecto-imagistic dimension, where "language has direct contact with sensory motor and affective information" (Kita 1997: 380) and "vivid imagery of perceptual and physiological experiences" (Kita 2001: 420). In contrast, Kita argues that the meaning of non-mimetic expressions constitutes an analytic dimension, including "quantifiers, logical operators, and semantic categories such as agent, patient and action" (Kita 1997: 1863). Apart from the iconicity of Japanese mimetics discussed above, Kita provides the following types of evidence for the need to invoke two different kinds of representations for mimetic and non-mimetic expressions.
(a) A mimetic such as suta-suta ('walk hurriedly') does not lead to redundancy when combined with a semantically overlapping non-mimetic expression such as haya-aruki ('walk hastily') in a single clause. In comparison, the combination of the latter and another overlapping adverbial such as isogi-ashi ('hurriedly') does lead to an impression of "wordiness".
(b) A clause with an (adverbial) mimetic cannot be combined with sentence negation.
(c) The production of a mimetic is highly associated with expressive intonation and spontaneous iconic gestures: 95% of the mimetics produced in a study were precisely synchronized with an iconic gesture, compared to only 36% of the verbs.
At the same time, Kita (2001) clarifies that this evidence concerns only so-called "adverbial mimetics", which are most common and least integrated into the grammar of the language, rather than cases when mimetics are used as verbs, nouns, and noun-modifiers. Thus there is not only close integration between the two dimensions, as Kita points out, but also evidence for a cline between them, with nominal mimetics being the most grammaticalized and "analytical". This interpretation is furthermore supported by a recent pragmatic study of the use of mimetics in four different spoken registers, characterized by different levels of "emotive intensity" (Baba 2003). It was found that the total use of mimetics indeed correlated with the intensity level, and that gi-jyoo-go (the "psychomimes") were used only when the episode was narrated from a first-person perspective, involving the highest degree of subjectivity. At the same time, the nominal use of mimetics was most typical of the least emotive and most detached of the four levels. Finally, we can link these studies to the concept of bodily mimesis and the mimetic schema, a mental representation involving bodily simulation that is prelinguistic, and arguably precedes and "grounds" language in phylogeny and ontogeny (Zlatev, Persson and Gärdenfors 2005; Zlatev 2005, this volume). Mimetic schemas are dynamical structures of consciousness involving the body image, used in pre-linguistic thought and externalized in body movements and gestures. The meaning of adverbial Japanese mimetics would thus appear to correspond rather directly to mimetic schemas, even more so than the meanings of verbs, while those of nominal and verbal mimetics, as well as non-mimetic expressions, would qualify as "post-mimetic". Thus our distinction mimetic/post-mimetic appears to correspond quite closely to Kita's distinction between the "affecto-imagistic" and the "analytic" dimension.
Both dimensions are necessary and need to be integrated for effective communication, especially in literature. Consider the following passage, taken from the novel And Then by Soseki Natsume:

Turning to the head of his bed, he noticed a single camellia blossom that had fallen to the floor. He was certain he had heard it drop during the night; the sound had resounded in his ears like a rubber ball bounced off the ceiling. Although he thought this might be explained by the silence of the night, just to make sure that all was well with him, he had placed his right hand over his heart. Then, feeling the blood pulsating correctly at the edge of his ribs, he had fallen asleep.7
The meaning conveyed by this passage consists only in part of its propositional, analytic content, representing subjective and objective states-of-affairs such as the protagonist turning to the head of the bed, a flower lying on the floor, his memory of a loud sound, placing the hand over the heart, etc. What we could call the "embodied meaning" (if the phrase were not overused nowadays) supplements this by the reader identifying with the protagonist and mimetically experiencing the situation from the protagonist's point of view. Notice that Natsume explicitly mentions four different sensory modalities: Proprioception (turning, placing the hand, blood pulsating), Vision (noticing), Hearing (dropping, bouncing), Touch (feeling). Due to the cross-modal connections of the internal meaning space, there are quasi-synaesthetic experiences as well: the smell of the camellia blossom, and the temperature (warmth) of the blood. It is possible to associate further in the affecto-imagistic dimension. For example, for one of us (the native speaker of Japanese), mentioning the pulsating of blood and the heart brings to mind the color red. Hence the color of the falling blossom is also perceived (in imagination) as red!8 The chain of subjective quasi-synaesthetic experiences can run on: the color red induces a memory of a red sunset. The memory of cold air at sunset stimulates the olfactory senses and that further stimulates the tactile feeling of a cold handrail... However, since these meanings are not conventional, they are bound to remain rather private associations. For example, the second (and non-Japanese) author "sees" the fallen flower as white, while a young Swedish poet (reading the English translation) is convinced that it is pink... Such indeterminacy has its advantages, as in the interpretation of poetry, but is problematic if it is essential for "sender" and "receiver" to be able to share similar experiences.
7. This constitutes a literary translation, by Norma Field, of the original passage, which reads: Makura moto wo miruto, yae no tubaki ga itirin tatami no ueni otiteiru. Daisuke ha yuube tokono nakade tashikani kono hana no ochiru oto wo kiita. Kare no miminiha, sore ga gomumari wo tenjyou-ura kara nagetuketa hodo ni hibiita. Yoru ga fukete, atari ga sizukana seikatomo omottaga, nennotame, miginote wo shinzou no ueni nosete, abara no hazureni tadashiku ataru chi no oto wo tashikamenagara nemuri ni tuita.
8. And similarly for another native Japanese speaker, Misuzu Shimotori.
Japanese mimetics offer the advantage of being conventional, and thus less idiosyncratic, while at the same time evoking the affecto-imagistic dimension. In a passage from another novel, Wayfaring Diary, Natsume uses a number of mimetic expressions describing the spatial movements of the first-person protagonist, as well as of other personae.

A flight of stone steps brings back my memories. Some time I strolled around the "Five Mountains". Just like today I was sluggishly walking up the stairs which leads to the residence of monks in Engaku temple or somewhere else. Out of the gate appeared a monk in a yellow robe. He had a flatcrowned head. I ascended while he descended. When we came across, he asked in a sharp voice, "Where are you going?" I answered, "To see the Precincts", and stopped. "There's nothing within the precinct", the monk gave an immediate answer as he left quickly down the stairs.
The mimetics used in place of the highlighted parts are guru-guru ('strolling around'), nosori-nosori ('sluggishly') and suta-suta ('walk hurriedly'), vividly capturing the contrast between the protagonist's motion (and, metaphorically, his state of mind) and that of the monk.
6. Summary and conclusions
The goal of this chapter has been twofold. On the one hand, we have tried to emphasize the following qualitative differences in human cognition: interaction is different from representation; procedural knowledge is different from declarative knowledge; dynamical categories are different from concepts; synaesthesia is different from language; the meaning of Japanese mimetics (the affecto-imagistic dimension) is different from the meaning of e.g. quantifiers (the analytical dimension). We believe that such emphasis on differences is necessary because there is a persistent tendency in embodiment theories, e.g. the "full embodiment" approach advocated by Núñez (1999), to "resolve" such oppositions by ignoring the differences, and thus, in effect, reducing or eliminating the second and more "disembodied" side of the oppositions. In our view, such an approach is inadequate since we believe that conceptual - and in some
cases even ontological - differences between different levels need to be maintained. At the same time, our second goal has been to explore how structures and processes of pre-representational cognition such as dynamical categories, internal meaning space, and synaesthesia can play a role in the "grounding" of mental representations (concepts) and language, i.e. provide evolutionary and ontogenetic prerequisites for the emergence of the latter. Figure 4 summarizes our general picture of the major different levels of meaning in a pyramid of semiotic development. The rock-bottom of cognition is life itself, the sine qua non of all meaning (von Uexküll 1940 [1982]; Maturana and Varela 1987; Zlatev 2003). Value systems derived from natural selection in evolution control the behavior and learning of living organisms. Using the complex systems modeling approach in Section 3, we showed how simple "artificial creatures" can form dynamical categories through sensorimotor coordination without any representational ability. At the same time it was suggested that by distinguishing perception and memory of past experience, categorization can be enhanced. Cross-modal mappings of various sorts are prevalent in the internal "space" which defines a coherent, multi-modal world for the subject. Synaesthesia may be only the tip of the iceberg of this, showing the importance of both correlating and differentiating modalities. The crucial step between pre-representational and representational cognition occurs, we suggested, with the bifurcation of the internal meaning space into a part that focuses on the external world, and another that "looks" into memory (i.e. recall) and the projected future (i.e. planning). A necessary step for this, as with any representation, is to acknowledge that a certain "expression" both corresponds to and is different from a certain "content".
It is possible that this breakthrough occurred precisely with bodily mimesis, making the body image the first true "signifier", but this remains so far only a conjecture. In any case, the presence and role of mimetic schemas is most clearly shown in iconic gesturing, which is universal and ubiquitous. The close synchronization of gesture and speech can be explained if mimetic schemas underlie both, with iconic gestures, and certain structures of language such as mimetic expressions, being more directly related to the mimetic, imagistic dimension, while most of the unique properties of language are qualitatively distinct from mimesis in being symbolic, propositional and fully conventional: the top triangle of the pyramid. Of course, just like the Cat in the Hat of Dr. Seuss, we could cut up this top into smaller and smaller slices, with the more "abstract" ones
being on the top: written language, mathematics, symbolic logic ... On the other hand, actual language use does not involve the "top" alone, but the whole pyramid, and at least the imagistic dimension involved in mental simulation (imagination) and the quasi-synaesthetic associations of the internal meaning space. This is, in brief, our view of how language can be "embodied", while at the same time remaining conceptually and ontologically irreducible to sensorimotor experience.
[Figure 4: pyramid diagram. Surviving layer labels: Dynamical categories; Autopoiesis and intrinsic value (Life).]

Figure 4. From Life to Language. Autopoiesis and intrinsic value preserving the identity of the organism, essential properties of life, constitute the primary basis for cognition and consciousness. The interconnectedness of dynamical categories along common quality dimensions provides an internal meaning space. Bifurcations of this space, possibly arising from self-other separation in overt or covert imitation, give rise to bodily mimesis and mimetic schemas, which are consciously accessible mental representations. These are imagistic and affect-laden, and can be said to ground language in ontogeny and phylogeny. Only the highest two layers involve mental representations.
Acknowledgements

We wish to thank the collaborators of the first author: Ryoko Uno, Hiroyuki Iizuka and Gentaro Morimoto, as well as Göran Sonesson, Tom Ziemke, Sotaro Kita and Misuzu Shimotori for their valuable comments on an earlier draft. The first author was partially supported by a grant-in-aid from The 21st Century COE (Center of Excellence) program (Research Center for Integrated Science) of the Ministry of Education, Culture, Sports, Science and Technology, Japan, the Kayamori foundation and the ECAgent project, sponsored by the Future and Emerging Technologies program of the European Community (IST-1940). The second author was supported by the Language, Gesture and Pictures in Semiotic Development project at the Faculty for Humanities and Theology at Lund University, Sweden, and the EU-project Stages in the Evolution and Development of Sign Use (SEDSU).
References

Baba, Junko
  2003  Pragmatic function of Japanese mimetics in the spoken discourse of varying emotive intensity levels. Journal of Pragmatics 35: 1861-1889.
Baron-Cohen, Simon
  1996  Is there a normal phase of synaesthesia in development? PSYCHE 2 (27), June 1996, http://psyche.cs.monash.edu.au/v2/psyche-2-27-baron_cohen.html
Brooks, Rodney
  1999  Cambrian Intelligence. Cambridge, Mass.: MIT Press.
Costall, Alan
  this vol.  Bringing the body back to life: James Gibson's ecology of embodied agency.
Chalmers, David
  1997  The Conscious Mind: In Search of a Fundamental Theory. Oxford: Oxford University Press.
Cytowic, Richard
  1995  Synaesthesia: Phenomenology and neuropsychology. PSYCHE 2 (10), July 1995, http://psyche.cs.monash.edu.au/v2/psyche-2-10-cytowic.html
  2002  Synaesthesia: A Union of the Senses. Second edition. Cambridge, Mass.: MIT Press.
Damasio, Antonio
  2000  The Feeling of What Happens. Body, Emotion and the Making of Consciousness. New York: Harvester.
Day, Sean
  1996  Synaesthesia and synaesthetic metaphors. PSYCHE 2 (32), July 1996, http://psyche.cs.monash.edu.au/v2/psyche-2-32-day.html
Dodge, Ellen and George Lakoff
  2005  On the neural bases of image schemas. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics. Berlin: Mouton de Gruyter.
Dreyfus, Hubert and Stuart Dreyfus
  1986  Mind over Machine. The Power of Human Intuition and Expertise in the Era of the Computer. New York: Free Press.
Edelman, Gerald
  1992  Bright Air, Brilliant Fire: On the Matter of the Mind. London: Basic Books.
Emmeche, Claus
  this vol.  On the biosemiotics of embodiment and our human cyborg nature.
Fodor, Jerry A.
  1981  Representations. Cambridge, Mass.: MIT Press.
Freeman, Walter
  2003  A neurobiological theory of meaning in perception. International Journal of Bifurcation and Chaos 13 (9): 2493-2511.
Gallagher, Shaun
  2005  How the Body Shapes the Mind. Oxford: Oxford University Press.
  this vol.  Phenomenological and experimental contributions to understanding embodied experience.
Gärdenfors, Peter
  2000  Conceptual Spaces. The Geometry of Thought. Cambridge, Mass.: MIT Press.
  this vol.  Representing actions and functional properties in conceptual spaces.
Gardner, Howard
  1987  The Mind's New Science. London: Basic Books.
Gibson, James J.
  1962  Observations on active touch. Psychological Review 69: 477-491.
  1979  The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Goldstone, Robert L. and Lawrence W. Barsalou
  1998  Reuniting perception and conception. Cognition 65: 231-262.
Hamano, Shoko
  1998  The Sound-symbolic System of Japanese. Stanford, CA: CSLI.
Hebb, Donald O.
  1949  The Organization of Behaviour. New York: John Wiley and Sons.
Husserl, Edmund
  1962  Phänomenologische Psychologie. Husserliana IX. The Hague: Nijhoff.
Itkonen, Esa
  1978  Grammatical Theory and Metascience. Amsterdam: Benjamins.
  2003  What is Language? Turku: University of Turku Press.
Iizuka, Hiroyuki and Takashi Ikegami
  2004  Simulating autonomous coupling in discrimination of light frequencies. Connection Science 16 (4): 283-299.
Ivanova, Gergana
  2001  On the relation between sound, word structure and meaning in Japanese mimetic words. Iconicity in Language. http://www.trismegistos.com/IconicityInLanguage/Articles/Ivanova.html
Johansson, Gunnar
  1973  Visual perception of biological motion and a model for its analysis. Perception and Psychophysics 14: 201-211.
Johnson, Mark
  1987  The Body in the Mind. Chicago: University of Chicago Press.
Johnson, Mark and George Lakoff
  2002  Why cognitive linguistics requires embodied realism. Cognitive Linguistics 13 (3): 245-263.
Johnson, Mark and Tim Rohrer
  this vol.  We are live creatures: Embodiment, American Pragmatism and the cognitive organism.
Jordan, N.
  1972  Is there an Achilles heel in Piaget's theorizing? Human Development 15: 379-382.
Kita, Sotaro
  1997  Two-dimensional semantic analysis of Japanese mimetics. Linguistics 35: 379-415.
  2001  Semantic schism and interpretive integration in Japanese sentences with a mimetic: a reply to Tsujimura. Linguistics 39: 419-436.
Kaneko, Kunihiko and Takashi Ikegami
  1998  Evolutionary Scenario of Complex Systems Studies. Tokyo: Asakura.
Kaneko, Kunihiko and Ichiro Tsuda
  2000  Complex Systems: Chaos and Beyond: A Constructive Approach with Applications in Life Sciences. Berlin: Springer-Verlag.
Köhler, Wolfgang
  1929  Gestalt Psychology. New York: Liveright.
Lakoff, George
  1987  Women, Fire and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson
  1980  Metaphors We Live By. Chicago: University of Chicago Press.
  1999  Philosophy in the Flesh: The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Lindblom, Jessica and Tom Ziemke
  this vol.  Embodiment and social interaction: A cognitive science perspective.
Mandler, Jean
  2004  The Foundations of Mind: Origins of Conceptual Thought. Oxford: Oxford University Press.
Marocco, Davide and Dario Floreano
  2003  Active vision and feature selection in evolutionary behavioral systems. In: Bridget Hallam, Dario Floreano, John Hallam, Gillian Hayes and Jean-Arcady Meyer (eds.), From Animals to Animats VII: Proceedings of the 7th International Conference on Simulation of Adaptive Behavior, 247-255. Cambridge, Mass.: MIT Press.
Martin, Samuel E.
  1975  A Reference Grammar of Japanese. New Haven: Yale University Press.
Maturana, Humberto and Francisco Varela
  1987  Tree of Knowledge: The Biological Roots of Human Understanding. Boston: Shambhala.
Maurer, Daphne
  1993  Neonatal synaesthesia: implications for the processing of speech and faces. In: Benedict de Boysson-Bardies (ed.), Developmental Neurocognition: Speech and face processing in the first year of life, 109-124. Dordrecht: Kluwer.
Meltzoff, Andrew N. and Richard W. Borton
  1979  Intermodal matching by human neonates. Nature 282 (5737): 403-404.
Meltzoff, Andrew and Michael Moore
  1977  Imitation of facial and manual gestures by human neonates. Science 198: 75-78.
  1995  Infants' understanding of people and things: From body imitation to folk psychology. In: Jose Bermudez, Naomi Eilan, and Anthony Marcel (eds.), The Body and the Self, 43-70. Cambridge, Mass.: MIT Press.
Morimoto, Gentaro and Takashi Ikegami
  2004  Evolution of plastic sensory-motor coupling and dynamic categorization. In: Jordan Pollack, Mark Bedau, Phil Husbands, Takashi Ikegami and Richard A. Watson (eds.), Artificial Life IX: Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems, 188-193. Cambridge, Mass.: MIT Press.
Myowa-Yamakoshi, Masako, Masaki Tomonaga, Masayuki Tanaka and Tetsuro Matsuzawa
  2004  Imitation in neonatal chimpanzees (Pan troglodytes). Developmental Science 7 (4): 437-442.
Nolfi, Stefano and Davide Marocco
  2002  Active perception: A sensorimotor account of object categorization. In: Bridget Hallam, Dario Floreano, John Hallam, Gillian Hayes and Jean-Arcady Meyer (eds.), From Animals to Animats VII: Proceedings of the 7th International Conference on Simulation of Adaptive Behavior, 266-271. Cambridge, Mass.: MIT Press.
Núñez, Rafael
  1999  Could the future taste purple? Reclaiming mind, body and cognition. Journal of Consciousness Studies 6 (11/12): 41-60.
O'Regan, J. Kevin and Alva Noë
  2001  A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences 24: 939-1011.
Osgood, Charles E.
  1959  The cross-cultural generality of visual-verbal synaesthetic tendencies. Behavioral Science 5: 146-169.
Piaget, Jean
  1945  La formation du symbole chez l'enfant. Neuchâtel-Paris: Delachaux et Niestlé; English translation: G. Gattegno and F. M. Hodgson. Play, Dreams and Imitation in Childhood. New York: Norton, 1962.
Pfeifer, Rolf and Christian Scheier
  2001  Understanding Intelligence. Cambridge, Mass.: MIT Press.
Port, Robert and Timothy Van Gelder
  1998  Mind as Motion: Explorations in the Dynamics of Cognition. Reprint Edition. Cambridge, Mass.: MIT Press.
Ramachandran, Vilayanur S. and Edward M. Hubbard
  2001  Synaesthesia: A window into perception, thought and language. Journal of Consciousness Studies 8 (12): 3-32.
Rizzolatti, Giacomo and Michael Arbib
  1998  Language within our grasp. Trends in Neurosciences 21: 188-194.
Rosch, Eleanor
  1973  Natural categories. Cognitive Psychology 4: 328-350.
Sasaki, Masato
  2000  Chikaku ha Owaranai [Perception Never Stops - Introduction to Affordance]. Tokyo: Seido-sha.
  2002  Sentai-Shinrigaku no Kousou [Grand Design of Affordance]. Tokyo: University of Tokyo Press.
Saussure, Ferdinand de
  1916  Cours de Linguistique Générale. Paris: Payot.
Scheier, Christian and Rolf Pfeifer
  1995  Classification as sensory-motor coordination: A case study on autonomous vehicles. In: Federico Moran, Alvaro Moreno, Juan J. Merelo and Pablo Chacon (eds.), Proceedings of the 3rd European Conference on Artificial Life, 657-667. Berlin: Springer-Verlag.
Searle, John
  1992  The Rediscovery of the Mind. Cambridge, Mass.: MIT Press.
Sonesson, Göran
  1989  Pictorial Concepts. Lund: Lund University Press.
  2001  From semiosis to ecology. On the theory of iconicity and its consequences for the ontology of the Lifeworld. VISIO 6/2-3: 85-110.
  this vol.  From the meaning of embodiment to the embodiment of meaning: A study in phenomenological semiotics.
Steels, Luc
  1994  The artificial life roots of artificial intelligence. Artificial Life 1: 75-110.
Svensson, Henrik, Jessica Lindblom and Tom Ziemke
  this vol.  Making sense of embodied cognition: Simulation theories of shared neural mechanisms for sensorimotor and cognitive processes.
Tani, Jun and Stefano Nolfi
  1999  Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems. Neural Networks 12: 1131-1141.
Thompson, Evan
  2001  Empathy and consciousness. Journal of Consciousness Studies 8 (5/7): 1-32.
Ullman, Stephen
  1964  Language and Style. Oxford: Basil Blackwell.
Varela, Francisco, Evan Thompson and Eleanor Rosch
  1991  The Embodied Mind. Cognitive Science and Human Experience. Cambridge, Mass.: MIT Press.
von Uexküll, Jakob
  1940  The theory of meaning. Semiotica 42 (1): 25-82. Reprint 1982.
Ziemke, Tom
  2001  The construction of 'reality' in the robot. Foundations of Science 6 (1): 163-233.
  2003a  What's that thing called embodiment? In: Richard Alterman and David Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum.
  2003b  On the role of robot simulations in embodied cognitive science. AISB Journal 1 (4): 389-399.
Ziemke, Tom and Noel Sharkey
  2001  A stroll through the worlds of robots and animals: Applying Jakob von Uexküll's theory of meaning to adaptive robots and artificial life. Semiotica 134 (1/4): 701-746.
Ziemke, Tom and Mikael Thieme
  2002  Neuromodulation of reactive sensorimotor mappings as a short-term memory mechanism in delayed response tasks. Adaptive Behavior 10 (3/4): 185-199.
Zlatev, Jordan
  2003  Meaning = Life (+ Culture). An outline of a unified biocultural theory of meaning. Evolution of Communication 4 (2): 253-296.
  2005  What's in a schema? Bodily mimesis and the grounding of language. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 313-342. Berlin: Mouton de Gruyter.
  this vol.  Embodiment, language and mimesis.
Zlatev, Jordan, Tomas Persson and Peter Gärdenfors
  2005  Bodily mimesis as "the missing link" in human cognitive evolution. LUCS 121. Lund: Lund University Cognitive Studies.
Making sense of embodied cognition: Simulation theories of shared neural mechanisms for sensorimotor and cognitive processes¹
Henrik Svensson, Jessica Lindblom and Tom Ziemke
Abstract
Although an increasing number of researchers are convinced that cognition is embodied, there still is relatively little agreement on what exactly that means. Notions of what it actually means for a cognizer to be embodied range from simplistic ones such as "being physical" or "interacting with an environment" to more demanding ones that consider a particular morphology or a living body prerequisites for embodied cognition. Based on experimental evidence from neuroscience, psychology and other disciplines, we argue that a key to understanding the embodiment of cognition is the "sharing" of neural mechanisms between sensorimotor processes and higher-level cognitive processes. The latter are argued to be embodied in the sense that they make use of (partial) simulations or emulations of sensorimotor processes through the re-activation of neural circuitry also active in bodily perception and action.
Keywords: action, embodied cognition, gesture, intersubjectivity, language, mirror neurons, perception, simulation theories.
1. Introduction
Although an increasing number of researchers are convinced that cognition is embodied, there still is relatively little agreement on what exactly that means. Notions of what it actually means for a cognizer to be embodied range from simplistic ones such as "being physical" or "interacting with an environment" to more demanding ones that consider a particular morphology or a living body prerequisites for embodied cognition (cf., e.g., Anderson 2003; Chrisley and Ziemke 2003; Rohrer this volume; Wilson 2002; Ziemke 2003). This lack of agreement or coherence, after two decades of research on embodied cognition, has unfortunate consequences. Firstly, critics commonly argue that the only thing that embodied cognitive theories have in common is, in fact, the rejection of traditional, computationalist and supposedly disembodied cognitive science. Secondly, there is a certain trivialization of embodiment, not least among many AI researchers who consider as embodied any physical system, or in fact any agent that interacts with some environment, such that the distinction between computationalist and embodied cognitive theories disappears since, in some sense, all systems are embodied, and thus cognitive science has always been about embodied cognition (Chrisley and Ziemke 2003; Ziemke 2004). Thirdly, there is the "misunderstanding" that perhaps embodiment is only relevant to sensorimotor processes directly involving the body in perception and action, while higher-level cognition might very well be computational in the traditional sense and only dependent on the body in that mental representations ultimately need to be grounded in sensorimotor interaction with the physical environment.

Instead, we argue that the key to understanding the embodiment of cognition, in an important, non-trivial sense, is to understand the "sharing" of neural mechanisms between sensorimotor processes and higher-level cognitive processes. Based on experimental evidence from a range of disciplines, we argue that many, if not all, higher-level cognitive processes are body-based in the sense that they make use of (partial) simulations or emulations² of sensorimotor processes through the re-activation of neural circuitry that is also active in bodily perception and action (cf. Clark and Grush 1999; Hesslow 2002; Grush 2004). As Barsalou, Solomon and Wu (2003: 45) put it, the main point is that "simulations of bodily states in modality specific brain areas may often be the extent to which embodiment is realized".

The next section describes the idea of cognition as body- and simulation-based in more detail. Section 3 presents empirical evidence that strongly suggests a role of sensorimotor simulation in mental imagery (3.1 and 3.2), agent-object interaction (3.3), social cognition (3.4 and 3.5) and language (3.6 and 3.7). The final section then briefly discusses some open questions and directions for future work.

1. This is an extended and revised version of Svensson and Ziemke (2004).
2. The terms simulation and emulation are used somewhat interchangeably in this paper, as in much of the literature, but it should be noted that they are sometimes used differently (e.g., Grush 2004).
2. Cognition as body-based simulation
The idea that even higher-level cognitive processes are in a strong sense grounded in bodily activity and experience is, of course, hardly new. It was developed already in the 1980s, most influentially by Maturana and Varela (1980, 1987; cf. Varela, Thompson and Rosch 1991) from a neurobiological perspective, and by Lakoff and Johnson (1980, 1999) from a linguistic perspective. Lakoff (1988: 121) summarized the basic idea as follows:

"Meaningful conceptual structures arise from two sources: (1) from the structured nature of bodily and social experience and (2) from our innate capacity to imaginatively project from certain well-structured aspects of bodily and interactional experience to abstract conceptual structures."
Back in the 1980s, however, relatively little was known about exactly how such an imaginative projection from bodily experience to abstract concepts might work. In recent years more detailed accounts of how the sensorimotor structures of the brain are involved in cognition have been developed in several disciplines, often taking into account data from neurophysiological and neuroimaging studies (cf. Johnson and Rohrer this volume). These accounts show that the traditional strong division between perception and action, as well as between sensorimotor and cognitive processes, needs to be revised. A particular kind of "embodiment" theory that has emerged in different contexts is the family of so-called emulation or simulation theories (e.g., Barsalou, Solomon and Wu 2003; Decety 1996; Frith and Dolan 1996; Grush 2003, 2004; Gallese 2003a; Hesslow 1994, 2002; Jeannerod 1994, 2001). The basic idea is that neural structures that are responsible for action and/or perception are also used in the performance of various cognitive tasks. As Hesslow (2002) pointed out, this idea is not entirely new; Alexander Bain, for example, suggested back in 1896 that thinking is basically a covert form of behavior that does not activate the body and thus remains invisible to external observers. Today simulation theories, based partly on data from neuroscience, can further clarify the possible role of simulation in cognition, and thus explain the embodiment of cognition in a more concrete way than before.
Hesslow (2002) proposed that cognition can to a large extent be explained by simulated chains of covert behavior. More precisely, at least human brains³, from a certain age, have the ability to reactivate previous perceptions and actions in the absence of any sensory input or overt movement. These simulations of actions and perceptions are coupled, for example through conditioning, to achieve internal simulations of organism-environment interactions (Hesslow 2002). A similar, technically more detailed account of the general idea has recently been formulated by Grush in his emulation theory of representation (Grush 2003, 2004; see also Clark and Grush 1999). Based on the control-theoretic concept of forward models (emulators), previously used to account for motor control (e.g., Wolpert and Kawato 1998), Grush developed an emulation theory for several types of cognitive processes, including perception, imagery, reasoning and language. In a nutshell, he argued that emulation circuits are able to calculate a forward mapping from control signals to the (anticipated) consequences of executing the control command. For example, in goal-directed hand movements the brain has to plan parts of the movement before it starts. To achieve a smooth and accurate movement, proprioceptive/kinesthetic (and sometimes visual) feedback is necessary, but sensory feedback per se is too slow to affect control appropriately (Desmurget and Grafton 2000). The "solution" is an emulator/forward model that can predict the sensory feedback resulting from executing a particular motor command. A further prediction is that the emulator circuits are realized by the reactivation of the same sensorimotor processes that are used in overt action and perception (e.g., Grush 2004; Hesslow 2002; Jeannerod 2001)⁴. Connecting simulated perceptions and actions by an anticipatory mechanism might also explain certain types of problem solving (Hesslow 2002). 
An example of this kind of simulation is possibly the problem solving and planning involved in the Tower-of-London problem (Shallice 1982), a task that requires subjects to manually put objects on top of each other under certain non-trivial constraints that require planning ahead. Dagher et al. (1999) found that even seemingly purely mental planning and problem solving, without physical object manipulation, activated higher motor areas (premotor cortex, prefrontal cortex) and the basal ganglia, which seemed to interact with visual and posterior parietal areas (cf. Schall et al. 2003). This gives some support to the idea that the subjects solved the problem by simulating the moving around of objects through the use of reactivated (or simulated) perceptions and actions (Hesslow 2002). In other words, the simulation account argues that cognitive processes are achieved by the reactivation of the same neural structures as used for physically sensing, moving and manipulating the environment. The following section summarizes a number of the many empirical studies that support the idea that cognition is body-based, especially as predicted by simulation theories.

3. To which degree animals are capable of so-called "mental time travel", i.e., recollection of specific past events or anticipation of the future, is still an open question. For a detailed recent discussion see Clayton, Bussey and Dickinson (2003).
4. According to Blakemore, Frith and Wolpert (1999), this is also why for most people it is not so easy to tickle themselves: the forward model produces predicted sensory feedback that "prepares" the agent.
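
The forward-model (emulator) idea described in this section can be illustrated with a deliberately simple sketch. Everything in it is an assumption made for illustration and is not taken from Grush, Hesslow or any of the cited studies: a one-dimensional "hand", a proportional controller, a forward model that happens to be a perfect copy of the body, and hypothetical values for the gain and the feedback delay. The sketch shows two things: control based on delayed sensory feedback overshoots the target, whereas control based on the predicted feedback does not; and the same loop can be run covertly, with the overt movement suppressed, which is the sense of "simulation" used in this chapter.

```python
def plant(position, command):
    """Toy 'body': the hand moves by exactly the commanded amount."""
    return position + command


def forward_model(predicted_position, command):
    """Emulator: predicts the sensory consequence of a motor command.

    Assumed to mirror the plant perfectly, which real emulators would not.
    """
    return predicted_position + command


def reach(target, steps=20, gain=0.5, delay=3,
          use_forward_model=True, overt=True):
    """Drive a 1-D hand towards `target` and return (positions, predictions).

    use_forward_model=False: the controller only sees feedback that is
    `delay` steps old. overt=False: commands are issued but the movement
    is inhibited, so only the internal prediction changes.
    """
    position = 0.0
    prediction = 0.0
    history = [position]  # hand position at the start of each step
    positions, predictions = [], []
    for _ in range(steps):
        delayed_feedback = history[max(0, len(history) - 1 - delay)]
        sensed = prediction if use_forward_model else delayed_feedback
        command = gain * (target - sensed)
        if overt:
            position = plant(position, command)
        prediction = forward_model(prediction, command)
        history.append(position)
        positions.append(position)
        predictions.append(prediction)
    return positions, predictions


with_model, _ = reach(1.0)
feedback_only, _ = reach(1.0, use_forward_model=False)
covert_positions, covert_predictions = reach(1.0, overt=False)
```

In this toy setting the covert run never moves the hand, yet its sequence of predictions is identical to the overt trajectory, which is one way of picturing the claim that simulated and executed actions draw on the same machinery.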
3. Empirical evidence
Several sources of evidence support the basic tenet of the simulation account, i.e., the idea that perceptual and motor areas of the brain can be covertly activated either separately or in sequence for use in cognitive processes. Moreover, there are a number of accounts that implicate simulation on multiple levels of cognitive complexity, from motor imagery to language. The experiments mentioned here range from mental chronometry studies to functional imaging studies of animal and human brains. The following subsections review some of the empirical evidence that suggests that sensorimotor structures of the brain are deeply involved in the generation of cognitive phenomena, such as imagery and problem solving. The starting point is the extensive similarities found between the neural structures activated during preparation (and execution) of an action and mentally simulating an action (i.e., motor imagery), as well as between visual perception and visual imagery. These similarities are so striking that some have posited that internally activated actions and perceptions are the same as overt ones, but without actual sensory input or overt movement (e.g., Hesslow 2002; Jeannerod 2001).
3.1. Motor imagery
There has been extensive research in the last couple of decades into the relation between motor imagery and the preparation and execution of actions (Jeannerod 2001; Johnson 2000). A large number of behavioral and neurophysiological experiments have shown that motor imagery and the mechanisms involved in the planning and production of overt actions share a large number of properties (e.g., Decety, Jeannerod, and Prablanc 1989; Jeannerod and Decety 1995; Jeannerod and Frak 1999; for reviews see Decety 1996, 2002; Jeannerod 1994, 2001). Motor imagery is usually defined as the recreation of an experience of actually performing an action, e.g., the person should feel as if he or she was actually walking (Decety 1996; Jeannerod 1994). It is difficult to entirely distinguish or separate motor imagery from visual imagery, since actions also involve visual consequences (e.g., Jeannerod 1994; see also Dechent, Merboldt and Frahm 2004). However, a motor image differs from a visual image in that it is based mainly on kinesthetic/proprioceptive information about the action, i.e., the subject feels as if performing the action, not necessarily involving a visual iconic representation of the action or the external visual surroundings. Some examples involving a motor image would be imitating somebody's movements, anticipating the effects of an action and having kinesthetic or bodily sensations (e.g., muscle contractions, heart beats) (Jeannerod 1994).

In fact, the term motor imagery is sometimes used in a wider sense to mean an unconscious form of motor imagery, such as when subjects are tested on tasks that, for example, require making judgments about actions, but do not necessarily evoke a feeling of performing an action (cf. Jeannerod and Frak 1999). For instance, Frak, Paulignan and Jeannerod (2001) manipulated the positions at which fingers were to be placed on a cup of water and asked subjects whether it would be "easy", "difficult", or "impossible" to grasp the cup and pour the contents into another container.⁵ They found that subjects rated grasp executions near the limits of what is a physically possible grasping action as difficult, whereas positions leading to grasp executions that are preferred when subjects actually perform the action were rated as easy. Furthermore, the response times increased with the estimated difficulty of the task. The interpretation was that subjects were mentally simulating performing the action in order to determine its feasibility. Thus, subjects seemed to rely on an unconscious form of motor imagery. However, to avoid confusion this is better described in terms of the subjects relying on simulations of actions, i.e., reactivations of motor areas responsible for performing an action but without any overt movement. Motor imagery should be used to denote the at least partly conscious act of feeling as if one were performing an action, which in turn might be based on the more general (neural) mechanism of simulation of action. In fact, there is considerable evidence that motor structures are reactivated in several phenomena closely related to motor imagery, such as intentions to act, judging the feasibility of an action, determining by observation whether an object is graspable, and actions in dreams (Gallese 2003a; Jeannerod 2001). Since imagined actions or motor imagery have been shown to share many properties with actual actions, they will be the focus of our brief review of the current evidence for similarities between overt actions and mental actions.⁶

The empirical evidence cited in support of the (partial) equivalence of overt actions and motor imagery comes mainly from mental chronometry, physiological responses, measurements of brain activity and lesion studies. Mental chronometry experiments, which measure the duration of behavioral and mental responses, have found that the time needed to mentally execute actions in several conditions closely corresponds to the time it takes to actually perform them (Jeannerod and Frak 1999; Papaxanthis, Pozzo et al. 2002; Papaxanthis, Schieppati et al. 2002; for a review see Guillot and Collet 2005). For example, Decety and Jeannerod (1996) found that Fitts's law (i.e., the finding that execution times increase with task difficulty) also holds for motor imagery. Decety et al. (1989) compared the durations of walking towards targets (with blindfolds) placed at different distances and mental simulation of walking to the same targets. In both conditions times were found to increase with the distance covered.

Besides producing similar reaction times, motor imagery has been shown to produce physiological effects similar to those of overt actions, in the form of muscle strength and autonomic responses. Sport psychological experiments have revealed that mentally practicing a specific action can enhance performance when the previously imagined action is subsequently actually performed (Jeannerod 1994). For example, Yue and Cole (1992; cf. Ranganathan et al. 2004) showed that mentally simulating oneself contracting a muscle could significantly (by 22%) increase muscle strength⁷ (actual training produced an increase of 30%). Jeannerod (1994) pointed out that there have been several interpretations of these results, for example, that motivational factors increase physiological arousal and that performance is increased as a result. Another explanation is that if motor imagery, at least in part, involves the same neural structures responsible for overt action, it is probable that motor imagery could "train" the neural structures used in subsequent execution of the previously imagined action (Jeannerod 2001; Ranganathan et al. 2004; Yue and Cole 1992). Besides strength increases, autonomic responses, such as the adaptation of heart and respiratory rates, which are beyond voluntary control, have been shown to be activated by motor imagery to an extent proportional to that of actually performing the action, and as a function of mental and actual physical effort (Decety 1996; Jeannerod 1994; Jeannerod and Decety 1995). For example, Decety et al. (1991) showed that when subjects imagined performing a leg exercise their heart and respiratory rates increased.

Since the first study that investigated motor imagery using regional cerebral blood flow (rCBF) to indicate active brain areas (Ingvar and Philipson 1977), there have been many neuroimaging experiments that confirm the first study's indication that similar brain areas are activated in overt actions and motor imagery. Together with results from studies on neurological disorders (e.g., Jeannerod and Decety 1995), these experiments have, with some discrepancies, found motor imagery to involve structures primarily associated with the execution of actions, such as primary motor cortex, premotor cortex, supplementary motor area, lateral cerebellum and the basal ganglia, as well as those primarily associated with action planning, such as the dorso-lateral prefrontal cortex, inferior frontal cortex and posterior parietal cortex (Grèzes and Decety 2001; Jeannerod 2001; Jeannerod and Frak 1999; Schwoebel, Boronat and Coslett 2002). Even though there is a strong similarity between overt action and simulation of action, the actual degree of overlap is not yet fully understood. For example, subjects in mental chronometry experiments tend to over- or underestimate durations of actions in some conditions (Guillot and Collet 2005) and activation of the primary motor cortex is not found in every neuroimaging study on motor imagery (Dechent, Merboldt and Frahm 2004; Grèzes and Decety 2001).

To summarize, a substantial amount of research suggests that there are strong psychological and neurophysiological resemblances between overt actions and motor imagery. Thus, motor imagery and associated phenomena might best be explained in terms of simulations of actions, i.e., neural processes normally used to produce overt actions are reactivated by motor imagery, with the overt movement inhibited. Although the focus so far has been on actions, it should not be forgotten that the motor system also integrates sensory information when planning and executing an action (e.g., Grush 2004; cf. Desmurget and Grafton 2000; Jeannerod 1997). Thus, simulating an action might also involve an emulator mechanism (forward model) that predicts the proprioceptive feedback and sometimes visual feedback that would have resulted from the executed action to produce the (conscious) feeling of mentally imagining performing an action (Decety 2002; Grush 2004; Jeannerod 1997, 2001). Also, there are a number of unanswered questions concerning motor imagery worth mentioning here (which, however, do not affect the chapter's main point concerning the embodiment of cognition). For example, to what degree do actions and mental simulations of actions engage executive motor structures (such as the primary motor cortex) (cf., e.g., Decety 2002; Jeannerod and Frak 1999), and how is the overt movement "hindered" (Jeannerod 2001; Hesslow 2002)? Although there may not be a complete overlap between the neural structures involved in real and mentally simulated action, we believe that the evidence suggests that they are not different in nature, but only in degree.

5. For a closely related study see also Johnson (2000).
6. Johnson (2000) has pointed out that it is important in experiments on motor imagery not to ask subjects explicitly to use motor imagery in order to minimize experimenter effects. However, the current issue does not concern if and in which situations people rely on motor imagery, where Johnson's concern is correct; rather, it is whether motor imagery and other related (cognitive) phenomena rely on simulations of actions as defined above. Consequently, some studies reported here might be better characterized as employing a covert form of motor imagery, i.e., simulation of action rather than actual conscious awareness of imagining an action. Deiber et al. (1998) pointed out a related issue concerning the lack of explicit control of whether visual imagery was involved in the studies. A hypothesis is that if simulations of actions are coupled with anticipatory mechanisms, simulation would be a more pervasive property of cognition, not only related to imagery, as will be seen throughout this chapter.
7. More precisely, the voluntary force production of the fifth digit's metacarpophalangeal joint (Yue and Cole 1992).
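
The chapter states Fitts's law only qualitatively. As a hedged illustration, the sketch below uses the classic formulation MT = a + b log2(2D/W), where D is the distance to the target and W its width; the coefficients a and b are hypothetical placeholders, not values reported in any of the studies cited here. On the simulation account, the same monotonic relation would be expected for imagined as well as for executed movements.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts's index of difficulty in bits, using the log2(2D/W) form."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted duration in seconds; a and b are made-up coefficients."""
    return a + b * index_of_difficulty(distance, width)


# Same reaching distance, progressively narrower targets (arbitrary units).
widths = [8.0, 4.0, 2.0, 1.0]
times = [movement_time(8.0, w) for w in widths]
```

Each halving of the target width adds one bit of difficulty and therefore the same increment b to the predicted duration.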
3.2. Visual imagery
The discussion of motor imagery can be extended to the visual modality in that similar types of studies report that perceptual structures can be and are internally reactivated when, e.g., visually recreating a previous perception. Many studies in cognitive psychology have found such similarities between visual perception and visual imagery (Farah 1988; Finke 1989). For example, in a seminal study by Shepard and Metzler (1971), subjects had to determine whether two three-dimensional forms had the same shape or not. Besides the introspective reports of the subjects that they had mentally rotated three-dimensional forms to see if they were the same, the results showed that reaction times increased linearly with the angular difference, which meant that the imagined rotations were performed at a constant rate (cf. Finke 1989). Furthermore, they found reaction times not to be longer for depth rotations than for rotations in the picture plane. These two findings suggest that imagined rotations in some aspects correspond to actual physical rotations of objects (Finke 1989). Although alternative explanations are difficult to rule out, such as experimenter effects or tacit knowledge (cf. Finke 1989), neuropsychological and neuroimaging studies offer more conclusive evidence of the involvement of similar areas of the brain in mental visual imagery and visual perception (Farah 1988, 2000; Hesslow 2002; Kosslyn and Thompson 2000). However, as in the case of motor imagery, the overlap is neither complete nor uniform across experiments, which to some extent might be explained by differences between types of imagery and in the resolution of mental images (Ganis, Thompson and Kosslyn 2004; Kosslyn and Thompson 2000; Trojano et al. 2004; see also Mazard et al. 2004). In line with simulation theories, Wexler, Kosslyn and Berthoz (1998: 92) suggested that at least visual mental imagery in the form of mental rotation is achieved by a "prediction of an about-to-be-executed motor action". Their hypothesis has also been supported by functional brain imaging studies (Vingerhoets et al. 2002). 
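
The inference from linearly increasing reaction times to a constant rate of imagined rotation can be made explicit with a small sketch. It assumes the simple model RT = c + θ/ω, where θ is the angular difference, ω the rotation rate and c the time needed for everything other than rotating. The numbers are synthetic placeholders generated from that model for illustration only; they are not Shepard and Metzler's data.

```python
def predicted_rt(angle_deg, intercept_s=1.0, rate_deg_per_s=60.0):
    """Reaction time in seconds under constant-rate mental rotation.

    Both default parameter values are hypothetical.
    """
    return intercept_s + angle_deg / rate_deg_per_s


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, noise-free reaction times for angular differences of 0 to 180 degrees.
angles = list(range(0, 181, 20))
reaction_times = [predicted_rt(angle) for angle in angles]

slope, intercept = fit_line(angles, reaction_times)
estimated_rate = 1.0 / slope  # degrees of imagined rotation per second
```

The slope of the fitted line is the time per degree, so its reciprocal recovers the assumed rotation rate; a non-linear relation between angle and reaction time would not permit this reading.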
The so-called mental imagery debate in cognitive science between those arguing for a picture theory (e.g., Kosslyn 1994) and those arguing for a description theory (e.g., Pylyshyn 1981), which has recently reemerged (Kosslyn, Ganis and Thompson 2003; Pylyshyn 2003; see also Thomas 1999), is not of central importance for the discussion in this chapter since both perception and imagery may be said to use the same format. That is, even though there are strong similarities between properties of imagery and of perception, both may be explained using either theory (Block 1983; see also Pylyshyn 2003). On the other hand, explaining cognition as reactivation of sensorimotor structures does not (at least in some cases) rely on the computer metaphor of symbol manipulation (cf. Lindblom and Ziemke this volume), and thus may offer a novel view that does not see the vehicle and the content of representations as separate entities but as constitutive of each other (Gallese 2003b; cf. Dreyfus 2002; Thomas 1999). The representations of embodied cognitive theories are not the same as the amodal symbol systems proposed by classical theories, but are grounded in bodily interaction with an environment. According to Gallese (2003b), canonical neurons in the monkey brain provide an example of such representations, grounded in the interaction between an agent and its environment.
3.3. Canonical neurons
The discovery of so-called mirror neurons and canonical neurons in the macaque monkey⁸ (di Pellegrino et al. 1992; Murata et al. 1997; Rizzolatti et al. 1996) has resulted in a number of different theories about their role in primate and human cognition (cf. Johnson and Rohrer this volume; Gallagher this volume; Lindblom and Ziemke this volume). Canonical neurons (and mirror neurons, see Section 3.5 below) have been found in the rostral region of the inferior premotor cortex (area F5) of the monkey brain, which contains neurons that are known to discharge during goal-directed hand movements, such as grasping, holding, tearing, or manipulating. However, they respond not to movements that are merely similar, but only to actions that have the same "meaning" (di Pellegrino et al. 1992; Rizzolatti et al. 1996; Rizzolatti et al. 2002), which is why they are often interpreted as internal representations of actions, rather than motor or movement commands (Jeannerod 1994; Rizzolatti et al. 1996; Rizzolatti et al. 2002). Gallese (2003b) emphasized seeing them as coding not physical parameters of movement, but a relationship between agent and object. The so-called canonical neurons of area F5 have both motor properties and sensory properties, and they discharge both during the action they code and when an object that affords that action in the Gibsonian sense (cf. Costall this volume; Sonesson this volume) is perceived. Canonical neurons show a strict congruence between the type of grasping action and the size or shape of the object they respond to (Gallese 2003b). This implies that they implement affordances, e.g., code objects that are graspable-in-a-certain-way, specifying not only perceptual and action aspects but a particular relationship between agent and environment (cf. Gallese 2003b, see also Dreyfus 2002). This kind of function has also been suggested as part of human action and motor imagery (Jeannerod 1994). Jeannerod (1994: 233) argued that a motor representation not only contains the "plan" that generates the actual kinesthetic movements but also a pragmatic representation "in which the visual attributes of objects are thought to be processed as affordances, that is, on the basis of the extent to which they are related to a given action directed at these objects". The more general implication is that the common division between outer and inner, or mind and body/environment, stemming from the debates between rationalism and empiricism, is not plausible (cf. Lindblom and Ziemke this volume). The generation of behavior and of cognition always takes place in the interaction with an environment and, if anything, it is this interaction that gets represented by the subject through dynamic brain processes. Simulation mechanisms similar to those thought to be the basic machinery behind mental imagery have also been suggested to be an essential part of social cognition, where mirror neurons might be a key example, as discussed in more detail in the following subsections.

8. For practical and ethical reasons it is so far not possible to investigate the existence of mirror neurons (and canonical neurons) at the single neuron level in humans. However, many researchers have presented strong arguments for the existence of a similar system in humans (e.g., Arbib in press; Fadiga et al. 1995; Grafton et al. 1996; Grèzes et al. 2003; Rizzolatti and Arbib 1998; Rizzolatti et al. 1996).
3.4. The body as an intersubjective resonance mechanism
The research mentioned so far reinforces yet another dichotomy of cognitive science, viz., that of individual versus social, by focusing on individual cognitive processes. However, even though it may be possible to separate the two for explanatory purposes, in nature humans and many other animals are essentially social beings. Blakemore, Winston and Frith (2004), for example, argued that humans are highly social beings such that much of the brain must have evolved to handle social communication and interaction. The interest in social cognitive neuroscience, the empirical study of the neural mechanisms underlying social cognitive processes, has increased rapidly in recent years. Recent work addressing social aspects of embodiment implies that the body has several important roles in social interactions (cf. Lindblom and Ziemke this volume). Dautenhahn (1997), for instance, hypothesized that a phenomenological dimension of social understanding might be founded in embodied mechanisms that allow biological agents (in particular humans) to read "social signs" and other agents' minds by simulating the other agent's emotional stance, and she suggested that the agent's own body can be used as the point of reference. In line with this remark, simulation mechanisms have also been suggested to play a vital role for action recognition (e.g., Gallese 2003a), empathy (e.g., Decety and Chaminade 2003), social cognition (e.g., Barsalou et al. 2003; Nielsen 2002; cf. Lindblom and Ziemke this volume) and even language understanding (e.g., Glenberg and Kaschak 2002). Yet, social cognition and its connection to bodily-based simulation processes are not well understood, and they have not received much attention in embodied/situated cognitive theories. The evidence reported in the following subsections emphasizes that perceptual and motor processes are not different in nature at the neural and behavioral level, but seem to be intimately linked in social cognition, possibly through simulation mechanisms.

Barsalou et al. (2003) noted that there are at least four types of well-known phenomena in social psychology experiments that can be explained as simulations of bodily states (cf. Lindblom and Ziemke this volume; Nielsen 2002). Firstly, perceived social stimuli can produce bodily states (e.g., a more slumped posture in response to negative feedback). Secondly, social stimuli can induce bodily mimicry (e.g., a smile in response to a smile). Thirdly, bodily states can produce and affect emotional states (e.g., an upright posture tends to have a positive effect). Finally, compatibility between bodily states and emotional states leads to increased cognitive performance (e.g., it is easier to pull a lever towards you in response to "positive" stimuli than in response to "negative" ones; cf. also Section 3.7). These phenomena have in common that states of the body, such as postures, arm movements and facial expressions, change automatically without any conscious mediating knowledge structures in specific instances of social interaction (cf. Nielsen 2002). Roughly speaking, these phenomena imply that bodily states are involved in social cognition and that they might constitute the very foundations of the particular social cognitive phenomena in question. 
An example of how perception, action and social cognition come together at the level of single neurons is provided by the so-called mirror neurons in macaque monkeys (Decety and Sommerville 2003).
3.5. Mirror neurons
Besides canonical neurons, area F5 of the monkey brain contains so-called mirror neurons, which have sensory properties and become activated both when the monkey performs a specific action and when it observes the same goal-directed hand (and mouth) movements of an experimenter (di Pellegrino et
al. 1992; Rizzolatti et al. 1996; Rizzolatti et al. 2002). Mirror neurons provide a key example of sensorimotor brain structures also involved in (social) cognitive processes. Although different hypotheses exist, many of the theories of the function of mirror neurons emphasize their role in social cognition (e.g., Gallese and Goldman 1998; Gallese, Keysers and Rizzolatti 2004; Rizzolatti and Arbib 1998; Rizzolatti et al. 2002). These researchers acknowledge that area F5 and mirror neurons can be interpreted as a kind of observation-execution mechanism or resonance mechanism, which links the observed actions to actual actions of the subject's own behavioral repertoire. That is, it enables the agent to understand the meaning of the observed action by simulating the observed action through its own sensorimotor processes. Thus, mirror neurons can be interpreted as representations of actions, used both for performing and understanding actions (e.g., Rizzolatti et al. 1996; Rizzolatti and Arbib 1998). Gallese and Goldman (1998) hypothesized that mirror neurons might be a basic mechanism necessary for "mind-reading", i.e., attributing mental states to others. In particular, they argued that such mechanisms can explain how an agent determines what mental states of another agent have already occurred. When mirror neurons are externally activated by observing a target agent executing an action (allowing the subject to evaluate the meaning of the other's action), the subject knows (visually) that the observed target is currently performing this very action and thereby "tags" the "experienced" action as belonging to the target. Brain imaging experiments with human subjects sitting still and observing others moving have indicated that the mirror system seems to distinguish between biological and non-biological actions (Blakemore, Winston and Frith 2004). It is commonly argued that another person's action can influence one's own actions, and Sebanz et al.
(2003) showed that when a subject carried out a spatial compatibility task, the presence of another person altered the timing of the response. Moreover, observation of another person's actions has an impact on one's own actions, and interference effects occur when there is a mismatch between one's own actions and the observed ones (Blakemore, Winston and Frith 2004). However, these interference effects seem to occur only for observed human actions and not while observing a robot making interfering actions (Kilner, Paulignan and Blakemore 2003). Blakemore, Winston and Frith (2004) asked what is special about human biological actions and why mirror systems require biological action to be activated. Furthermore, little is known about how the
subject can distinguish its own actions from those performed by others, given that to a large extent the same neural mechanisms underlie both action observation and one's own action (cf. Blakemore, Wolpert and Frith 2002). However, it might be possible to resolve this in the future using techniques for dual scanning of two brains, which would facilitate the recording of simultaneous responses of two interacting humans (Blakemore, Winston and Frith 2004).
3.6. Gesture and language
In addition to action-recognition, mirror neurons are also considered to be involved in more complex actions, such as gestures. Rizzolatti and Arbib (1998) suggested that the human mimetic capacity (cf. Donald 1991) is a natural extension of action-recognition based on mirror neuron mechanisms, allowing human ancestors to communicate to a higher degree than other primates. For instance, they pointed out that empirical studies suggest that a mirror system for gesture recognition also exists in humans and is situated in Broca's area (a homolog to area F5 in the monkey). Premotor areas are activated both when performing an action and when observing another person performing an action. According to Rizzolatti and Arbib, a series of mechanisms is usually activated in order to inhibit the actual (re)production of the observed action. Occasionally, however, the premotor system will allow a tiny aspect of the simulated movement to be executed, and this short glimpse is recognized by the other person, affecting both the actor and the observer. That is, the actor recognizes it as an intention in the observer, and the observer notices that her (involuntary) response, in turn, affects the behavior of the actor (Rizzolatti and Arbib 1998). Thus, the mirror system provides the causal mechanisms for basic intentional interaction and might constitute the foundation for human language (cf. Arbib 2005). Iverson and Thelen (1999) examined the role of embodiment in language, noting that gesture is a pan-human ability in communication and that gestures are tightly connected and synchronized with speech. Furthermore, they pointed out that gestures provide important communicative information to the listener and that even blind people gesture while talking to others, even when talking to blind listeners (cf. Goldin-Meadow 2003; Iverson and Goldin-Meadow 1998). Iverson and Thelen (1999) linked gestures and language together from the perspective of embodiment, and
presented three types of empirical evidence (see also Section 3.7). Firstly, some language and motor functions share the same underlying brain mechanisms. Ojemann (1984), for example, demonstrated that there seems to be a common brain mechanism for sequential movement and speech production located in the same area in the brain. Moreover, Fried et al. (1991) pointed out that there are indications that the vocal tract and the hands and arms are represented in closely related sites in certain brain areas. Secondly, some of the brain regions normally associated with motor functions have been shown to be involved in language tasks (cf. Pulvermüller et al. 1996). Furthermore, classical "language areas" become activated during motor tasks (e.g., Bonda et al. 1994; Krams et al. 1998). Finally, there seems to be a close link between patterns of collapse and recovery in certain motor and language functions in some types of patients. For instance, language breakdowns in patients suffering from aphasia show a parallel dysfunction in gesturing (cf., e.g., Hill 1998). Moreover, there are close connections between the oral and manual systems in the infant at birth, e.g., the Babkin reflex, which makes newborn babies open their mouth if pressure is applied to their palm. Furthermore, gesturing has been shown to have positive effects on language development in infants (cf. Goodwyn and Acredolo 1998). Taken together, there is converging empirical evidence that the systems of hand and mouth movements are not separate systems; rather, they should be viewed as intimately linked in language production. Thus, from the perspective of embodied cognition, the simulation mechanisms grounded in the mirror system might function as the glue that binds hand, mouth and language together.
3.7. Language as embodied simulation
Some researchers have argued that conceptualization and language understanding cannot be achieved through the manipulation of amodal, arbitrary symbols alone but have to be grounded in bodily interaction with an environment. In particular, Glenberg and Kaschak (2003) have outlined an explanation of language in line with the ideas of cognition as body-based simulation as expressed in this paper, suggesting that language is partly achieved through the same neural structures as used to plan and guide action. Under the heading of the indexical hypothesis they developed an account of language comprehension partly based on simulation of action.
They argued that the meaning of a sentence is achieved by a process that indexes words to perceptual symbols, i.e., modal symbols based on records of the neural states that underlie perception (Barsalou 1999), which in turn retrieves the available affordances in the situation and determines their relevance through the particular sentence construction. Thus, the understanding of a sentence is essentially achieved through a simulation of action using the same neural systems active in overt behavior. An empirical result that supports the close coupling between language and action is the "action-sentence compatibility effect" (Glenberg and Kaschak 2002). It was found that judgments of the sensibility of a sentence are affected by the physical action used to respond. Reaction times increased when subjects read "toward sentences" that implied action toward the reader, such as "Open the drawer", and had to give the answer through an incongruent action, i.e., moving the hand away from the body. Conversely, when subjects answered through an action congruent with the sentence, reaction times decreased. It might be worth noting that Glenberg and Kaschak included not only sentences describing concrete, physical transfers, but also sentences describing cases of abstract transfer, such as "Liz told you the story" (2002: 560). The action-sentence compatibility effect was also present when reading these more abstract sentences. Further support comes from experiments on language comprehension and construction that are only explainable by implicating perception and action systems as predicted by the indexical hypothesis (Glenberg and Kaschak 2003). To give but one example, Barsalou, Solomon and Wu (1999) described an experiment which showed that presenting a modifier that potentially reveals internal features has an effect on feature listing not predicted by standard amodal approaches.
The standard models predict that listing the features of half a watermelon as opposed to a whole watermelon would only differ with regard to amount, i.e., a half watermelon is smaller than a whole watermelon. The experiment showed, however, that subjects listed more internal features such as seeds, which can be explained if the concepts are based on perceptual symbols. Readers interested in more comprehensive reviews, also including neurophysiological evidence, of the coupling between language and action/perception are referred to Glenberg and Kaschak (2003) or Zwaan (2004) (see also Johnson and Rohrer this volume; Rohrer this volume; Zlatev this volume).
4. Discussion
This paper has presented an emerging framework of simulation theories, based on terminology and ideas from control theory as well as data from psychological, neurophysiological and brain imaging studies, that explains higher-level cognitive processes as, at least partly, based on reactivations of sensorimotor brain structures. By reactivating mechanisms used in bodily perception and action together with a predictive mechanism, a flexible inner world emerges that can be used for many different higher-level cognitive tasks (cf. Grush 2004; Hesslow 2002). Crucial to the embodiment of cognition, according to this account, is not so much the physical nature of a cognizer's body or its interaction with the environment as such, but the relation between sensorimotor and higher-level cognitive processes, more specifically, the way that the latter are fundamentally based on and rooted in the former at the level of the neural mechanisms underlying both of them. According to Rizzolatti and Arbib (1998), during the course of evolution, the capacity to voluntarily control one's own mirror system to emit signals, instead of the mere automatic leaking of parts of the mirrored actions (cf. Section 3.6), was essential for the emergence of a (basic) dialogue between two individuals, which forms the core of language. Rizzolatti and Arbib further speculated that this new capacity of the mirror system was initially based on oro-facial movements, given that all primates mainly communicate through oro-facial movements. Later on, manual gestures were added as a way of complementing the oro-facial ones, since gestures increased the sender's expressive power. The combination of oro-facial movements and gestures, according to Rizzolatti and Arbib, strongly implies the importance of controlled vocalization as an extension of oro-facial movements and gestures.
The evolutionary pressure for more complex sound emission, together with the anatomical possibilities, resulted in a move of intentional interaction from its oro-facial and gestural origins to sound emission (cf. Corballis 1999). This might provide a tentative explanation of why and how the human Broca's area emerged from area F5, its homolog in the monkey. However, much further work is needed in order to clarify the relation between simulation mechanisms, gestures and the emergence of language. Although corroborating evidence comes from several disciplines and different experimental paradigms, the simulation account is not yet a well-established or coherent theory of cognition in general, and there are many
questions still to be answered. For example, in current accounts it is unclear exactly what constitutes the difference between an executed, overt action and a simulated/imagined, covert one. Can this be accounted for in terms of simulation theories or are other, presumably higher-level, mechanisms required after all to selectively trigger one or the other? A closely related question is exactly what it is that simulation accounts are accounts of. Do they constitute alternative theories of representation (only), as Grush (2004) seems to argue, or are they intended as more encompassing theories of cognition and representation, as Hesslow (2002) would probably argue? There seem to be good arguments for both positions. Empirical results, such as Glenberg and Kaschak's (2002) finding of an action-sentence compatibility effect even for abstract expressions (cf. Section 3.7) that cannot directly be explained in terms of perception and action, seem to indicate, as does much work in cognitive linguistics (cf. Johnson and Rohrer this volume; Rohrer this volume), that much, if not all, abstract thought and language is (metaphorically) grounded in embodied simulations. On the other hand, Markman and Brendl (2005), for example, showed that the type of embodiment effects presented by Barsalou et al. (2003) are not always tied to the subject's body, but sometimes the actions and corresponding effects are performed in relation to a non-physical instantiation of the self (i.e., moved away from the subject's physical body). In such cases the mere simulation of actions, according to Markman and Brendl, is not sufficient for explaining the phenomena, since actions are usually tied to the subject's body and egocentric perspective. Perhaps the most crucial issue, at least partly also underlying the above questions, is the problem of the right level of granularity or abstraction (cf. Meltzoff and Prinz 2002; Ziemke, Jirenhed and Hesslow 2005).
That is, at what level of abstraction does the simulation occur? In the case of imagery it seems that the simulation occurs at a low level, including very many of the aspects of actually perceiving or acting, as indicated by neuroimaging studies and the fact that motor imagery also has physiological effects (e.g., Jeannerod 2001). In problem solving, on the other hand, more abstract aspects of actions may be employed, as indicated by the finding that Tower-of-London problem solving activity seems to activate only higher motor centers, such as prefrontal and premotor cortex (Dagher et al. 1999; cf. Section 2). However, this is a speculative interpretation of the neuroimaging results. The problem of granularity has also been suggested to be of importance in robotic models, which so far have largely been limited to the lowest level (cf. Stening, Jacobsson and Ziemke 2005; Ziemke, Jirenhed
and Hesslow 2005), and in studies of language understanding. Glenberg and Kaschak (2003), for example, pointed out that it is unlikely that in comprehending a sentence all aspects of the situation need to be simulated. But exactly what the crucial aspects are is still unclear. Hence, in conclusion one might say that, although much work remains to be done, simulation theories have come a long way in challenging traditional theories of cognition and representation, and there is an impressive wealth of corroborating empirical evidence from different disciplines. In particular, simulation accounts clarify what it might mean for cognitive processes to be embodied in a strong sense, and thus provide an alternative to more conservative theories that try to integrate physical embodiment as a mere constraint into the traditional functionalist framework.
References
Anderson, Michael L. 2003 Embodied cognition: A field guide. Artificial Intelligence 149: 91-130.
Arbib, Michael A. 2005 From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics. Behavioral and Brain Sciences 28: 105-167.
Barsalou, Lawrence W. 1999 Perceptual symbol systems. Behavioral and Brain Sciences 22: 577-660.
Barsalou, Lawrence W., Paula M. Niedenthal, Aron K. Barbey and Jennifer A. Ruppert 2003 Social embodiment. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 43: 43-92. San Diego, CA: Academic Press.
Barsalou, Lawrence W., Karen Olseth Solomon and Ling-Ling Wu 1999 Perceptual simulation in conceptual tasks. In: Masako K. Hiraga, Christopher Sinha and Sherman Wilcox (eds.), Cultural, Typological and Psychological Perspectives in Cognitive Linguistics, 209-228. Amsterdam: John Benjamins.
Blakemore, Sarah-Jayne, Chris D. Frith and Daniel M. Wolpert 1999 Spatio-temporal prediction modulates the perception of self-produced stimuli. Journal of Cognitive Neuroscience 11: 551-559.
Blakemore, Sarah-Jayne, Joel Winston and Uta Frith 2004 Social cognitive neuroscience: where are we heading? Trends in Cognitive Sciences 8: 216-222.
Blakemore, Sarah-Jayne, Daniel M. Wolpert and Chris D. Frith 2002 Abnormalities in the awareness of action. Trends in Cognitive Sciences 6: 237-242.
Block, Ned 1983 Mental pictures and cognitive science. Philosophical Review 93: 499-542.
Bonda, Eva, Michael Petrides, Stephen Frey and Alan C. Evans 1994 Frontal cortex involvement in organized sequences of hand movements: evidence from positron emission tomography studies. Society for Neurosciences Abstract 20: 353.
Costall, Alan this vol. Bringing the body back to life: James Gibson's ecology of embodied agency.
Chrisley, Ron and Tom Ziemke 2003 Embodiment. In: Encyclopedia of Cognitive Science, 1102-1108. London: Macmillan.
Clark, Andy and Rick Grush 1999 Towards a cognitive robotics. Adaptive Behavior 7 (1): 5-16.
Clayton, Nicola S., Timothy J. Bussey and Anthony Dickinson 2003 Can animals recall the past and plan for the future? Nature Reviews Neuroscience 4: 685-691.
Corballis, Michael C. 1999 The gestural origin of language. American Scientist 87 (2): 138.
Dagher, Alain, Adrian M. Owen, Henning Boecker and David J. Brooks 1999 Mapping the network for planning. Brain 122: 1973-1987.
Dautenhahn, Kerstin 1997 I could be you: The phenomenological dimension of social understanding. Cybernetics and Systems 25 (8): 417-453.
Decety, Jean 1996 Do imagined and executed actions share the same neural substrate? Cognitive Brain Research 3: 87-93.
Decety, Jean 2002 Is there such a thing as functional equivalence between imagined, observed and executed action. In: Andrew Meltzoff and Wolfgang Prinz (eds.), The Imitative Mind: Development, Evolution and Brain Bases, 291-310. Cambridge, MA: Cambridge University Press.
Decety, Jean and Thierry Chaminade 2003 Neural correlates of feeling sympathy. Neuropsychologia 41: 127-138.
Decety, Jean and Marc Jeannerod 1996 Mentally simulated movements in virtual reality. Behavioural Brain Research 72: 127-134.
Decety, Jean, Marc Jeannerod and Claude Prablanc 1989 The timing of mentally represented actions. Behavioural Brain Research 34: 35-42.
Decety, Jean, Marc Jeannerod, M. Germain and J. Pastene 1991 Vegetative response during imagined movement is proportional to imagined effort. Behavioural Brain Research 42: 1-5.
Decety, Jean and Jessica A. Sommerville 2003 Shared representations between self and other. Trends in Cognitive Sciences 7: 527-533.
Dechent, Peter, Klaus-Dietmar Merboldt and Jens Frahm 2004 Is the human primary motor cortex involved in motor imagery? Cognitive Brain Research 19: 138-144.
Deiber, Marie-Pierre, Vincente Ibanez, Manabu Honda, Norihiro Sadato, Ramesh Raman and Mark Hallet 1998 Cerebral processes related to visuomotor imagery and generation of simple finger movements studied with positron emission tomography. NeuroImage 7: 73-85.
Desmurget, Michel and Scott Grafton 2000 Forward modeling allows feedback control for fast reaching movements. Trends in Cognitive Sciences 4: 423-431.
di Pellegrino, Giuseppe, Luciano Fadiga, Leonardo Fogassi, Vittorio Gallese and Giacomo Rizzolatti 1992 Understanding motor events. Experimental Brain Research 91: 176-180.
Donald, Merlin 1991 Origins of the Modern Mind. Cambridge, MA: Harvard University Press.
Dreyfus, Hubert L. 2002 Intelligence without representation. Phenomenology and the Cognitive Sciences 1: 367-383.
Fadiga, Luciano, Leonardo Fogassi, Giovanni Pavesi and Giacomo Rizzolatti 1995 Motor facilitation during action observation. Journal of Neurophysiology 73: 2608-2611.
Farah, Martha J. 1988 Is visual imagery really visual? Psychological Review 95 (3): 307-317.
Farah, Martha J. 2000 The neural bases of mental imagery. In: Michael S. Gazzaniga (ed.), The New Cognitive Neurosciences, 965-974. Cambridge, MA: MIT Press.
Finke, Ronald A. 1989 Principles of Mental Imagery. Cambridge, MA: MIT Press.
Frak, Victor, Yves Paulignan and Marc Jeannerod 2001 Orientation of the opposition axis in mentally simulated grasping. Experimental Brain Research 136: 120-127.
Fried, Itzhak, Arieh Katz, Gregory McCarthy, Kimberly J. Sass, Peter Williamson, Susan S. Spencer and Dennis D. Spencer 1991 Functional organization of human supplementary motor cortex studied by electrical stimulation. Journal of Neuroscience 11: 3656-3666.
Frith, Chris and Ray Dolan 1996 The role of the prefrontal cortex in higher cognitive functions. Cognitive Brain Research 5: 175-181.
Gallagher, Shaun this vol. Phenomenological and experimental contributions to understanding embodied experience.
Gallese, Vittorio 2003a The manifold nature of interpersonal relations: the quest for a common mechanism. Philosophical Transactions of the Royal Society of London, B 358: 517-528.
Gallese, Vittorio 2003b A neuroscientific grasp of concepts. Philosophical Transactions of the Royal Society of London, B 358: 1231-1240.
Gallese, Vittorio and Alvin Goldman 1998 Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences 2: 493-501.
Gallese, Vittorio, Christian Keysers and Giacomo Rizzolatti 2004 A unifying view of the basis of social cognition. Trends in Cognitive Sciences 8: 396-403.
Ganis, Giorgio, William L. Thompson and Stephen M. Kosslyn 2004 Brain areas underlying visual mental imagery and visual perception. Cognitive Brain Research 20: 226-241.
Glenberg, Arthur M. and Michael P. Kaschak 2002 Grounding language in action. Psychonomic Bulletin and Review 9: 558-565.
Glenberg, Arthur M. and Michael P. Kaschak 2003 The body's contribution to language. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 43: 93-126. San Diego, CA: Academic Press.
Goldin-Meadow, Susan 2003 Hearing Gesture - How Our Hands Help Us Think. Cambridge, MA: The Belknap Press of Harvard University Press.
Goodwyn, W. Susan and Linda P. Acredolo 1998 Encouraging symbolic gestures: A new perspective on the relationship between gesture and speech. In: Jana M. Iverson and Susan Goldin-Meadow (eds.), The Nature and Functions of Gesture in Children's Communication, 61-73. New Directions for Child Development, no. 79. San Francisco: Jossey-Bass.
Grafton, Scott T., Michael A. Arbib, Luciano Fadiga and Giacomo Rizzolatti 1996 Localization of grasp representations in humans by positron emission tomography - 2. Observation compared with imagination. Experimental Brain Research 112: 103-111.
Grezes, Julie, Jorge L. Armony, James Rowe and Richard E. Passingham 2003 Activations related to "mirror" and "canonical" neurons in the human brain: an fMRI study. NeuroImage 18: 928-937.
Grezes, Julie and Jean Decety 2001 Functional anatomy of execution, mental simulation, observation and verb generation of actions: A meta-analysis. Human Brain Mapping 12: 1-19.
Grush, Rick 2003 In defense of some "Cartesian" assumptions concerning the brain and its operation. Biology and Philosophy 18: 53-93.
Grush, Rick 2004 The emulation theory of representation. Behavioral and Brain Sciences 27: 377-442.
Guillot, Aymeric and Christian Collet 2005 Duration of mentally simulated movement: A review. Journal of Motor Behavior 37: 10-20.
Hesslow, Germund 1994 Will neuroscience explain consciousness? Journal of Theoretical Biology 171: 29-31.
Hesslow, Germund 2002 Conscious thought as simulation of behaviour and perception. Trends in Cognitive Sciences 6: 242-247.
Hill, L. Elisabeth 1998 A dyspraxic deficit in specific language impairments and developmental coordination disorder? Evidence from hand and arm movements. Developmental Medicine and Child Neurology 40: 388-395.
Ingvar, David H. and Lars Philipsson 1977 Distribution of the cerebral blood flow in the dominant hemisphere during motor ideation and motor performance. Annals of Neurology 2: 230-237.
Iverson, Jana M. and Susan Goldin-Meadow 1998 Why people gesture when they speak. Nature 396: 228.
Iverson, Jana M. and Esther Thelen 1999 Hand, mouth and brain - the dynamic emergence of speech and gesture. Journal of Consciousness Studies 6 (11-12): 19-40.
Jeannerod, Marc 1994 The representing brain. Behavioral and Brain Sciences 17 (2): 187-245.
Jeannerod, Marc 1997 The Cognitive Neuroscience of Action. Cambridge, MA: Blackwell Publishers.
Jeannerod, Marc 2001 Neural simulation of action. NeuroImage 14: S103-S109.
Jeannerod, Marc and Jean Decety 1995 Mental motor imagery. Current Opinion in Neurobiology 5: 727-732.
Jeannerod, Marc and Victor Frak 1999 Mental imagining of motor activity in humans. Current Opinion in Neurobiology 9: 735-739.
Johnson, Mark and Tim Rohrer this vol. We are live creatures: Embodiment, American Pragmatism and the cognitive organism.
Johnson, Scott H. 2000 Thinking ahead: the case for motor imagery in prospective action judgements of prehension. Cognition 74: 33-70.
Kilner, James M., Yves Paulignan and Sarah-Jayne Blakemore 2003 An interference effect of observed biological movement on action. Current Biology 13 (6): 522-525.
Kosslyn, Stephen M. 1994 Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Kosslyn, Stephen, Giorgio Ganis and William L. Thompson 2003 Mental imagery: against the nihilistic hypothesis. Trends in Cognitive Sciences 7: 109-111.
Kosslyn, Stephen M. and William L. Thompson 2000 Shared mechanisms in visual imagery and visual perception: Insights from cognitive neuroscience. In: Michael S. Gazzaniga (ed.), The New Cognitive Neurosciences, 975-985. Cambridge, MA: MIT Press.
Krams, Michael, Matthew S. F. Rushworth, Marie-Pierre Deiber, Richard S. J. Frackowiak and Richard E. Passingham 1998 The preparation, execution and suppression of copied movements in the human brain. Experimental Brain Research 120: 386-398.
Lakoff, George 1988 Cognitive semantics. In: Umberto Eco, Marco Santambrogio and Patrizia Violi (eds.), Meaning and Mental Representations, 119-154. Bloomington: Indiana University Press.
Lakoff, George and Mark Johnson 1980 Metaphors We Live by. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson 1999 Philosophy in the Flesh: The Embodied Mind and its Challenges to Western Thought. New York: Basic Books.
Lindblom, Jessica and Tom Ziemke this vol. Embodiment and social interaction: A cognitive science perspective.
Markman, Arthur B. and C. Miguel Brendl 2005 Constraining theories of embodied cognition. Psychological Science 16 (1): 6-10.
Maturana, Humberto and Francisco J. Varela 1980 Autopoiesis and Cognition: The Realization of the Living. Dordrecht, The Netherlands: D. Reidel Publishing.
Maturana, Humberto and Francisco J. Varela 1987 The Tree of Knowledge - The Biological Roots of Human Understanding. Boston: Shambhala.
Mazard, Angelique, Nathalie Tzourio-Mazoyer, Fabrice Crivello, Bernard Mazoyer and Emmanuel Mellet 2004 A PET meta-analysis of object and spatial mental imagery. European Journal of Cognitive Psychology 16: 673-695.
Meltzoff, Andrew N. and Wolfgang Prinz 2002 An introduction to the imitative mind and brain. In: Andrew Meltzoff and Wolfgang Prinz (eds.), The Imitative Mind: Development, Evolution and Brain Bases, 1-15. Cambridge, MA: Cambridge University Press.
Murata, Akira, Luciano Fadiga, Leonardo Fogassi, Vittorio Gallese, Vassilis Raos and Giacomo Rizzolatti 1997 Object representation in the ventral premotor cortex (area F5) of the monkey. Journal of Neurophysiology 78: 2226-2230.
Nielsen, Lisbeth 2002 The simulation of emotion experience. Phenomenology and the Cognitive Sciences 1: 255-286.
Ojemann, A. George 1984 Common cortical and thalamic mechanisms for language and motor functions. American Journal of Physiology 246: R901-R903.
Papaxanthis, Charalambos, Thierry Pozzo, Xanthi Skoura and Marco Shieppati 2002 Does order and timing in performance of imagined and actual movements affect the motor imagery process? The duration of walking and writing task. Behavioural Brain Research 134: 209-215.
Papaxanthis, Charalambos, Marco Shieppati, Randolphe Gentili and Thierry Pozzo 2002 Imagined and actual arm movements have similar durations when performed under different conditions of direction and mass. Experimental Brain Research 123: 447-452.
Pulvermüller, Friedemann, Hubert Preissl, Werner Lutzenberger and Niels Birbaumer 1996 Brain rhythms of language: Nouns versus verbs. European Journal of Neuroscience 8: 937-941.
Pylyshyn, Zenon 1981 The imagery debate: analogue media versus tacit knowledge. Psychological Review 88: 16-45.
Pylyshyn, Zenon 2003 Return of the mental image: are there really pictures in the brain? Trends in Cognitive Sciences 7: 113-118.
Ranganathan, Vinoth K., Vlodek Siemionow, Jing Z. Liu, Vinod Sahgal and Guang H. Yue 2004 From mental power to muscle power - gaining strength by using the mind. Neuropsychologia 42: 944-956.
Rizzolatti, Giacomo and Michael A. Arbib 1998 Language within our grasp. Trends in Neurosciences 21: 188-194.
Rizzolatti, Giacomo, Luciano Fadiga, Leonardo Fogassi and Vittorio Gallese 2002 From mirror neurons to imitation: Facts and speculations. In: Andrew Meltzoff and Wolfgang Prinz (eds.), The Imitative Mind: Development, Evolution and Brain Bases, 247-266. Cambridge, MA: Cambridge University Press.
Rizzolatti, Giacomo, Luciano Fadiga, Vittorio Gallese and Leonardo Fogassi 1996 Premotor cortex and the recognition of motor actions. Cognitive Brain Research 3: 131-141.
Rohrer, Tim this vol. The body in space: Dimensions of embodiment.
Schwoebel, John, Consuelo B. Boronat and H. Branch Coslett 2002 The man who executed "imagined" movements: Evidence for dissociable components of the body schema. Brain and Cognition 50: 1-16.
Schall, Ulrich, Patrick Johnston, Jim Lagopoulos, Markus Jüptner, Walter Jentzen, Renate Thienel, Alexandra Dittmann-Balcar, Stefan Bender and Philip B. Ward 2003 Functional brain maps of tower of London performance. NeuroImage 20: 1154-1161.
Sebanz, Natalie, Günther Knoblich and Wolfgang Prinz 2003 Representing others' actions: just like one's own? Cognition 88 (3): B11-B21.
Henrik Svensson, Jessica Lindblom and Tom Ziemke
Shallice, Tim 1982 Specific impairments of planning. Philosophical Transactions of the Royal Society of London B 298: 199-209.
Shepard, Roger N. and Jacqueline Metzler 1971 Mental rotation of three-dimensional objects. Science 171: 701-703.
Sonesson, Göran this vol. From the meaning of embodiment to the embodiment of meaning: A study in phenomenological semiotics.
Stening, John, Henrik Jacobsson and Tom Ziemke 2005 Imagination and abstraction of sensorimotor flow: Towards a robot model. In: AISB'05: Proceedings of the Symposium on Next Generation Approaches to Machine Consciousness - Imagination, Development, Intersubjectivity and Embodiment, 50-58. The Society for the Study of Artificial Intelligence and the Simulation of Behavior, UK.
Svensson, Henrik and Tom Ziemke 2004 Making sense of embodiment. In: Proceedings of the 26th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum.
Thomas, Nigel J. T. 1999 Are theories of imagery theories of imagination? Cognitive Science 23: 207-245.
Trojano, Luigi, David E. J. Linden, Elia Formisano, Rainer Goebel, Alexander T. Sack and Francesco Di Salle 2004 What clocks tell us about the neural correlates of spatial imagery. European Journal of Cognitive Psychology 16: 653-672.
Varela, Francisco J., Evan Thompson and Eleanor Rosch 1991 The Embodied Mind. Cambridge, MA: MIT Press.
Vingerhoets, Guy, Floris P. de Lange, Pieter Vandemaele, Karel Deblaere and Erik Achten 2002 Motor imagery in mental rotation: An fMRI study. NeuroImage 17: 1623-1633.
Wexler, Mark, Stephen M. Kosslyn and Alain Berthoz 1998 Motor processes in mental rotation. Cognition 68: 77-94.
Wilson, Margaret 2002 Six views of embodied cognition. Psychonomic Bulletin and Review 9 (4): 625-636.
Wolpert, Daniel M. and Mitsuo Kawato 1998 Multiple paired forward and inverse models for motor control. Neural Networks 11: 1317-1329.
Yue, Guang and Kelly J. Cole 1992 Strength increases from the motor program. Comparison of training with maximal voluntary and imagined muscle contractions. Journal of Neurophysiology 67: 1114-1123.
Ziemke, Tom 2003 What's that thing called embodiment? In: Richard Alterman and David Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum.
2004 Embodied AI as science: Models of embodied cognition, embodied models of cognition, or both? In: Fumiya Iida et al. (eds.), Embodied Artificial Intelligence, 27-36. Heidelberg: Springer Verlag.
Ziemke, Tom, Dan-Anders Jirenhed and Germund Hesslow 2005 Internal simulation of perception: A minimal neuro-robotic model. Neurocomputing 68: 85-104.
Zlatev, Jordan this vol. Embodiment, language and mimesis.
Zwaan, Rolf A. 2004 The immersed experiencer: toward an embodied theory of language comprehension. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 44: 35-62. San Diego, CA: Academic Press.
Phenomenological and experimental contributions to understanding embodied experience
Shaun Gallagher
Abstract
Much recent work on the relationship between phenomenology, in the Husserlian tradition, and the contemporary cognitive sciences has focused on the question of embodiment. In this chapter I suggest that the distinction between body image and body schema, once clarified on phenomenological grounds, can contribute to an understanding of embodied experience, and how the body shapes cognition. A clear distinction between these two concepts can be both verified and applied in specific empirical research concerning such issues as intentional action, intermodal perception, neonate imitation, mirror neurons and pathologies that involve unilateral neglect and deafferentation. A consideration of these phenomena also leads to clarifications about the nature of intersubjectivity consistent with both phenomenological insights offered by Husserl and Merleau-Ponty and the most recent neuroscience of social cognition.
Keywords: action, body image, body schema, deafferentation, intersubjectivity, neonate imitation, unilateral neglect.
1. Introduction
It is a long-established principle in phenomenology that a strictly physical (scientific) analysis of the objective body is not sufficient to reveal its contribution to cognition. Phenomenology proposes an analysis of the body as we live it. In the context of the cognitive sciences, however, it is not sufficient to stop with a pure phenomenology of lived experience. And specifically, in the context of the cognitive sciences, one needs to appeal to empirical verifications and clarifications that will confirm phenomenological insight, and then use that insight to interpret the empirical data - a hermeneutical circle, to be sure, but not a methodologically vicious one.
Within this circle the relationship between phenomenological analysis of embodied experience found, for example, in Husserl's writings from 1912-1915 (Husserl 1952) and in Merleau-Ponty (1945), and recent experimental research on a variety of issues related to embodiment, can be one of mutual enlightenment. This is not an uncontroversial claim, and is a matter of ongoing discussion (see e.g., Varela 1996; Gallagher 1997; Gallagher and Varela 2002; Bayne 2004; Overgaard 2004; Zahavi 2004). Rather than enter into this debate here, however, I will adopt the following limited position: a phenomenology that understands intentionality as a form of being-in-the-world, and recognizes the importance of embodied action for shaping perception, offers an interpretational framework different from purely functional or syntactic interpretations of the empirical data. The best way to explicate this framework is actually to go to work on specific issues that lend themselves to both phenomenological and experimental approaches to embodied experience.
In this chapter I will pursue two questions. (1) To what extent and in what way is one's body part of one's perceptual field? This question is clearly open to both empirical and phenomenological analysis. (2) How does the body shape perception, or more generally, cognition? In this case, phenomenology can point to certain "prenoetic" performances of the body; but empirical science is required to clarify such performances.
Psychologists already have a relatively developed way of addressing the first question, about the appearance of the human body in the perceptual field, or more generally, about the image that a person has of their own body. In most instances, this is referred to as a body image. The extensive literature on body image, however, is problematic.
It is not only wide-ranging - the concept is employed and applied in a great variety of fields, from neuroscience to philosophy, from the medical sciences to the athletic sciences, from psychoanalysis to aeronautical psychology and robotics - but as often happens in such cases, the term changes meaning from one field or discipline to the next, from one author to the next, and sometimes even within a single author (Gallagher 1986, 2005). Problems about the meaning of the term body image are also bound up with the use of another term, body schema. But this is not just a terminological issue. There are more deep-seated conceptual confusions involved. Precisely such confusions, which can lead to problems involving experimental design and the interpretation of experimental results, motivate some authors to suggest that we ought to give these terms up, abandon them to history, and formulate alternative descriptions of embodiment (e.g., Poeck
and Orgass 1971; Gallese 2005). I argue here that with respect to our two questions, a clear phenomenologically-based conceptual distinction between body image and body schema can do some useful work despite the ambiguity involved in the historical use of these concepts. Each concept addresses a different sort of question. The concept of body image helps to answer the first question about the appearance of the body in the perceptual field; in contrast, the concept of body schema helps to answer the question about how the body shapes the perceptual field. So these terms and concepts, if properly clarified, provide a way to explicate the role embodiment plays in conscious and cognitive experience.
2. Body image and body schema
Rather than rehearse the long history of conceptual confusion regarding these terms (see Gallagher 1986, 1995, 2005), let me go directly to the conceptual distinction between body image and body schema. We can then examine how this distinction contributes to understanding several phenomena studied in the cognitive sciences literature, including unilateral neglect and deafferentation (i.e., a loss of peripheral sensory input), neonate imitation and our ability to understand others.
Phenomenological reflection tells us that there is a difference between taking an intentional attitude towards one's own body (having a perception of, or belief about, or emotional attitude towards one's body) and having a capacity to move or to exist in the action of one's own body. The concepts of body image and body schema correspond to this phenomenological difference. Body image is a (sometimes conscious) system of perceptions, attitudes and beliefs pertaining to one's own body. Body schema is a system of processes that constantly regulate posture and movement: sensory-motor processes that function without reflective awareness or the necessity of perceptual monitoring. The distinction between body image and body schema is not an easy one to make because behaviorally the two systems interact and are highly coordinated in the context of intentional action, and in pragmatic and socially contextualized situations. A conceptual distinction is nonetheless useful
precisely in order to understand the complex dynamics of bodily movement and experience. 1
The body image, consisting of a complex set of intentional states - perceptions, beliefs and attitudes - in which the intentional object of such states is one's own body, involves a form of reflexive or self-referential intentionality. Studies involving body image (e.g., Cash and Brown 1987; Gardner and Moncrieff 1988; Powers et al. 1987) frequently distinguish among three of these elements:
a) the subject's perceptual experience of his/her own body;
b) the subject's conceptual understanding (including folk knowledge and/or scientific knowledge) of the body in general; and
c) the subject's emotional attitude toward his/her own body.
Although a conceptual understanding and emotional attitude do not necessarily involve an on-going conscious awareness, they are maintained as sets of beliefs or attitudes, and in that sense form part of an intentional system. Conceptual and emotional aspects of the body image are no doubt affected by various cultural and interpersonal factors (see e.g. Roth this volume, Sonesson this volume, Zlatev this volume). It is also the case, as I will suggest below, that the perceptual content of the body image originates in intersubjective perceptual experience.
1. In making this conceptual distinction, however, I am not reaffirming the traditional distinction between perception and action, although I am affirming a phenomenological distinction between the perception of my body and my bodily action. Intentional action, for example, clearly involves perception, but does not clearly involve a perceptual monitoring of my body. Furthermore, I am not making any claim about the neurological underpinnings of body image and body schematic processes. There is good reason to think that body image and body schematic processes are in part underlain by neuronal activity in the same or similar brain areas (see, e.g., Gallese 2005).
In contrast to the body image, a body schema is not a perception, a belief, or an attitude. Rather it is a system of motor functions or motor programs that operate below the level of self-referential intentionality. It involves a set of tacit performances - preconscious, subpersonal processes that play a dynamic role in governing posture and movement. In most instances, movement and the maintenance of posture are accomplished by the close to automatic performances of a body schema, and for this very reason the normal adult subject, in order to move around the world, neither needs
nor has a constant body percept. In this sense the body tends to efface itself in most normal activities that are geared into external goals. To the extent that one does become aware of one's own body in terms of monitoring or directing perceptual attention to limb position, movement, or posture, such awareness helps to constitute the perceptual aspect of a body image. Such awareness may then interact with a body schema in complex ways, but it is not equivalent to a body schema itself. I said that a body schema operates in a close to automatic way. This does not mean that its operations are a matter of reflex. Movements controlled by a body schema can be precisely shaped by the intentional experience or goal-directed behavior of the subject. If I reach for a glass of water with the intention of drinking from it, my hand shapes itself in a precise way for picking up the glass, and it does this completely outside my awareness. But the shape that it takes on is in complete conformity with my intention (Jeannerod 1997; Jeannerod and Gallagher 2002). Thus it is important to note that although a body schema is not itself a form of consciousness, or in any way a cognitive operation, it can enter into and support intentional activity, including cognition. In this sense motor action is not completely automatic; it is usually part of a voluntary, intentional project. When I walk across the room to greet someone or jump to catch a ball in the context of a game, my actions may be explicitly willed, and governed by my perception of objects or persons in the environment. My attention and even my complete awareness in such cases, however, are centered on the other person or the ball, and not on the precise accomplishment of locomotion. The body moves smoothly and in a coordinated fashion not because I perceptually monitor or have an image of my bodily movement, but because of the coordinated functioning of a body schema. 
It is also the case that a body image or percept can contribute to the control of movement. The visual, tactile and proprioceptive attentiveness that I have of my body may help me to learn a new dance step, improve my tennis game, or imitate the novel movements of others. In learning a new movement in such contexts, for example, I may consciously monitor and correct my movement. In other cases, when there is a physical threat, my movement may involve a large amount of perceptual monitoring and willed conscious control. Even in such cases the contribution made to the control of movement by my perceptual awareness of my body will always find its complement in capacities that are defined by the operations of a body schema that continues to function to maintain balance and enable movement. Such operations are always in excess of what I can be aware of.
Thus, a body schema is not reducible to a perception of the body; it is never equivalent to a body image. Am I always conscious of my own body as an intentional object, or as part of an intentional state of affairs? The distinction between consciously attending to the body and being marginally aware of the body is important. As I suggested, sometimes we do attend specifically to some aspect or part of the body. But in much of our everyday experience, and most of the time, our attention is directed away from the body, toward the environment or toward some project we are undertaking. Do we remain consciously aware of some aspect or part of the body even in cases where our attention is not directed toward the body? Such awareness may vary by degree among individuals. Some people may be more aware, others, at times, not at all aware of their body. If I am solving a difficult mathematical problem, am I also and at the same time aware of the position of my legs or even of my grip on the pencil, or are these things so much on "automatic pilot" that I do not need to be aware of them? In any case, to define the difference between body image and body schema it is not necessary to determine to what extent we are conscious of our bodies. It suffices to say that sometimes we are attentive to or aware of our bodies; other times we are not. A body image is inconstant in this sense. When I am marginally aware that I am moving in certain ways, my awareness may not capture the whole movement. If I am marginally aware that I am reaching for something, I may not be aware at all of the fact that for the sake of balance my left leg has stretched in a certain way, or that my toes have curled against the floor. Posture and the majority of bodily movements operate in most cases without the help of a body image. 
As distinct from body image, the body schema system involves a prenoetic performance of the body, that is, a performance that helps to structure our experience, but does not explicitly show itself in the contents of consciousness. That a body schema operates in a prenoetic way means that it does not depend on a consciousness that targets or monitors bodily movement. This is not to say that it does not depend on consciousness at all. For certain motor programs to work properly, I need information about the environment, and this is most easily received by means of perception. In my intentional actions the body acquires a certain organization or style in its relations with its environment. For example, it appropriates certain habitual postures and movements; it incorporates various significant parts of its environment into its own schema. The carpenter's hammer becomes an operative extension of the carpenter's hand, or, as Head (1920) noted, the
body schema can extend to the tip of the blind man's cane. The system that is the body schema allows us to actively engage with our environment without the requirement of a reflexive conscious monitoring directed at the body. It is a dynamic, operative performance of the body, rather than a copy, image, or conceptual model of it. In so far as I am conscious of what I am doing, the content of my consciousness is not about my body, but is specified in its most pragmatic meaning. That is, if I were to formulate the content of my consciousness when reaching to get a drink, it would not be in terms of operating or stretching muscles, bending or unbending limbs, turning or maintaining balance; it would not even be in terms of reaching and grasping. Rather, if I were stopped and asked what I was doing, I would say something like "I'm getting a drink". The details of bodily movement entailed in the action are just as much hidden in my pre-reflective experience as they are hidden in that description. I am aware of my bodily action not as bodily action per se, but as action at the level of my intentional project. Thus, prenoetic functions underpin and affect my experience, and are subsumed into larger intentional activities. In this sense, detailed aspects of movement (such as the contraction of certain muscles), even if we are not aware of them (even if they are not explicitly intentional), are intentional insofar as they are part of a larger intentional action. Establishing a conceptual distinction between body image and body schema is only the beginning of an explication of the role played by the body in action and cognition. There are reciprocal interactions between prenoetic body schemas and cognitive experiences, including normal and abnormal consciousness of the body. Such behavioral relations between body image and body schema can be worked out in detail, however, only if the conceptual distinctions between them are first understood. 
Armed with this phenomenological distinction we can seek verifications and clarifications in the empirical literature.
3. Unilateral neglect and deafferentation
In sciences like psychology a good way to verify that a conceptual distinction between X and Y is valid is to identify a double dissociation, that is, to identify a case in which we find X but not Y, and another case in which we find Y but not X. For the distinction that we have been discussing, importantly, it is possible to find cases in which a subject has an intact body
image but a dysfunctional body schema, and vice versa. For example, one finds evidence of an intact body schema but the absence of a completely intact body image in some cases of unilateral neglect. Denny-Brown and his colleagues report that a patient, following stroke, who suffers from a neurologically caused defect in perception related to the left side, fails to notice the left side of her body. She excludes it from her body image. She fails to dress her left side or comb the hair on the left side of her head. Yet there is no motor weakness on that side. Her gait is normal, although if her left slipper comes off while walking she fails to notice. Her left hand is held in a natural posture most of the time, and is used quite normally in movements that require the use of both hands, for example, buttoning a garment or tying a knot. Thus she uses the motor ability of the neglected side, to dress the right side of her body (Denny-Brown, Meyer and Horenstein 1952; similar cases are reported by Ogden 1996 and Pribram 1999). In such cases, the patient's body schema system is intact despite her problems with body image on the neglected side. Dissociation of the opposite kind can be found in rare cases of deafferentation. A subject (IW) who has lost tactile and proprioceptive input from the neck down can control his movement only by cognitive intervention and visual guidance of his limbs. In effect he employs his body image (primarily a visual perception of his body) in a unique way to make up for the impairment of his body schema (see Cole 1995; Gallagher and Cole 1995). Proprioception is that bodily sense which allows us to know how our body and limbs are positioned. If a person with normal proprioception is asked to sit, close their eyes and point to their knee, it is proprioception that allows them to successfully guide their hand and find their knee. If IW is asked to close his eyes and point to his knee, he has some difficulty. 
If, in this situation, I move either his knee or his arm, he is unable to point to his knee since, without vision or proprioception, he does not know where either his knee or his hand are located. He assumes that they are in exactly the same location as when he last saw them and he moves his hand so as to point to where he remembers his knee to have been. Because of the loss of proprioception and tactile sense IW does not have a sense of where his limbs are or what posture he maintains without visual perception. In order to maintain motor control he must conceptualize his movements and keep certain parts of his body in his visual field. His movement requires constant visual and mental concentration. In darkness he is unable to control movement; when he walks he cannot daydream but must concentrate on his movement constantly. When he writes he needs to
concentrate on both his body posture and on holding the pen. Maintaining posture is, for him, a task rather than an automatic process. In terms of the distinction between body image and schema, IW has lost major aspects of his body schema, and thereby the possibility of normally unattended movement. He is forced to compensate for that loss by depending on his body image in a way that normal subjects do not. For him, control over posture and movement are achieved by a partial and imperfect functional substitution of body image for body schema. Proprioception is a major source of information for the maintenance of posture and the governance of movement - that is, for the normal functioning of the body schema. But proprioception is not the only possible source for the required information. IW, as a result of extreme effort and hard work, recovered control over his movement and regained a close to normal life. It is important to understand that he did not do this by recovering proprioceptive sense. In strict physiological terms, he has never recovered from the original problem. His proprioception has not been repaired. He is able to address the motor problem on a behavioral level, however, primarily by using an enhanced body image to help control movement. This case, in terms of the body image-body schema distinction, is just the opposite to neglect. If the neglect patient is capable of controlled movement even on the neglected side because of an intact body schema, IW, who is unable to depend on a body schema, must employ his body image to guide his movement. In complete contrast to neglect, IW is required to pay an inordinately high degree of attention to his body. Thus, cases of unilateral neglect and deafferentation, and the double dissociation implied, begin to provide logical and empirical reasons for thinking that there is a useful distinction to be made between body schema and body image.
It is also to be noted, however, that the distinction between body image and body schema can help to make sense out of such cases.
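The logical form of the double-dissociation argument in this section can be made explicit in a minimal sketch. The Python below is illustrative only: the names `Case` and `is_double_dissociation` are hypothetical, and the yes/no coding of "intact" is a deliberate simplification, since the clinical reports describe partial impairments (a body image that is not "completely intact", a body schema that has lost "major aspects") rather than all-or-nothing losses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One clinical case, coded on two capacities (simplified to yes/no)."""
    label: str
    body_image_intact: bool
    body_schema_intact: bool


def is_double_dissociation(cases):
    """Return True if one case shows X without Y and another shows Y without X."""
    image_without_schema = any(
        c.body_image_intact and not c.body_schema_intact for c in cases
    )
    schema_without_image = any(
        c.body_schema_intact and not c.body_image_intact for c in cases
    )
    return image_without_schema and schema_without_image


# Assumed coding of the two cases discussed in the text.
cases = [
    Case("unilateral neglect (neglected side)",
         body_image_intact=False, body_schema_intact=True),
    Case("deafferentation (IW)",
         body_image_intact=True, body_schema_intact=False),
]

print(is_double_dissociation(cases))
```

On this coding, neither case alone establishes the distinction; only the pair does, which is why the two pathologies are presented together.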
4. Neonate imitation
Prior to the development of a body image or a body schema in a small child, is it the case that something like a less embodied consciousness exists? Less embodied may even mean less structured, along the lines of William James' (1890) famous phrase about the "blooming, buzzing confusion" of the infant's experience. It is not unusual to find proponents of the view that conscious experience is in some way the developmental source
for both the body image and the body schema. Indeed, this is the traditional view in both psychology and philosophy. An empiricist, for example, might hold that a body image is generated only on the basis of the prolonged perceptual experience that one has of one's own body. Conceptual and emotional aspects of the body image, and the structural aspects that the body image brings with it, are obviously traceable to certain early and originary experiences that the child may have in tactile, visual and other sensations of the body. It might also be thought that a body schema originates only through the conscious experience of movement. Much as we learn habits through practice, we learn to control our movements through the practiced experience of movement. This seems to be the case in examples we referred to before, such as in the learning of a new dance movement. It seems more obviously true of learning to crawl and to walk. On this view, then, conscious experience is at the origin of such things as body image and body schema. Thus, a certain kind of consciousness, primitive and perhaps disorganized, would predate the consciousness that is shaped and structured by embodiment. This traditional view assumes that the newborn infant has no body image or body schema, and that such things are acquired through prolonged experience in infancy and early childhood. This view has been worked out in a number of ways and in a variety of contexts in scientific and philosophical discussions. Up until about thirty years ago this position was the almost unanimous consensus among developmental theorists. At that time, however, on several fronts, new evidence was developed in support of a more nativist position. The idea that body schemas may in fact be innate was put forward, for example, in studies of phantom limbs in cases of congenital absence of limb (for further discussion of aplasic phantoms, see Gallagher 2005; Gallagher et al. 1998; Gallagher and Meltzoff 1996). 
In the 1970s, in studies of neonate imitation (Meltzoff and Moore 1977), further evidence was provided to show that certain elements of what previously were understood to be learned motor behaviors were in fact already present in the newborn. The traditional view is that the body schema is an acquired phenomenon, built up in experience, the product of development. This traditional view is well represented by the developmental psychologist Marianne Simmel (1958, 1962, 1966), one of the few psychologists who makes a clear distinction between body schema and body image. The body schema, Simmel claims, is
[...] built up as a function of the individual's experience, i.e., it owes its existence to the individual's capacity and opportunity to learn. This means that at some early time in the development of the human organism the schema has not yet been formed, while later [...] it is present and is characterized by considerable differentiation and stability. (1958: 499)
This position is also held by Merleau-Ponty (1945), who was greatly influenced by his study of developmental psychology and by the psychologists and psychological research he cited, including the work of Piaget (1945), Wallon (1925), Guillaume (1943) and Lhermitte (1939). Although Merleau-Ponty rightly conceives of the body schema as an anterior condition of possibility, a dynamic force of integration that cannot be reduced to the sum "of associations established during experience", still, in terms of development, the operations of the body schema are "'learnt' from the time of global reactions of the whole body to tactile stimuli in the baby [...]" (1945: 101, 122n). The body schema functions as if it were an "innate complex" (1945: 84), that is, as strongly and pervasively as if it were innate, but, as an acquired habit with a developmental history, it is not actually innate. Following Wallon (1925), Merleau-Ponty believed that experience begins by being interoceptive, and that the newborn is without external perceptual ability (1960: 121). James's "blooming, buzzing confusion" does not begin to be resolved until between the third and sixth month of life when a collaboration takes place between the interoceptive and exteroceptive domains - a collaboration that simply does not exist at the beginning of life (Merleau-Ponty 1945: 121). On this view, one reason for the lack of any organized exteroceptive perception is precisely the absence of a "minimal bodily equilibrium", an equilibrium that must be sorted out between a developing body schema and the initial and still very primitive stages of a body image. For Merleau-Ponty, motor experience and perceptual experience are dialectically or reciprocally linked. The mature operation of a body schema depends on a developed perceptual knowledge of one's own body; and the organized perception of one's own body, and then of the external world, depends on a proper functioning of the body schema. 2
2. "Up to that moment [exteroceptive] perception is impossible. [...] The operation of a postural schema - that is, a global consciousness of my body's position in space, with the corrective reflexes that impose themselves at each moment, the global consciousness of the spatiality of my body - all this is necessary for [exteroceptive] perception (Wallon)" (Merleau-Ponty 1960: 122).
The infant does not yet have a body schema, according to Merleau-Ponty, because of a certain lack of neurological development. The development of the body schema can happen only in a gradual and fragmentary way as the central nervous system develops. Motor schemas are then gradually integrated, and in a reciprocal system with external perception and sensory inputs, become "precise, restructured and mature little by little" (1960: 123). Simmel and Merleau-Ponty are good representatives of the traditional view that both body schema and body image are acquired through experience. This view clearly implies what is possible and what is not possible in the conscious experience of the infant: conscious experience is disorganized; exteroceptive perception is impossible. Also, according to this view, the capacity for imitation - an important capacity directly related to questions about perception, social recognition, the ability to understand another person and the origins of a sense of self - is non-existent in infants younger than 12 months. 3
Jean Piaget (1945) expresses this view in the most precise terms. The question is about a certain kind of imitation called "invisible imitation". Piaget defines invisible imitation as the child's imitation of another person's movements using parts of the child's body that are invisible to the child. For example, if a child does not see its own face, is it possible for the child to imitate the gesture that appears on another person's face? Piaget's answer is that at a certain point in development it is possible; but in early infancy it is not. The reason is that invisible imitation requires the operation of a relatively mature body schema. Thus, according to Piaget (as well as most other classical theorists of development), invisible imitation is not possible prior to 8 to 12 months of age.
The intellectual mechanisms of the [child under 8 months] will not allow him to imitate movements he sees made by others when the corresponding movements of his own body are known to him only tactually or kinesthetically, and not visually (as, for instance, putting out his tongue). [...] Thus since the child cannot see his own face, there will be no imitation of movements of the face at this stage. [...] For imitation of such movements to be possible, there must be co-ordination of visual schemas with tactilokinesthetic schemas [...]. (Piaget 1945: 19, 45)
3. Merleau-Ponty (1945: 352) does recognize this kind of ability in an infant at 15 months.
Phenomenological and experimental contributions
Merleau-Ponty follows Guillaume and Piaget in regard to these issues.
Thus, to imitate [...] it would be necessary for me to translate my visual image of the other's [gesture] into a motor language. The child would have to set his facial muscles in motion in such a way as to reproduce [the visible gesture of the other]. [...] If my body is to appropriate the conducts given to me visually and make them its own, it must itself be given to me not as a mass of utterly private sensations but instead by what has been called a "postural" or "corporeal schema". (Merleau-Ponty 1960: 116-117)
In complete contrast to this traditional view, studies on imitation in infants conducted by Meltzoff and Moore (1977, 1983) show that invisible imitation does occur in newborns. Their experiments, and others that replicate and extend their results (see Meltzoff and Moore 1994 for summary), show that newborn infants less than an hour old can indeed imitate facial gestures. A brief review of several of their experiments will help to clarify the results and their relevance to the issues of body schema, body image and intermodal perception.
Meltzoff and Moore (1983): 40 normal and alert newborn infants ranging in age from less than 1 hour to 71 hours were tested. The experimenter presented each infant with a mouth-opening gesture over a period of 4 minutes, alternating in 20-second intervals between the mouth opening and a passive facial appearance. The same procedure was then followed using tongue protrusion as the target gesture. The study showed a clear and statistically significant result in terms of both the frequency and duration of the infants' response gestures, demonstrating that normal and alert newborn infants systematically imitate adult gestures of mouth opening and tongue protrusion. Notably, even the youngest infant in the study, 42 minutes old at the time of the test, showed a strong imitation effect. Other experiments have extended the range of gestures that young infants imitate to a wider set, including lip protrusion, sequential finger movement, head movements, smiling, frowning and surprised expressions.
Meltzoff and Moore (1977): this study showed some form of memory to be involved in early imitation. Infants between 16 and 21 days of age imitated facial gestures after a delay. This involved putting a pacifier in the infant's mouth as it was shown a facial gesture. After the presentation of the facial gesture was complete, the pacifier was removed and the infant imitated the gesture. Thus, imitative responses were delayed and only allowed when the gesture had vanished from the perceptual field. Experiments also show that infants imitated after a delay of up to 24 hours, and that infants improve their gestural performance over time (Meltzoff and Moore 1994). Their first attempts at imitation do not necessarily replicate the seen gesture with a high degree of accuracy. When tongue protrusion is displayed, infants quickly activate the tongue; but they improve their motor accuracy over successive efforts.4
What aspects of embodiment allow for these possibilities in the neonate? There are two things that need to be considered. First, a relatively developed body schema already existing at birth. If we follow the logic expressed by the proponents of the traditional view, namely, that imitation requires a developed body schema, then the studies on newborn imitation suggest that there is at least a primitive body schema from the very beginning. This is an innate body schema sufficiently developed at birth to account for the ability to move one's body in appropriate ways in response to environmental stimuli, and specifically for the possibility of invisible imitation. Here I use the word "innate" to mean, literally, "something existing prior to birth". Second, an intermodal sensory system is required to enable the infant to recognize a structural equivalence between itself and the other person. A large number of experiments have now been done to show that perception is intermodal from the very beginning (for summary see Meltzoff 1993; Gallagher and Meltzoff 1996). In an intermodal system, proprioception and vision are already in communication with each other. In certain cases, what I see automatically gets translated into a proprioceptive sense of how to move. Proprioception and vision are intermodally linked in several ways, and these linkages are part of a more general link between sensory and motor activities. For example, and quite relevant to the possibility of neonate imitation, both proprioception and vision are integrated with vestibular information about head motion and orientation.5 Importantly, these structures, involving self-awareness, are mature at birth. Thus in the case of neonate imitation, the imitating subject depends on a complex background of embodied processes, a body-schema system involving visual, proprioceptive and vestibular information. In the foreground, what the infant sees gets translated into a proprioceptive awareness of her own relevant body parts; and proprioceptive information allows her to move those parts so that her proprioceptive awareness matches up to what she sees. This intermodal intra-corporeal communication, then, is the basis for an inter-corporeal communication. Just here we can postulate the beginnings of a body image - based on the infant's sense that the face of the other person is like its own face, defined pragmatically, as something it can move in the same way. This has profound implications for the child's relations with others. Meltzoff and Moore (1997) propose a psychological-cognitive model, a set of theoretical black boxes representing "comparison function", "act equivalence", "recognition of my own capability", etc. Here I want to suggest that both phenomenology and neuroscience are required to open up these black boxes and to show that the embodied processes that make imitation possible make sense only in a larger context defined in terms of intersubjectivity.
4. The findings of imitation under these experimental conditions rule out "reflexes" or release mechanisms as potential mediators of this activity. Reflexes and release mechanisms are highly specific - that is, narrowly circumscribed to limited stimuli. One cannot have a reflex or release mechanism for imitation in general. As a result, the range of behaviors displayed by infants would require the unlikely postulate of distinct reflexes or release mechanisms for each kind of imitative behavior: tongue protrusion, tongue protrusion to one side, mouth openings, smile, frown, etc. While it may not be difficult to imagine how evolution might provide for a reflex smile, it is difficult to understand why it would furnish a reflex for angular tongue protrusion. Furthermore, neither delayed response nor improvement in response is compatible with a simple reflex or release mechanism.
5. The vestibular nucleus, a relatively large midbrain structure, serves as a complicated integrative site where first-order information about head position is integrated with whole-body proprioceptive information from joint receptors and oculo-motor information about eye movement. This integrated, multimodal information projects to the thalamus, informing connections that project to cortical areas responsible for control of head movement. Vestibular neurons in the parietal lobe respond to vestibular stimulation, but also to somatosensory and optokinetic stimuli, and more generally there is cortical integration of information concerning self-motion, spatial orientation, and visuo-motor functions (Guldin, Akbarian, and Grüsser 1992; Jouen and Gapenne 1995).
5. Intersubjectivity
Husserl, in texts from 1906-1913, suggests that our understanding of others involves processes that happen on the level of bodily sensations, and that this provides access to others that predates or prefigures anything that would involve inference or analogy. For Husserl, understanding another person is not a matter of intellectual inference but a matter of sensory activations that are unified in or by the animate organism or lived body that is perceiving another animate organism. But Husserl's thought here is offered more as a question than as a confirmed view. He asks: "Can what effects the unitary lived embodiment [Leiblichkeit] extend itself to the separate and movable bodies in the spatial world?" (1973: 33). And here he suggests that the perception of another person's body as object [Körperwahrnehmung] is in some way different from the perception of another person's lived body [Leibwahrnehmung]. These suggestions are reminiscent of Husserl's writings in Ding und Raum (1907). There he talks about kinaesthetic sensations that are activated in perception. When I perceive something, the sensory activation involved is joined by a corresponding activation of kinaesthetic sensations in my lived body. If Husserl were Merleau-Ponty he would have put it in this way: my body reverberates with the things of the world. Kinaesthesia is the sensory experience of one's own movement and is closely related to proprioception or position sense. Quite simply, when I move, I have a pre-reflective and recessed sense of moving. Husserl's claim, however, is not simply that we have kinaesthetic sensation when we move, but that we have kinaesthetic sensation when we perceive something - the something that we perceive registers in a certain way within our proprioceptive-kinaesthetic system, or more generally, within our body schema. Specifically in regard to intersubjectivity, when we see someone else act in a certain way, our own body-schematic system is activated. This kind of process is directly relevant to imitation, and in part, is what provides us with a primary understanding of the other person.6
What remains somewhat tentative in Husserl is fully developed in Merleau-Ponty. It is consistent with his phenomenological insights to say that we emerge from pre-natal life immersed in a set of prenoetic, natural processes and these make up the complex sensory fields upon which appear the world and others. In regard to intersubjectivity, Merleau-Ponty puts things in the right order: "The very first of all cultural objects, and the one by which all the rest exist, is the body of the other person as the vehicle of a form of behavior" (1945: 348). He takes one's own lived body to be the locus of intersubjective experience:
I have the world [and others ...] through the agency of my body as the potentiality of this world. [...] [B]etween this phenomenal body of mine and that of another as I see it from the outside, there exists an internal relation which causes the other to appear as the completion of the system. (1945: 350, 352)
And in a much later text he continues the same thought: "My corporal schema is a normal means of knowing other bodies..." (Merleau-Ponty 1956-1960: 218). This is what he terms intercorporeality. Recent studies in neuroscience suggest that there are specific neurophysiological processes that can account for this intercorporeality, understood as a body-schematic reverberation that depends on the close intermodal connections between visual perception, kinaesthetic-proprioception and motor behavior. These are body-schematic, and specifically motor processes that operate prenoetically, as general conditions of possibility for motor stability and control. They are also directly related to the possibility of imitation. I refer here to what neuroscientists now describe as processes that involve mirror neurons and resonant systems (Gallese 1998; Gallese et al. 1996; Rizzolatti et al. 1996). Mirror neurons link up motor processes with visual ones in ways that are directly relevant to the possibility of imitation. When I see another person act in a certain way, the neurons activated in the pre-motor cortex are precisely the same neurons that are activated when I act in the same way. More generally, overlapping brain areas, or "shared representations" in the motor, premotor and prefrontal cortexes, are activated in the following conditions: during motor action, during the observation of another's motor action, during the imaginative enactment (conscious simulation) of my own or another's motor action, and during preparation for imitating the other (Georgieff and Jeannerod 1998; Grezes and Decety 2001; Jeannerod 2001; Ruby and Decety 2001). That is, the same neuronal areas are activated when I engage in intentional action and when I see or imagine such action performed by another person. What both the phenomenology and the neuroscience show is that my intersubjective understanding of others is not a purely intellectual accomplishment. I perceive the emotions and the intentions of the other person in their bodily movements and gestural expressions, and in doing so, my own embodiment acts as the template for understanding (see Gallagher 2001).
6. For the notion of primary understanding, see Dilthey (1926). Scheler (1913 [1948]: 254) is of the same mind and insists that this primary understanding is perceptual in nature. "For we certainly believe ourselves to be directly acquainted with another person's joy in his laughter, with his sorrow and pain in his tears, with his shame in his blushing, with his entreaty in his outstretched hands [...] And with the tenor of his thoughts in the sound of his words. If anyone tells me that this is not 'perception', for it cannot be so, in view of the fact that a perception is simply a 'complex of physical sensations' [...] I would beg him to turn aside from such questionable theories and address himself to the phenomenological facts." For the relevance of a direct perceptual or primary understanding in the context of contemporary debates, see Gallagher (2001, 2004).
6. Conclusion
My primary concern has been to show in a partial way, but in sufficient detail, how embodiment provides certain innate capacities that enable and condition our experience of ourselves and others. There is much more to say in regard to all of these issues, but I hope that I have given sufficient indication of how the distinction between body image and body schema can be used in a productive way. More specifically, I think the use of these concepts entails a multi-disciplinary approach that involves both phenomenology and the empirical studies of psychology and neuroscience. This is a two-way process, however. Not only does the phenomenological distinction between body image and body schema receive verification and clarification by appealing to empirical evidence, but the distinction itself goes some distance towards the clarification of a variety of issues in the empirical literature. Only through this combination of disciplines can we begin to map out the details of how the body shapes our mental experience.
References
Bayne, Timothy
2004 Closing the gap? Some questions for neuro-phenomenology. Phenomenology and the Cognitive Sciences 3 (4): 349-364.
Cash, Thomas F. and Timothy A. Brown
1987 Body image in anorexia nervosa and bulimia nervosa: A review of the literature. Behavior Modification 11: 487-521.
Cole, Jonathan
1995 Pride and a Daily Marathon. Cambridge, MA: MIT Press.
Denny-Brown, Derek, John S. Meyer and Simon Horenstein
1952 The significance of perceptual rivalry resulting from parietal lesion. Brain 75: 433-471.
Dilthey, Wilhelm
1926 The understanding of other persons and their life-expressions. Trans. Kurt Mueller-Vollmer. In: Kurt Mueller-Vollmer (ed.), The Hermeneutics Reader, 152-164. New York: Continuum, 1988.
Gallagher, Shaun
1986 Body image and body schema: A conceptual clarification. Journal of Mind and Behavior 7: 541-554.
1995 Body schema and intentionality. In: Jose Bermudez, Naomi Eilan and Anthony Marcel (eds.), The Body and the Self, 225-244. Cambridge: MIT/Bradford Press.
1997 Mutual enlightenment: recent phenomenology in cognitive science. Journal of Consciousness Studies 4 (3): 195-214.
2001 The practice of mind: Theory, simulation, or interaction? Journal of Consciousness Studies 8 (5/7): 83-107.
2004 Hermeneutics and the cognitive sciences. Journal of Consciousness Studies 11 (10/11): 162-174.
2005 How the Body Shapes the Mind. Oxford: Oxford University Press.
Gallagher, Shaun, George Butterworth, Adina Lew and Jonathan Cole
1998 Hand-mouth coordination, congenital absence of limb and evidence for innate body schemas. Brain and Cognition 38: 53-65.
Gallagher, Shaun and Jonathan Cole
1995 Body schema and body image in a deafferented subject. Journal of Mind and Behavior 16: 369-390.
Gallagher, Shaun and Andrew Meltzoff
1996 The earliest sense of self and others: Merleau-Ponty and recent developmental studies. Philosophical Psychology 9: 213-236.
Gallagher, Shaun and Francisco Varela
2002 Redrawing the map and resetting the time: Phenomenology and the cognitive sciences. In: Steven Crowell, Lester Embree and Samuel J. Julian (eds.), The Reach of Reflection: The Future of Phenomenology, 17-45. ElectronPress; Reprinted in Canadian Journal of Philosophy, Supplementary Volume 29, 2003: 93-132.
Gallese, Vittorio
1998 Mirror neurons: From grasping to language. Paper read at Tucson III Conference: Towards a Science of Consciousness (Tucson 1998).
2005 Embodied simulation: From neurons to phenomenal experience. Phenomenology and the Cognitive Sciences 4 (1): 23-48.
Gallese, Vittorio, Luigi Fadiga, Leonardo Fogassi and Giacomo Rizzolatti
1996 Action recognition in the premotor cortex. Brain 119: 593-609.
Gardner, R. M. and C. Moncrieff
1988 Body image distortion in anorexics as a non-sensory phenomenon: A signal detection approach. Journal of Clinical Psychology 44: 101-107.
Georgieff, Nicolas and Marc Jeannerod
1998 Beyond consciousness of external events: A "Who" system for consciousness of action and self-consciousness. Consciousness and Cognition 7: 465-477.
Grezes, Julie and Jean Decety
2001 Functional anatomy of execution, mental simulation, observation and verb generation of actions: A meta-analysis. Human Brain Mapping 12: 1-19.
Guldin, W. O., Schahram Akbarian and O. J. Grüsser
1992 Cortico-cortical connections and cytoarchitectonics of the primate vestibular cortex: A study in squirrel monkeys (Saimiri sciureus). Journal of Comparative Neurology 326: 375-401.
Guillaume, P.
1943 Psychologie. Paris: Presses Universitaires de France.
Head, Henry
1920 Studies in Neurology. Vol. 2. London: Oxford University Press.
Husserl, Edmund
1907 Ding und Raum. Husserliana 16. The Hague: Martinus Nijhoff, 1973; English translation, Richard Rojcewicz. Thing and Space: Lectures of 1907. Dordrecht: Kluwer Academic.
1952 Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie, Walter Biemel (ed.), Husserliana 4. Dordrecht: Kluwer. English translation: Richard Rojcewicz and Andre Schuwer. Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy Second Book: Studies in the Phenomenology of Constitution. Dordrecht: Kluwer. 1989.
1973 Zur Phänomenologie der Intersubjektivität. Texte aus dem Nachlaß. Erster Teil. 1905-1920. I. Kern (ed.). Dordrecht: Kluwer Academic.
James, William
1890 The Principles of Psychology. New York: Dover, 1950.
Jeannerod, Marc
1997 The Cognitive Neuroscience of Action. Oxford: Blackwell Publishers.
2001 Neural simulation of action: A unifying mechanism for motor cognition. NeuroImage 14: 103-109.
Jeannerod, Marc and Shaun Gallagher
2002 From action to interaction: An interview with Marc Jeannerod. Journal of Consciousness Studies 9 (1): 3-26.
Jouen, François and Olivier Gapenne
1995 Interactions between the vestibular and visual systems in the neonate. In: Pierre Rochat (ed.), The Self in Infancy: Theory and Research, 277-301. Elsevier Science B. V.
Lhermitte, J.
1939 L'image de notre corps. Paris: Nouvelle Revue Critique.
Meltzoff, Andrew
1993 Molyneux's babies: Cross-modal perception, imitation, and the mind of the preverbal infant. In: N. Eilan, R. McCarthy, and B. Brewer (eds.), Spatial Representation: Problems in Philosophy and Psychology, 219-235. Oxford: Basil Blackwell.
Meltzoff, Andrew and M. Keith Moore
1977 Imitation of facial and manual gestures by human neonates. Science 198: 75-78.
1983 Newborn infants imitate adult facial gestures. Child Development 54: 702-709.
1994 Imitation, memory and the representation of persons. Infant Behavior and Development 17: 83-99.
1997 Explaining facial imitation: A theoretical model. Early Development and Parenting 6: 179-192.
Merleau-Ponty, Maurice
1945 Phenomenologie de la perception. Paris: Gallimard; English translation: Colin Smith, Phenomenology of Perception. London: Routledge and Kegan Paul, 1962.
1956-60 La nature: Notes, cours du College de France. Paris: Editions de Seuil, 1995. La Nature: Course notes from the College de France. Trans. Robert Vallier. Evanston: Northwestern University Press, 2003.
1960 Les relations avec autrui chez l'enfant. Paris: Cours de Sorbonne; English translation: William Cobb, The child's relations with others. In: James M. Edie (ed.), The Primacy of Perception. Evanston: Northwestern University Press, 1964.
Ogden, Jenni A.
1996 Fractured Minds: A Case-Study Approach to Clinical Neuropsychology. Oxford: Oxford University Press.
Overgaard, Morten
2004 On the naturalising of phenomenology. Phenomenology and the Cognitive Sciences 3 (4): 365-379.
Piaget, Jean
1945 La formation du symbole chez l'enfant. Neuchatel-Paris: Delachaux et Niestlé; English translation: G. Gattegno and F. M. Hodgson. Play, Dreams and Imitation in Childhood. New York: Norton, 1962.
Poeck, K. and B. Orgass
1971 The concept of the body schema: A critical review and some experimental results. Cortex 7: 254-277.
Powers, P. S., R. G. Schulman, A. A. Gleghorn and M. E. Prange
1987 Perceptual and cognitive abnormalities in bulimia. American Journal of Psychiatry 144: 1456-1460.
Pribram, Karl H.
1999 Brain and the composition of conscious experience. Journal of Consciousness Studies 6 (5): 19-42.
Rizzolatti, Giacomo, L. Fadiga, M. Matelli, V. Bettinardi, E. Paulesu, D. Perani and G. Fazio
1996 Localization of grasp representations in humans by PET: 1. Observation compared with imagination. Experimental Brain Research 111: 246-252.
Roth, Wolff-Michael
this vol. Communication as situated embodied practice.
Ruby, Perrine and Jean Decety
2001 Effect of subjective perspective taking during simulation of action: a PET investigation of agency. Nature Neuroscience 4 (5): 546-550.
Scheler, Max
1948 Wesen und Formen der Sympathie. Frankfurt am Main: G. Schulte-Bumke, 1948; originally published as Zur Phänomenologie der Sympathiegefühle und von Liebe und Hass. Halle: Niemeyer, 1913. English translation: Peter Heath, The Nature of Sympathy. London: Routledge and Kegan Paul, 1954.
Simmel, Marianne L.
1958 The conditions of occurrence of phantom limbs. Proceedings of the American Philosophical Society 102: 492-500.
1962 Phantoms - experiences following amputation in childhood. Journal of Neurology, Neurosurgery and Psychiatry 25: 69-78.
1966 Developmental aspects of the body scheme. Child Development 37: 83-95.
Sonesson, Goran
this vol. From the meaning of embodiment to the embodiment of meaning. A study in phenomenological semiotics.
Varela, Francisco
1996 Neurophenomenology: A methodological remedy for the hard problem. Journal of Consciousness Studies 3 (4): 330-349.
Wallon, Henri
1925 Stades et troubles du developpement psycho-moteur et mental chez l'enfant. Paris: Alcan.
Zahavi, Dan
2004 Phenomenology and the project of naturalization. Phenomenology and the Cognitive Sciences 3 (4): 331-347.
Zlatev, Jordan
this vol. Embodiment, language and mimesis.
Section C
Body, language and culture
Embodiment, language, and mimesis
Jordan Zlatev
For years now, leading representatives of theoretical linguistics have been arguing that humans, being governed by a blind 'language instinct', can be exhaustively described in physico-biological terms. ... [T]his conception has been shown to be fundamentally false. Humans are also, and crucially, social, normative, and conscious beings, occasionally capable of acts of free will.
Esa Itkonen, What is Language?
Abstract
The present focus on embodiment in cognitive science undervalues concepts such as convention/norm, representation and consciousness. I argue that these concepts constitute essential properties of language, and this makes it problematic for "embodiment theories" to account for human language and cognition. These difficulties are illustrated by examining a particular, highly influential approach to embodied cognition, that of Lakoff and Johnson (1999), and exposing the problematic character of the notion of the "cognitive unconscious". To attempt a reconciliation between embodiment and language, I turn to the concept of (bodily) mimesis, and propose the notion of mimetic schema as a mediator between the individual human body and collective language.
Keywords: bodily mimesis, consciousness, conventions, mimetic schemas, representation.
1. Introduction
The main goal of this chapter is to investigate the relationship between language and the concept of embodiment which has become a central, if ambiguous, notion within cognitive science (e.g. Varela, Thompson and Rosch 1991; Clark 1997; Ziemke 2003), the neuroscience of consciousness (e.g. Edelman 1992; Damasio 1994, 2000), (neuro)phenomenology (e.g. Varela 1996; Thompson 2001; Thompson and Varela 2001; Gallagher 1995, 2005, this volume), cognitive linguistics1 (e.g. Lakoff 1987; Johnson 1987; Zlatev 1997; Svensson 1999; Evans 2003) and to some extent developmental psychology (e.g. MacWhinney 1999; Mandler 2004). The notion of embodiment is, indeed, even intended to unite efforts in these different fields into what is often called "second generation cognitive science" (Lindblom and Ziemke this volume) or "embodied cognition" (Johnson and Rohrer this volume). There is much to recommend in this (re)turn to the body in the study of the mind, especially since in many ways it can be seen as a justified reaction to the many shortcomings of "classical" information-processing cognitive science according to which the "mind/brain" works essentially as a computer (e.g. Fodor 1981; Jackendoff 1987; Pinker 1994).
There are, however, three major unresolved issues within the current "embodiment turn" in the sciences of the mind. The first was mentioned in passing already: there is not one but many different meanings behind the term "embodiment", both between and within fields, and the corresponding theories are in general not compatible (Ziemke 2003). In particular, I would claim, there is no uniform concept of representation within "embodied cognition", and this is a constant source of (misguided) debate, both between proponents of embodiment and between them and representatives of the "algebraic mind" (Marcus 2001).
Second, by their nature, embodiment theories have a strong individualist orientation, and despite recurrent attempts to connect embodiment to social reality and culture (e.g. Palmer 1996; Zlatev 1997; Sinha 1999), there is still no coherent synthesis. In particular, within the work of those emphasizing the role of the "body in the mind" there is no adequate notion of convention or norm, which is essential for characterizing both human culture and the human mind.
Third, there is a dangerous tendency to underestimate the role of consciousness in many - though not all - embodiment theories. There seems to be some sort of fear that in appealing to anything that is irreducible to either biology or behavior, one is bound to fall into the clutches of "Cartesian dualism". The consequence is, however, that such "non-dualistic" approaches run the risk of one form or another of physico-biological reductionism, which, as pointed out by Itkonen in the motto to this chapter, is deeply misguided.
To substantiate these claims in detail would require an extensive review of the literature, which the allotted space of a book chapter does not permit me. My strategy will therefore be to single out one of the above mentioned fields, cognitive linguistics, and even more narrowly, focus on a single exposition of "embodiment theory": Philosophy in the Flesh (PitF) by George Lakoff and Mark Johnson (1999). This choice is motivated by the following reasons: (a) Lakoff and Johnson are two of the foremost proponents of "embodied cognition" not only in (cognitive) linguistics, but in general, (b) PitF is their most recent extensive joint publication, and it is often mentioned as one of the three major reference works on embodiment up to date, along with Varela et al. (1991) and Damasio (1994), and (c) while philosophically oriented, the work deals with implications from linguistic research, and it is precisely in relation to language that the difficulties of "embodiment theory" are most clearly accentuated.2
The problem reveals itself when we ask the seemingly simple question: In what sense can (knowledge of) language be said to be "embodied"? Prior to answering this question, however, we need to step back and address, if briefly, the fundamental question: What is language? In the monograph with this title, from which the opening quotation was taken, Esa Itkonen persuasively argues that the nature of language has been commonly misunderstood in modern "theoretical linguistics" (including both the generative and the cognitive/functional paradigms). Instead of "instincts", "cognitive modules", "neural mechanisms" or "usage", Itkonen (1978, 1983, 1991, 2003) offers a very clear and intuitive answer: Language is a social institution for communicating meanings, a conception with sound roots in the tradition, e.g. Saussure (1916), Trubetzkoy (1939) and Wittgenstein (1953). As such, language exists primarily between people rather than (only) within people. It is "shared" by the members of the community who speak it - in the strong sense in which people can "share a secret": they all know it, and they know that they know it, rather than in the weak sense of "sharing a bottle of wine". But what is it that people share when they know a language? Above all: linguistically encoded concepts, i.e. lexical meanings, and rules for their combination. In Section 2 of this chapter I will elaborate on this, and argue that it is impossible to account for linguistic meaning without the concept of representation. Nearly as obviously, the conventionality of language, as well as the fact that we follow rules (which we are free to break) rather than mechanical deterministic procedures shows that our knowledge of language is (in principle) accessible to consciousness. This also implies that linguistic knowledge involves declarative, and not only procedural knowledge.3
This characterization of language in terms of conventionality, representation and accessibility to consciousness appears to be on a collision course with attempts to explain language in terms of "embodiment", since as pointed out above, it is precisely these three concepts that are at best underdeveloped, and at worst rejected by proponents of embodied cognition. In the recent work of cognitive linguists such as Johnson and Lakoff,4 and especially in PitF, this dissonance turns into an outright contradiction. In Section 3 I analyse the concept of embodiment as explicated within PitF (with some references to other Cognitive Linguistic work to show that PitF is by no means an exception), in order to make this contradiction as clear as possible. In brief: if language has the properties that I claim, and if embodiment has the properties that Lakoff and Johnson claim, then language cannot be embodied. And since language is not just a "module" of the human mind - something that Cognitive Linguistics emphasizes - but largely constitutive of it (e.g. Vygotsky 1934; Nelson 1996; Tomasello 1999), then the human mind cannot be embodied either.
1. When using small letters, i.e. cognitive linguistics, I will refer to the work of linguists who regard language and cognition as intimately connected (e.g. Itkonen, Levinson and Jackendoff). When used with capital letters, Cognitive Linguistics refers to the school of linguistics departing from the work of Lakoff, Langacker and Talmy. The borders are admittedly fuzzy, but in general, Cognitive Linguistics is a hyponym (extensionally speaking a subset) of cognitive linguistics.
2. I should point out that my own previous work on language and embodiment (Zlatev 1997) suffers from the same three drawbacks listed above, i.e. it lacks coherent concepts of representation and convention and, in addition, disregards their dependence on consciousness. My criticism of "embodiment theory" in the first part of this chapter is therefore also a form of (former-)self-criticism.
3. Mandler (2004) eloquently argues for the need to distinguish between declarative, conceptual knowledge, which is accessible to consciousness, and procedural, sensorimotor skills, which are not (cf. Ikegami and Zlatev, this volume). While language learning and use undoubtedly involve both types, it is a mistake to attempt to reduce all linguistic knowledge to procedural "know-how" as e.g. done by Zlatev (1997). Consciousness is a multifaceted phenomenon (and concept) but similarly to Mandler, in this chapter I focus on the deliberative aspect of consciousness, rather than on its qualitative, experiential aspect. It should also be noted that to state that something is accessible to consciousness is not, of course, to imply that it is accessed at any particular moment. Consciousness has a center-periphery structure, so of necessity some of the objects of consciousness will be in the "margins" (Gurwitsch 1964).
4. Though admittedly, this was less obvious in their earlier formulations, such as their rather inspiring Metaphors We Live By (Lakoff and Johnson 1980), as well as Johnson (1987).
However, the overall goal of this chapter is not to criticize the shortcomings of "embodiment theory", but to attempt to show how the concept can be developed in order to resolve the contradiction laid out in the previous paragraph. The first step is to argue in Section 4 that the PitF notion of "embodiment" is indeed not viable, and therefore a replacement is required. Then I proceed in Section 5 with an attempt if not to fill, at least to minimize the gap between language and embodiment through the concept of bodily mimesis, understood along the lines of Donald (1991, 2001) as the volitional use of the body for constructing and communicating representations. On this basis, I offer conceptual and empirical support for a novel theoretical concept, mimetic schemas, which constitute body-based, pre-linguistic, consciously accessible representations that serve as the child's first concepts (Zlatev 2005). Furthermore, mimetic schemas possess a basic intersubjectivity which can serve as the foundation for developing a conventional-normative semiotic system, i.e. language. In Section 6, I briefly outline how the concept of mimetic schemas can contribute to the (hopeful) resolution of a number of puzzles in explaining language evolution, acquisition and spontaneous gesture. Finally, I summarize the argument.
2. Language
The claim that language is primarily a social institution for communicating meanings, stated in the introduction, is customarily met with incomprehension by linguists and psychologists.5 To put the objection into the terminology of this volume: what is the "embodiment" of this institution? Part of it may be in writing systems and other artifacts (Donald 1991; Clark 1997; Sonesson this volume), but would not language cease to exist if it were not instantiated within the minds of its users, the individual speakers? Well, this can be debated since one can argue that "dead languages" are not really dead if they have been preserved in written texts and especially in a grammatical description, because that would allow them to be "recreated" by studying the texts and grammar, which is more like (collective) remembering than rediscovery. But, of course, it must be granted that language is an individual as well as a social phenomenon and none (or very few) of the social accounts of language has ever denied this. However, even as an individual psychological phenomenon, as say, knowledge of English rather than the social institution English, language can be shown to consist of conventional representations accessible to consciousness. Let me try to explicate. What do I need to know in order to understand (1), which has been uttered by, say, Peter? Minimally, I would need to know the (social) facts (2) - (7).

(1) John kissed Mary.
(2) The word kiss means KISS.
(3) The words John and Mary are names of a male and a female human being, respectively.
(4) The word order shows that John kissed Mary, rather than vice versa.
(5) The past tense signifies that the event described occurred sometime in the past relative to the time of utterance.
(6) The sentence (normally) expresses an assertion.
(7) The names John and Mary actually refer to individuals X and Y.

5. This statement may seem to contradict occasional remarks in the cognitive linguistic literature concerning the "social dimension" of language, and the frequent use of phrases such as "conventional imagery" (Langacker 1987) and "conventional metaphor" (PitF). The implications of these remarks are, however, never explored. In particular, it is never explained how it is possible that individual mental phenomena such as imagery and metaphorical "mappings" can at the same time be conventional, i.e. social.
But this is not enough to guarantee that I understand Peter. Imagine that I know (2-7), but Peter, who has had a rather idiosyncratic upbringing, thinks that kiss means HIT-ON-THE-HEAD. I will then fail to understand the meaning of (1) as meant by Peter. So I must also know that Peter knows (2-7). Furthermore, I must know, or at least assume, that Peter knows that I know (2-7). For if Peter thinks that I've had a strange upbringing, or maybe that as a foreigner I do not have a proper command of English, then he may not be using (1) in its conventional way, even though he knows (2-7). If this seems far-fetched, consider only (7), which involves not the meaning (Sinn) of the names John and Mary but their reference - or Bedeutung according to the classical distinction of Frege (1882 [1997]). Here it is easier to see that unless Peter and I can be quite sure not only that both of us know who the names refer to in this context, but that Peter knows that I know, and I know that Peter knows, there might be a misunderstanding. For instance, I am thinking of Mary Smith, and Peter is thinking of Mary Smith. But if I don't know that Peter knows that I am thinking of Mary
Smith rather than Mary Williams, then I couldn't be sure who he is really referring to by Mary in uttering (1). This type of reflexively shared knowledge is known as common knowledge (Itkonen 1978), mutual knowledge (Clark and Marshall 1981) or common ground (Clark 1996). A convenient way to say that (2-7) are part of common knowledge is to say that they are conventions (Lewis 1969; Clark 1996), norms (Itkonen 1978) or even rules (Wittgenstein 1953; Searle 1969).6 These closely related terms have rather complementary implications, so while I will predominantly use the term conventions to refer to our knowledge of facts such as (2-7), it is crucial to remember that this knowledge is normative, in the sense that one can be right or wrong according to public criteria of correctness (Wittgenstein 1953; Baker and Hacker 1984), in one's use of these conventions. This normativity can be on various levels of explicitness and scope ranging from prescriptive grammars for the "national language" to intuitions about "the way we talk in our family". However, it is always social and always involves a degree of conscious awareness, since to be following a convention/norm/rule - as opposed to the movement governed by a reflex or a blind habit - one must be able to compare it to actual usage and notice any potential mismatch. It is senseless to talk about this noticing of a difference between "should" and "is" without being aware of the difference and this implies at least a degree of consciousness. Such conscious processes of noticing and judgment are also essential for the acquisition of language by pre-verbal children (e.g. Bloom 2000) and by second-language learners (Schmidt 1990). As argued at length by Mandler (2004: 228), without consciousness, language acquisition could not get off the ground:

The ability to make an old-new distinction requires awareness of prior occurrence or pastness; its loss is one of the hallmarks of amnesia. Amnesiacs retain the ability to be influenced by past experience and to learn at least certain new skills, but they have lost the awareness that these experiences are familiar to them.
6. Unfortunately, all these terms have other (negatively charged) meanings when applied to language, thus conventional is often identified with "arbitrary". Norm has bad connotations for linguists since it is associated with "normative grammar", which prescribes rather than describes. Finally, rule is often interpreted as an explicit, algorithmic, non-creative procedure, which is just about the opposite of what e.g. Wittgenstein (1953) meant by "rule-following".
One of the things that amnesiacs cannot learn is a new language, implying that language cannot be acquired by processes of implicit learning of the type that are modeled by most connectionist models (e.g. Elman 1990), which do not require conscious awareness. Thus we can conclude that knowing and learning conventions such as (2-7) involves making them accessible to consciousness. Notice that I am not claiming that consciousness is involved in every aspect of language learning and use: it is beyond doubt that implicit learning and procedural knowledge are important as well. My claim is that consciousness is at least essential for (a) the acquisition of concepts and rules, (b) the ability to notice any "breaking" of the rules and (c) all forms of meta-linguistic knowledge. It is (b) and (c) that are the basis for all grammaticality judgments and linguistic analysis and thus for traditional or "autonomous" linguistics (Itkonen 1978, 1991). On the other hand, attempts to make linguistic theories "psychologically real" have always attempted to reconcile the analysis obtained from (b-c) with the learner's perspective in (a). While there are obvious differences in the three processes (a), (b) and (c), conscious awareness unites them, and sets them apart from the "automated" procedures that underlie reflexes and habits of the kind that govern the behavior of most animals, and which are also important for human beings.

Language conventions can concern pronunciation (phonology) or the combinations of words and phrases (morphology and syntax), but the most important conventions, and those that distinguish language from other convention/norm/rule systems such as those in dancing tango, boxing or eating at a restaurant, concern semantics and pragmatics. In all the aforementioned activities there is a "right" and a "wrong" way of doing things and that is how we know that they are conventional-normative.
But in language (and some other semiotic systems) one can be right and wrong representationally. There are two ways in which linguistic utterances like (1) can be properly regarded as representations. Both are conveniently explicated by the classical semiotic triangle (Ogden and Richards 1923), displayed using generic terms for its three relata in Figure 1. First, the relationship between Expression and Meaning, the latter considered as conventional context-general content, is that of the classical Saussurian sign, the first one corresponding to the "signifier", the second to the "signified". What 100 years of theoretical linguistics and especially functional/cognitive linguistics (Givón 2001; Lakoff 1987) have added to
this basic insight is that the relationship need not be as "arbitrary" as Saussure assumed, especially considering that grammatical constructions are also a kind of sign, and these are at least to some degree motivated by factors such as iconicity and indexicality (and are thus not classical Peircean "symbols"). This, however, does not mean that the mapping between Expression and Meaning is any less conventional (Zlatev 2003). The first five of the conventions involved in understanding (1) as an English sentence (2-6) involve linguistic signs in this sense.
Figure 1. The "semiotic triangle", after Ogden and Richards (1923), with Expression, Meaning and Reality as its three relata.
What about the relationship Meaning-Reality? First of all, age-old philosophical problems concerning the "aboutness" of language can be resolved by noting that it is not the expressions of language that relate directly to reality (this is implicit in the notion of the semiotic triangle), and not meaning in the sense of conventional content either, but rather meaning as illocutionary (speech) acts, performed by speakers and hearers by intentionally imposing illocutionary force on the propositional content of sentences. Or as expressed succinctly by Searle (1999):

Language relates to reality in virtue of meaning, but meaning is the property that turns mere utterances into illocutionary acts. (ibid: 139) [...] The conventional intentionality of the words and the sentences of a language can be used by a speaker to perform a speech act. When a speaker performs a speech act, he imposes his intentionality on those symbols. (ibid: 141)
There are three important aspects of this process in relation to our discussion of the nature of language that need to be emphasized. First, the "imposition of intentionality" on the part of the speaker (and its interpretation by the hearer) is clearly dependent on conscious awareness - unless the speaker is talking in his sleep and thus speaking "non-intentionally", in both the everyday and the philosophical sense of the word. Second, at least in the case of assertives, including speech acts such as statements, descriptions and classifications, which have what Searle calls a "mind-to-world direction of fit", we have a fairly clear representational relation between Meaning and Reality: the speech acts are "pictures of reality" that can be either true or false. This is not representation in the Saussurian sense but rather in the sense of the Tractatus (Wittgenstein 1923 [1961]), with the provision that it is utterances spoken by speakers that are true or false, not sentences - as famously emphasized by Strawson (1950) in his critique of Russell (1905). It is this representational relationship that is denied by pragmatism, and by many representatives of cognitive linguistics (Lakoff and Johnson 1999; Johnson and Lakoff 2002; Johnson and Rohrer this volume). But such objections seem to be beside the point, since they concern the metaphysics of an "objective reality" and the epistemology of "objective truth", where both senses of "objective" are understood as mind-independent. However, all that is necessary in order to regard the relationship between a statement and a state-of-affairs (SoA) as a representation, is for: (a) the first to be about that SoA, rather than just in association with it, (b) the speaker of the statement to be aware of (a), and (c) the statement/representation to be capable of either matching or not matching the SoA. Nothing in (a-c) requires either the SoA or the matching with the statement to be "mind-independent".
These conditions are fulfilled in Lakoff and Johnson's definition of "embodied truth" (1999: 106), so even in their account the meaning of a (true) sentence can be regarded as a (matching) representation of a situation. Even if the representational relation between linguistic meaning and reality-as-conceived is to be rejected, for whatever reasons, there is still the Saussurian representational or "symbolic" relationship between "the phonological" and "the semantic pole" (Langacker 1987), i.e. expression and content. In short, representation is simply inescapable in accounting for language (Sinha 1988, 2005). Finally, we should note that the "imposition of intentionality" mentioned by Searle in the previous quote is not a private, speaker-internal matter, but is constrained firstly by the conventional meaning of the expression(s). This is what makes it difficult (though perhaps not impossible) to express your love by saying I'll kick you. The second constraint is a more situation-specific and dynamic sort of intersubjectivity, exemplified by the need to have a "common ground" for figuring out the referent of the names John and Mary in (7). In order to successfully refer, you need to formulate your speech act in a way that will make the referent intersubjectively "shared" for you and your hearer, and this requires a fairly keen sensitivity to the norms of the language, to the situation and to your interlocutor's state of mind. All this is unthinkable without consciousness, as also pointed out by Donald (2001), and takes quite some time and effort to be mastered by children.

To sum up, the discussion in this section has pointed out the following features that can be regarded as definitional of human language: conventionality, implying normativity; representationality: between expression and content and between an assertive speech act and reality; accessibility to consciousness: necessary for the establishment of common knowledge and for the management of successful communicative action.7

A characteristic feature of language that has not been discussed is one that is perhaps most often mentioned in discussions of the "uniqueness of language" with respect to other human and animal systems of communication - to the extent of forgetting those listed above - namely, the systematicity of language (Saussure 1916; Deacon 1997). It is true that this is an essential feature of language, and something that for example distinguishes language from gesture (McNeill 1992; Senghas, Kita and Özyürek 2004). It should be pointed out, however, that this concerns not the "syntax" of language alone, but its general capacity to express an unlimited number of meanings, both in the sense of content and speech acts.
Finally, while the primary function of language is social interaction, once internalized, it becomes a representational vehicle of thought, transforming the cognition of its user (Nelson 1996; Tomasello 1999). Therefore, a suitable concise definition of language would be: A consciously supervised, conventional representational system for communicative action and thought. This is admittedly terse and different from what one usually finds in linguistics textbooks, but it is no more than the compact summary of the explication provided in this section. If this explication
7. Though, to repeat, reflective consciousness need not be involved in every aspect of learning, producing and understanding language.
has been clear enough, then its relative non-orthodoxy is no reason for it not to be accepted.
3. Embodiment
Let us now turn to see how embodiment is defined within Cognitive Linguistics, focusing on the recent work of Lakoff and Johnson, and above all on PitF. Somewhat surprisingly, there is no straightforward definition of "embodiment" to be found in a 624-page book with the subtitle The Embodied Mind and its Challenge to Western Thought, the closest approximation being: "...there are at least three levels to what we are calling the embodiment of concepts: the neural level, phenomenological conscious experience and the cognitive unconscious" (PitF: 102). What are these ("at least") three levels? Starting from the bottom, we are told that "neural embodiment concerns structures that characterize concepts and cognitive operations at the neural level" (PitF: 102). It is furthermore claimed that this level "significantly determines [...] what concepts can be and what language can be" (PitF: 104). One of the most specific definitions of "an embodied concept" is provided in terms of this level only: "An embodied concept is a neural structure that is part of, or makes use of the sensorimotor system of our brains. Much of conceptual inference is, therefore, sensorimotor inference" (PitF: 20, original emphasis). However, Lakoff and Johnson make it clear that they will not deal with the nitty-gritty of neurobiology like "ion channels and glial cells" (PitF: 103) since the neural level refers to a higher-level generalization that is heavily dependent on "an important metaphor to conceptualize neural structure in electronic terms" (PitF: 103). Thus, the connectionist model of Regier (1996) is given as an instance of "neural modeling", even though it is quite removed from what is known about the brain (and even though Regier does not apply the adjective "neural" to the model himself and repeatedly points out that his model is only inspired by some aspects of neural systems). The next level, "phenomenological embodiment", receives much less attention. Its first definition is "[...]
the way we schematize our own bodies and things we interact with daily" (PitF: 36), with reference to the phenomenological tradition and specifically the work on the body schema and the body image of Gallagher (1995). The second definition is considerably broader: "It [i.e. phenomenological embodiment] consists of everything we
can be aware of, especially our own mental states, our bodies, our environment and our physical and social interactions. This is the level at which we speak of the "feel" of experience [...]" (PitF: 103). What the authors do not make clear is whether all conscious experience should be considered as "phenomenological embodiment", and if so, why this is the case. At the same time, they point out that "phenomenology also hypothesizes nonconscious structures that underlie and make possible the structure of our conscious experience" (PitF: 103). This heralds the arrival of the main hero of Lakoff and Johnson's account of embodiment: the "cognitive unconscious."

The cognitive unconscious is the massive portion of the iceberg that lies below the surface, below the visible tip that is consciousness. It consists of all those mental operations that structure and make possible all conscious experience, including the understanding and use of language. (PitF: 103)
This level is said to be "the realm of thought that is completely and irrevocably inaccessible to direct conscious introspection" (PitF: 12) and (nearly) all-pervasive: the cognitive unconscious constitutes "the 95 percent below the surface of conscious awareness [that] shapes and structures all conscious thought" (PitF: 13). In case the reader should wonder how this all-important level (of embodiment) that is "completely and irrevocably inaccessible" was discovered, Lakoff and Johnson point out that it is "hypothesized on the basis of convergent evidence, [...] required for scientific explanation" (PitF: 115) and that "the detailed processes and structures of the cognitive unconscious (e.g., basic-level categories, prototypes, image schemas, nouns, verbs, and vowels) are hypothesized to make sense of conscious behavior" (PitF: 104). So it turns out that this all-important level of embodiment is a hypothetical theoretical construct. It is clear that Lakoff and Johnson feel pressed to defend the "reality" of this construct and they attempt to do so repeatedly. Perhaps the most revealing statement is "To say that the cognitive unconscious is real is very much like saying that neural "computation" is real" (PitF: 104). But is neural computation "real"? We will return to this in the next section. What can one say of Lakoff and Johnson's notion of embodiment? It is obviously in contradiction with the account of language presented in Section 2. Not only does PitF imply that "95 percent of all thought" and consequently of language is completely below the level of conscious awareness, Lakoff and Johnson's definition of "embodiment" has no real place for the two central concepts of conventionality and representation. Regarding the first, there are frequent references to "conventional mental
imagery" (PitF: 45), but it is not even made clear whether this imagery is conscious or only part of the "cognitive unconscious" - not to mention the question of how this imagery would be shared, and furthermore known to be shared, which is necessary for it to be conventional. One could say the same for the use of the term "conventional metaphor" in the cognitive linguistic literature - there is nothing "conventional" about neurally realized domain-to-domain mappings, at least in any conventional use of the term convention (e.g. Lewis 1969, see footnote 5). When Lakoff and Johnson feel pressed to account for shared meanings, they do point out that "commonalities [...] exist in the way our minds are embodied" (PitF: 4) and that "we all have pretty much the same embodied basic-level and spatial-relations concepts" (PitF: 107). But this is clearly not enough to give you conventions such as those of (2-7) and to account for how a simple English sentence such as (1) is understood.

Concerning the concept of representation, Lakoff and Johnson represent quite clearly the anti-representationalist Zeitgeist within "second generation" cognitive science (e.g. Varela et al. 1991), which as pointed out in the introduction eschews the concept of representation in reaction to its overuse in "classical" cognitive science (e.g. Fodor 1981). In a recent (polemical) publication of the two authors this is made explicit:

As we said in Philosophy in the Flesh, the only workable theory of representations is one in which a representation is a flexible pattern of organism-environment interactions, and not some inner mental entity that somehow gets hooked up with parts of the external world by a strange relation called 'reference'. We reject such classical notions of representation, along with the views of meaning and reference that are built on them. Representation is a term that we try carefully to avoid. (Johnson and Lakoff 2002: 249-250)
A similar if not stronger form of anti-representationalism is advanced by Johnson and Rohrer (this volume: Section 6):

We have been arguing against disembodied views of mind, concepts, and reasoning, especially as they underlie Representationalist theories of mind and language. Our alternative view - that cognition is embodied - has roots in American Pragmatist philosophy and is being supported and extended by recent work in second-generation cognitive science.
In their urge to dissociate themselves from any "disembodied views of mind", scholars like Lakoff, Johnson and Rohrer, as well as many other representatives of second-generation cognitive science (e.g. Brooks 1999) can be said to overkill (mental) representations. It is one thing to (justly)
argue against "representations" in perception and active involvement, as done by Dreyfus (1972 [1993]) with support from the phenomenological tradition (e.g. Merleau-Ponty 1945 [1962]), and quite another to deny that, say, a picture is a representation of whatever it depicts, irrespective of whether the latter exists in the "real world" or not (Sonesson 1989, this volume). It is in this latter sense that some, though not all, language use is representational. Furthermore, to deny that assertions are a kind of representation is to deny that a description of a situation can be either true or false. As pointed out in Section 2, Lakoff and Johnson should not really deny this since in their definition of "embodied truth" a person holding a sentence to be "true" is said to understand the sentence to "accord" with "what he or she understands the situation to be" (PitF: 106). This is clearly a roundabout way of saying that the person understands the sentence to represent the situation correctly. But what is won from such avoidance of the notion? There is nothing "strange" or "metaphysical" in the concepts of representation and reference once it is understood that these are performed by conscious speakers (and signers), not by the expressions in the language themselves. To restrain oneself from using these concepts in accounting for language is to make it impossible to account for the difference between language and perception, or between theatre and lovemaking. (Though admittedly, the latter may be more fun.)8

In this section I have tried to make it as clear as possible that there is a contradiction between the account of language presented in Section 2 and the account of embodiment given by Lakoff and Johnson in PitF, which I have suggested is not atypical for much of "embodied cognition" or "second generation cognitive science".
If my account of language and Lakoff and Johnson's account of embodiment are both accepted, then it follows that "embodiment theory" cannot account for language, and since language is a central part of the human psyche, it cannot account for the latter either.
8. Rather more troublesome is the fact that in a pragmatist evolutionary theory insisting on the "continuity" of all cognition such as that of Johnson and Rohrer (this volume) there is no place for a qualitative distinction between the cognition of human beings and ants ... Compare: "According to our interactionist view, maps and other structures of organism-environment co-ordination are prime examples of non-representational structures of meaning, understanding, and thought." (ibid: Section 3.3) with "Ant cognition is thus nonrepresentational in that it is both intrinsically social and situated in organism-environment interactions." (ibid: Section 5)
This negative conclusion can be avoided in one of two ways: Lakoff and Johnson (and their colleagues) would presumably argue that I have misconstrued language. The alternative, which (unsurprisingly) I undertake in the following section, is to argue that the concept of embodiment presented in PitF is inadequate, as a preliminary to suggesting how the concept of bodily mimesis can contribute to a more adequate notion of human embodied cognition that naturally combines with the three essential features of language: convention, representation and accessibility to consciousness.
4. Embodiment lost? A critique of Lakoff and Johnson (1999)
Let us begin with Lakoff and Johnson's first level of embodiment: the "neural level". An obvious question to ask is why the exclusive focus on the brain (and the rest of the nervous system) at the expense of the whole living body? One reason seems to be that the activity of the brain could possibly be understood "computationally" - using the "neural computation" metaphor - while that of the whole bio-chemistry of the body cannot, in any remotely meaningful way. Another reason seems to be that the "non-neural" parts of the body are not considered relevant for the "shaping" of cognition. It seems to be that for Lakoff and Johnson "brain and body are used as substantially interchangeable" (Violi 2003: 205). Leaving phenomenological aspects aside for the time being, this is still deeply problematic. Is a brain-in-a-vat just as embodied as a living body? There are at least two good, more or less obvious, reasons to doubt this. First, all sensorimotor interactions with the environment are performed by using our limbs, muscles, eyes, ears, nose, skin, tongue etc. - not with the somatosensory cortex itself. Or do Lakoff and Johnson hold that these peripheral systems are merely "transducers" and could equally well be substituted by artificial correlates managing the input-output of electrical signals to the brain? Whatever the tenability of this position, it is clearly a very "non-embodied" way to think of cognition, and, for that matter, of the brain itself (see Lindblom and Ziemke this volume). The second reason is that the living body participates not only in interaction with the environment, but in evaluation of it - at least according to somatic theories of emotion such as that of Damasio (1994, 2000). According to Damasio certain regions of the brain constantly monitor the state of the whole body, and depending on its "wellbeing" judge external stimuli (though as we all know, people have found
many ways to trick these monitoring systems over the ages, allowing them to "feel good" while their body is not thriving). If this is still somewhat speculative, let me simply remind the reader of an aspect of our non-neural bodies that has a strong effect on our emotional life, and thereby on our thinking: the hormonal system. What all this points to is that even when regarding the body from an external, "third-person" perspective, it is a gross simplification to consider only the nervous system as relevant for cognition. The living body as a whole is relevant, and the kind of embodiment this involves could be called simply "biological" or perhaps "organismic" (Ziemke 2003; Zlatev 2003). Turning now to the "phenomenological level" of Lakoff and Johnson's three-level notion of embodiment (in PitF), we can notice the opposite tendency: if there was an under-extension of the role of the body when regarding embodiment as a biological phenomenon, there appears now to be an over-extension by equating bodily awareness with all conscious experience, i.e. "everything we can be aware of" (PitF: 103). While it is clear that phenomenal bodily experience is involved in physical interactions, either with the inanimate environment or in physical social interactions such as chasing, wrestling, love-making... it is far from obvious what role the body schema, or even the body image (Gallagher 1995, 2005, this volume) play in more detached social interactions, such as tax payment, even though I am presumably conscious when I fill in my tax-return forms. Lakoff and Johnson never address this problem, which is unsurprising since consciousness is on the whole treated by the authors in a rather step-motherly fashion: tolerated out of necessity but neglected. It is characteristic that others who have given a more prominent role to consciousness or "subjectivity" in linguistics and cognitive science do not view it primarily in terms of embodiment.
Thus Talmy (2000) writes "Meaning is located in conscious experience. In the case of subjective data, 'going' to their location consists in introspection. [...] Consciousness is thus often a necessary concomitant at the subject end within cognitive science" (ibid: 5-6). It is not obvious that the (phenomenal) body plays any important role in such introspection. Similarly, in discussing the notion of perspectivity in language, treated as a form of embodiment by MacWhinney (1999), Violi (2003: 218) writes that "both the perspective a given grammatical construction imposes on the action, and the perspective connected to interpersonal and social frames, refer to subjectivity more than embodiment". Notice that I am not saying that this latter claim is necessarily true - it could turn out on closer inspection that the phenomenal body is
implicated in all kinds of social interaction and even in linguistic perspective-taking. One of the goals of the analysis presented in the next section is precisely to suggest a greater role for phenomenal embodiment for language and cognition. But the elucidation of the role of embodiment for subjectivity and experience is an enormous task, begun by the classical phenomenologists like Husserl and Merleau-Ponty (cf. Gallagher this volume), and continued more empirically by (neuro)phenomenologists such as Varela (1996), Thompson and Varela (2001) and Gallagher (1995, 2005, this volume), semioticians (Violi 2001; Sonesson this volume), cognitive scientists (Donald 1991), etc. One cannot simply call consciousness "phenomenological embodiment" and leave it at that. However, the major problem with the PitF approach to embodiment is neither of the above two levels - the "neural" and the "phenomenological" - but the third, and as shown earlier, crucial, element in Lakoff and Johnson's theory: the "cognitive unconscious". In the remainder of the section I will argue that this notion is conceptually incoherent and rather than being amended should be simply disposed of. First, the notion conflates two very different kinds of entities. On the one hand are structures such as "domain-to-domain mappings", "neural computations" and "image schemas" which are hypothesized to operate with an unconscious causality that one can become as aware of as, say, synaptic growth or the operation of the immune system, that is, not at all. On the other hand Lakoff and Johnson mention "nouns, verbs and vowels" (PitF: 104), i.e. categories which (nearly) all linguists analyzing all human languages recognize, by applying standard practices of conscious linguistic analysis.
Since these analyses are not based on generalizations from speakers' "behavior", despite occasional claims to the contrary, but on the basis of linguistic intuitions (of correctness), it becomes clear that even "naive" speakers have consciously accessible knowledge of these categories of their language. Thus the denizens of the Cognitive Unconscious are of two different ontological kinds: the first, to repeat, are hypothetical causal mechanisms, while the second are explications of linguistic knowledge that are consciously accessible. As expressed by Itkonen (1978) in a different, but analogous, context:
[W]e have here a confusion between the following two types of entities: on the one hand, the concept of 'correct sentence of a language L', which is the object of conscious knowledge; on the other, utterances of language L, which are manifestations of unconscious 'knowledge'. In the former case
'knowledge' equals consciousness, while in the latter, 'knowledge' is a hypothetical dispositional concept. (Itkonen 1978: 82)
A second objection is methodological: what is the status of the evidence for postulating the various structures of the Cognitive Unconscious? Lakoff and Johnson often refer to "converging evidence", but does this evidence really converge? On inspection it turns out to be very heterogeneous. On the one hand there is intuition and introspection, resulting in e.g. analyses of semantic polysemy as "radial categories" (Lakoff 1987) or Talmy's (2000) grammatical and semantic analyses which are acknowledged to be phenomenological (see above). On the other hand there is psycholinguistic experimentation involving unconscious mechanisms such as "semantic priming" (Cuyckens, Sandra and Rice 1997; Tufvesson, Zlatev and van de Weijer 2004) as well as neurolinguistic studies getting even closer to the actual causality of the brain processes (e.g. Rohrer 2001; de Lafuente and Romo 2004). Methodological pluralism is to be applauded, but the task of combining evidence from disparate sources into a coherent framework is formidable, and is not made easier by postulating levels that are inaccessible to both introspection and empirical observation such as the Cognitive Unconscious. In contrast, the framework of "levels of investigation" proposed by Rohrer (this volume) suggests how different kinds of evidence can be brought together in a non-reductionist manner, without any "cognitive unconscious". The third objection is more general (and philosophical). It involves not just the Cognitive Unconscious postulated by Lakoff and Johnson and the methodological self-understanding of Cognitive Linguistics, but all forms of "information processing" psychology and cognitive science that postulate the existence of mental phenomena which are completely divorced from and inaccessible to consciousness. The problem is the following: without consciousness, there is no basis for distinguishing mental from non-mental states within an organism.
As pointed out by Searle (1992: 154): "not every state of an agent is a mental state, and not even every state of the brain that functions essentially in the production of mental phenomena is itself a mental phenomenon". Searle's favorite examples are myelination and the vestibulo-ocular reflex (VOR): both are important for cognition, but in what sense can they be said to be mental? And if they are, then anything neural is mental. But in this case we have abolished the distinction mental vs. neural. Now that may be something that "identity theorists" (e.g. Armstrong 1968) and "eliminativists" (e.g. Churchland 1992) in the philosophy
of mind would applaud. However, all such proposals have so far run aground, and the "mind-body problem" remains unsolved (Maslin 2001).9 Within information-processing, "classical" cognitive science a common way to make the distinction between mental and non-mental without recourse to consciousness is through the notion of computation: mental processes are involved in (symbolic) computation, non-mental ones are not (e.g. Jackendoff 1987; Pinker 1994; Marcus 2001). Despite their overall rhetorical debate with and opposition to information processing theorists, through their endorsement of "neural computation" Lakoff and Johnson come surprisingly close to the position of their opponents. Unfortunately the "computational" solution to the mental/non-mental distinction does not work for a very simple reason: there is no intrinsic computation going on in the brain, as argued at length by e.g. Searle (2002). All talk of neural computation is metaphorical, in the sense that it is a matter of attribution from the outside, just as in, say, computational interpretations of the weather processes or of water flow. And because of that, the "computational level" is not ontologically or causally distinct from the neural level: "Except in cases where an agent is actually intentionally carrying out a computation, the computational description does not identify a separate causal level distinct from the physical structure of the organism" (Searle 2002: 126). It is only a matter of "level of description", which is something completely different: a matter of epistemology rather than ontology. A possible objection to defining the mental (or the "cognitive") through consciousness, and thereby denying the coherence of the notion of the Cognitive Unconscious, is the existence of unconscious mental states, either of the obvious kind, including our beliefs when we sleep or otherwise do not think about them, or of the less obvious kind due to "repression" according to Freud (1949).
The claim would be that when not conscious, unconscious mental states have some intermediate state of existence - not neural, not conscious - and when this intermediary realm is granted, then why can't it be populated by all sorts of mental phenomena, some of which could never be accessible to consciousness? However, this possibility is rejected by what Searle calls the connection principle: "all unconscious intentional
9. Furthermore, Lakoff and Johnson (1999, Chapter 7) claim to be neither identity theorists (reductionists) nor eliminativists with respect to consciousness, so they would need a principled means to distinguish conscious experience from its neural/biological underpinnings.
states are in principle accessible to consciousness" (Searle 1992: 156). In a nutshell, the argument is the following:
(1) All intentional states have aspectual shape: whatever they are about is seen from a certain perspective rather than another, so that extensionally identical entities such as "the Evening Star" and "the Morning Star" (cf. Frege 1882) have different aspectual shapes.
(2) Aspectual shape cannot be exhaustively characterized in third-person predicates, either as brain states or as behaviors. This finds support in Quine's (1960) thesis of the indeterminacy of translation.
(3) When unconscious, mental states exist as neurophysiological phenomena, rather than in a mental space that is kept outside the purview of consciousness.
On this basis one can draw the conclusion: "The notion of an unconscious intentional state is the notion of a state that is a possible conscious thought or experience" (Searle 1992: 159). There has been extensive discussion of this argument in the recent philosophical literature into which I will not go (cf. Garrett 1995). But suffice it to say that while one can discuss any of the three premises above in some detail, Searle offers a coherent way to think about unconscious mental states without postulating a "cognitive unconscious". Since the concept is problematic both ontologically and methodologically, as suggested earlier, this places a heavy burden on those who appeal to "unconscious mental processing" that is different from both neuro-physiological processes and conscious thought to convince us of the reality of their claims. Lakoff and Johnson are aware of the difficulty, and spend some three pages arguing for the "causal efficacy" of their construct. However, this defense is far from convincing. Rather, it displays the unconventional ways in which crucial theoretical concepts are used in their work. First, it is claimed that an unconscious "basic-level concept like chair is both intentional and representational" (PitF: 116).
Undoubtedly, but in what way is it unconscious? If chair is not the concept of a conscious subject, then who is it that applies the concept to whatever it is about? Intentional states are not self-interpreting so there must be an unconscious "homunculus" doing the job, in whose mind there must be yet another etc. Similarly for the claim that there are unconscious representations - if there is no ability for misrepresentation, error, we cannot speak of representation in any nonvacuous way. But when there is error, if not earlier, the discrepancy will be noticed, i.e. brought into consciousness. Notice that I am not stating that representations need to represent "objective reality" and thus I am not
committing the sin of "objectivism" that is so much abhorred within Cognitive Linguistics (Lakoff 1987) - what is essential however is that there are criteria for judging the adequacy of the representation, and at least in the case of language, these need to be public, as shown by Wittgenstein and pointed out earlier. So to summarize, Lakoff and Johnson's crucial notion of "the cognitive unconscious" faces a dilemma: If it is a generalization of neuronal activity, it is clearly causally efficacious, but then it is not separate from "neural" or rather "biological" embodiment. On the other hand, if it consists of intentional, representational phenomena such as concepts, nouns and vowels, then each one of these is (potentially) conscious, and therefore "phenomenological." In both cases the Cognitive Unconscious is redundant. Furthermore, since the role of the phenomenal body for cognition and especially for language is still unclear, we are left with the provisional conclusion that language/mind may not be embodied in any interesting, non-trivial way, i.e. apart from saying that they are "realized in" or "supported by" living matter.
5. Bodily mimesis
There is, I would argue, another and more productive way of linking the concept of embodiment to language: one that is based on the concept of bodily mimesis, understood as the use of the body for representational means (Donald 1991, 2001; Zlatev 2002, 2003). Unlike reductionist approaches such as that of Lakoff and Johnson (1999), and unlike "memetics" (e.g. Blackmore 1999), which sounds similar but is very dissimilar in content, mimesis has by definition two of the three crucial features of language: representationality and accessibility to consciousness. This is already obvious in the most concise definition provided by Donald (1991: 168): "Mimetic skills or mimesis rests on the ability to produce conscious, self-initiated, representational acts that are intentional but not linguistic." In this section I will first introduce the notion as done by Donald in the context of cognitive evolution, and elaborate it somewhat. Then I will relate it to a very similar concept from developmental psychology: Piaget's (1945 [1962]) notion of a symbol which plays a crucial role in mediating between the sensorimotor cognition of the infant, and the language-based cognition of the verbal child and adult. On this theoretical basis, I will introduce a relatively novel concept, the mimetic schema (Zlatev 2005),
and show how it can help resolve the apparent contradiction between embodiment and language that I have argued for so far.
5.1. Mimesis in hominid evolution
In Donald's (1991) highly original theory of human origins, early hominids - most likely belonging to the species Homo ergaster/erectus, considering the relative jump in brain size and material culture in the hominid line around 2 million years ago - evolved a new form of cognition based on mimesis.10 This allowed our ancestors to use their bodies to perform elaborate actions that others are observed to be doing (imitation), to represent external events for the purpose of communication or thought (pantomime, gesture) and to rehearse a given skill by matching performance to an imagined goal. These are all capabilities which distinguished hominins from the common ape-human ancestor, but which precede language and are thus not dependent on it. This hypothesis is similar to so-called "gesture theories" of language origins (Stokoe 2001; Corballis 2002). However, it also differs from them, since mimesis lacks at least two properties of language (or even "protolanguage") - full conventionality and systematicity, which are likely to have appeared when vocal calls became recruited for the purpose of disambiguating gestures (Arbib 2003).11 Thus, mimesis can be seen as serving as a "missing link" in human evolution. Furthermore it has been suggested that mimesis can play a similar role in human ontogenetic development (Nelson 1996; Zlatev 2001, 2003). In order to make the concept more precise and to distinguish it from other evolutionary and developmental theories which also emphasize the role of imitation such as that of Tomasello (1999), the following (re)definition can be given, also adding the adjective
10. Donald's theory is based on evidence from paleontology, archeology, neurobiology and cognitive psychology, that I will not have the space to present, but Zlatev (2002) and Zlatev, Persson and Gärdenfors (2005) offer a brief exposition of this and other empirical support for the mimetic hypothesis of human origins.
11. The difference between mimesis and a gestural (proto)language makes mimesis a more likely stepping stone to speech, since if language first emerged in the manual modality, it is difficult to explain why we do not all use sign languages today, i.e. what would force language evolution out of the manual-brachial track.
"bodily" in order to distinguish bodily mimesis from the broader concept of mimesis with Aristotelean roots (cf. Zlatev, Persson and Gärdenfors 2005).
(Def) Bodily mimesis: A particular act of cognition or communication is an act of bodily mimesis if and only if:
(1) It involves a cross-modal mapping between proprioception and some other modality. (Cross-modality)
(2) It consists of a bodily motion that is, or can be, under conscious control. (Volition)
(3) The body (part) and its motion are differentiated from and understood to correspond (either iconically or indexically) to some action, object or event. (Representation)
(4) The subject intends the act to stand for some action, object or event for an addressee. (Communicative sign function)
But it is not an act of bodily mimesis if:
(5) The act is fully conventional (i.e. a part of mutual knowledge) and breaks up (semi)compositionally into meaningful sub-acts that systematically relate to other similar acts. (Symbolicity)
Properties 1 to 5 are assumed to appear in this order in evolution, and logically build on one another. Thus they form an implicational hierarchy: 1 < 2 < 3 < 4 < 5. (If one has higher-level properties, one must have lower-level ones, but not vice versa.) Bodily acts that lack either property 2 or 3 (or both), e.g. crying, are according to the definition not mimetic. On the other hand, signed language possesses property 5 and is excluded as well. However, not all forms of mimesis need fulfill property 4: e.g. pantomime does, but imitation does not. On this basis we can distinguish between two forms of bodily mimesis: triadic mimesis, which fulfills properties 1-4 in the definition, and dyadic mimesis, where 4 is missing. Given the implicational hierarchy, it follows that dyadic mimesis is simpler than triadic mimesis and should precede it in evolution, and possibly also in ontogeny.
Indeed, it is by now clear that all great apes (orangutan, gorilla, chimpanzee and bonobo) have the capacity for dyadic mimesis, as shown in e.g. mirror self-recognition (Gallup 1982) and imitation of arbitrary gestures (Custance, Whiten and Bard 1995), though in less developed form than human beings. What is especially difficult for apes, though, is the understanding that representations can be used communicatively, i.e. by the sender and receiver sharing the
x-v
(expression-content) mapping. Language-taught apes achieve this with some effort (e.g. Patterson 1980), but there is no clear evidence that it appears spontaneously, the most convincing case being certain "iconic gestures" involving sexual and play invitations in captive apes (Tanner and Byrne 1999). Thus, what distinguishes my reformulation, and corresponding theory, mostly from that of Donald (1991) is that I hypothesize that it is triadic mimesis that crucially separated Homo erectus/ergaster from the common ancestor, allowing a leap in cultural evolution. Triadic imitation implies the understanding of communicative intentions, and in this way my proposal is similar to Tomasello's (1999) suggestion that it is the understanding of others as intentional agents that distinguishes human beings from apes. However, it differs in emphasizing communicative intentions, and indeed, recent evidence has granted support to this position, since apes have been shown to understand that others have "psychological states" such as goals, at least in competitive, non-communicative contexts (Tomasello, Call and Hare 2003). On the other hand, understanding a gesture as corresponding to something presumably came naturally to our predecessors, as suggested in the following scenario:
Early humans' eyes and brains would naturally have seen that their hands and their movements pointed directly to other things or reminded them of other things by looking like them. [...] Take for instance a gesture meant by its maker and understood by its watcher to represent "the animal went up the tree". The hand would point at the animal that both individuals had seen and move upward as it pointed to the tree. What the brain would have done - a million or two years ago as now - is interpret the hand's pointing first to mean "that animal" and then to mean "that tree", all the time while interpreting the hand and arm movement as "climbing". (Stokoe 2001: 12)
While Stokoe is probably over-interpreting the degree of differentiation of early representational gesture (in line with his "original gestural language" hypothesis), the quote captures the essence of triadic mimesis quite accurately. Thus, my evolutionary hypothesis proposes that the common ape-human ancestor had the basic potential for dyadic mimesis. It was further selected for as a consequence of living in larger social groups and of bipedalism, which furthermore provided the niche for the communicative use of bodily representations, i.e. triadic mimesis. Ontogenetic development, as shown in the following subsection, can offer some corroborating evidence for this scenario.
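The definition of bodily mimesis and the dyadic/triadic distinction drawn in this subsection can be restated as a small decision procedure. The following Python fragment is only an illustrative sketch of that logic, not part of the theory itself: the names `Act`, `respects_hierarchy` and `classify`, the output labels, and the property values assigned to the four example acts are all assumptions made for the illustration (in particular, that crying involves cross-modality but lacks volition and representation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Act:
    """A bodily act described by the five properties of the definition.

    The field names are hypothetical labels for properties (1)-(5).
    """
    cross_modality: bool   # (1) proprioception mapped to another modality
    volition: bool         # (2) motion is, or can be, consciously controlled
    representation: bool   # (3) motion differentiated from what it stands for
    communicative: bool    # (4) intended to stand for something for an addressee
    symbolicity: bool      # (5) fully conventional and systematic

    def flags(self):
        return (self.cross_modality, self.volition, self.representation,
                self.communicative, self.symbolicity)


def respects_hierarchy(act):
    """Check 1 < 2 < 3 < 4 < 5: a higher property implies every lower one."""
    f = act.flags()
    return all(lower or not higher for lower, higher in zip(f, f[1:]))


def classify(act):
    """Return an (assumed) label for the act according to the definition."""
    if not respects_hierarchy(act):
        raise ValueError("higher-level property without the lower-level ones")
    if act.symbolicity:
        return "symbolic (not mimetic)"
    if act.communicative:
        return "triadic mimesis"
    if act.representation:
        return "dyadic mimesis"
    return "not mimetic"


# Example acts; the property assignments are assumptions for illustration.
crying = Act(True, False, False, False, False)
imitation = Act(True, True, True, False, False)
pantomime = Act(True, True, True, True, False)
signed_language = Act(True, True, True, True, True)

for name, act in [("crying", crying), ("imitation", imitation),
                  ("pantomime", pantomime), ("signed language", signed_language)]:
    print(name, "->", classify(act))
```

Because the properties are treated as an implicational hierarchy, the classification only needs to find the highest property an act possesses; a combination such as volition without cross-modality is rejected as inconsistent with the definition rather than classified.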
5.2. Piaget's epigenetic theory and mimetic schemas
Epigenesis, the co-determination of ontogenetic development by genes and environment leading to a spiral of morphologies, with "lower" states serving as preconditions for "higher" ones, is nowadays nearly unanimously accepted in biology (see Badcock 2000; Zlatev 2003). What makes epigenesis even more central for human development is the fact that the human infant is born in a highly immature state compared to other mammals. Furthermore the human environment is so culturally rich that "culture" impinges on "nature" to such a degree that it becomes nearly impossible to distinguish between the two (Tomasello 1999). The developmental theory of Piaget (1945, 1953, 1954) is epigenetic in this sense and it can be shown that Piaget presupposed a role for bodily mimesis in ontogenetic development that is analogous to the one envisioned for evolution above, though this seems to have remained hidden due to terminological differences. Piaget distinguishes between three different kinds of cognitive structures: sensorimotor schemas, symbols and signs, emerging in development in this order. Of the three, the first is best known in the literature, in particular in relation to theories of embodiment. In previous work (Zlatev 1997) I suggested that sensorimotor schemas, which are goal-directed structures of practical activity, can provide the "grounding" of language in experience, thus making them analogous to the "image-schemas" proposed by many cognitive linguists (Johnson and Rohrer this volume). As discussed in previous sections, however, this proposal is problematic since sensorimotor schemas are non-representational, while language is representational.
Piaget was very much aware of this difference, and while he acknowledged that sensorimotor schemas play an important part in the "construction of reality for the child", he claimed that they have inherent limitations, since "sensorimotor activity involves accommodation only to present data, and assimilation only in the unconscious practical form of application of earlier schemas to present data" (Piaget 1945: 278). This prevents them from being representations, since for Piaget, as in the present account, a representation needs to be (a) accessible to the consciousness of the subject for whom it serves as a representation and (b) differentiated from whatever it represents, i.e. between the "signifier" and the "signified", in Saussurean terms. Thus, a qualitatively new stage of development emerges with the attainment of what Piaget calls the symbolic function:
This specific connection between "signifiers" and "signified" is typical of a new function that goes beyond sensorimotor activity and that can be characterised in a general way as the "symbolic function." It is this function that makes possible the acquisition of language or collective "signs," but its range is much wider, since it also embraces "symbols" as distinct from "signs," i.e. the images that intervene in the development of imitation, play, and even cognitive representations. (ibid: 278)
To understand this quotation, we should emphasize that Piaget is using the term "symbol" in a sense that is very different from what it implies in the Anglo-Saxon world: conventionality, systematicity and arbitrariness. Rather, "symbols" are for Piaget dynamic mental images, more or less vivid in consciousness, representing non-present actions or events. Crucially, both for Piaget and for my argument, they emerge through imitation: Hence the image is both interiorised sensorimotor imitation, and the draft of representative imitation. [... ] It is imitation that has been interiorised as a draft for future exterior imitation, and marks the junction-point between the sensorimotor and the representative. (ibid: 279)
Imitation can play this bridging role since it usually emerges through the following ontogenetic progression: sensorimotor imitation (the imitated action of the model is contiguous in time) > deferred imitation (the imitated action is removed in time) > representative imitation - in which "the interior image precedes the exterior gesture, which is thus a copy of an "internal model" that guarantees the connection between the real, but absent model, and the imitative reproduction of it." (ibid: 279) Two important aspects of Piaget's account of the rise of representations or "the symbolic function" should be emphasized in the present context. The first is that they arise from an overt, public activity - imitation - which with time becomes internalized. This is reminiscent of Vygotsky's (1978) "law of cultural development" stating that interpersonal forms of higher cognition precede their "intrapersonal" realizations (cf. Lindblom and Ziemke this volume). Second, as pointed out above, this makes possible the acquisition of language, which both consolidates and conventionalizes these representations, leading to a new level of cognitive structure: "Verbal representations constitute, in fact, a new type of representation, the conceptual." (Piaget 1945: 280) In other words these "symbols", i.e. internalized imitations, serve as a "missing link" in the acquisition of language. The analogy to the role of bodily mimesis in phylogeny should now be obvious. On this basis, as well as a wealth of empirical data provided by Piaget, but also by many others who have studied infant imitation and gesture since then (Bates et al. 1979; Acredolo and Goodwyn 1994; Zlatev 2002), I have proposed a more fitting term for the structures that Piaget is (rather confusingly for the modern reader) calling "symbols", namely, mimetic schemas (Zlatev 2005). If we refer to the definition of bodily mimesis provided above, we notice that in the case of representative imitation the first three properties - Cross-modality, Volition and Representation - are fulfilled. Thus the covert imitation of a child following its "internal model" in executing an action is at least a case of dyadic mimesis. In order to become triadic, in e.g. pantomime ("baby signs"), what is necessary is to understand communicative intentions. This can be seen as a wish to induce others to "activate" in consciousness schemas similar to one's own. In other words, while Piaget writes of "symbols" (mimetic schemas) as the "signifier" and the actual model as the "signified", the relation can be reversed, so that a communicative gesture becomes the signifier, while the (shared) mimetic schemas are the "signified" or perhaps in Peircian terms the "interpretant" (cf. Sonesson this volume). Let us now summarize some of the properties of mimetic schemas.
Mimetic schemas can be used either dyadically (in thought) or triadically (in communication).
Mimetic schemas are experiential: each schema has a different emotional-proprioceptive "feel", or affective tone (Thompson 2001) to it. For example, consider the affective contrast between the mimetic schemas KICK and KISS. Thus, mimetic schemas can be regarded as an (important) aspect of phenomenological embodiment.
Mimetic schemas are representational: the "running" of the schema is differentiated from the "model event" which is represented - unlike the most common explication given to "image schemas" (Johnson 1987; Johnson and Rohrer this volume; see Hampe 2005).
Mimetic schemas are, or at least can be, pre-reflectively shared: since my and your mimetic schemas derive from imitating culturally salient actions and objects, as well as each other, both their representational and experiential content can be "shared" - though not in the strong sense of being known to be shared in the manner of (true) symbols or conventions. They could also be called egocentric: "Imitation, with the help of images, provides the essential system of 'signifiers' for the purpose of individual or egocentric representation" (Piaget 1945: 279-280). However, it should be remembered that for Piaget, this formulation does not imply that mimetic schemas are private, but rather the contrary: "on the social plane the child
is most egocentric at the age in which he imitates most, egocentrism being failure to differentiate between the ego and the group, or confusion of the individual view-point and that of others" (ibid: 290, my emphasis).
Mimetic schemas can serve as the basis for the acquisition of language in two ways: (a) they constitute the first form of (conscious) internal representation and help lead to the "insight" that others have internal models - a prerequisite for communicative intentions and (b) they constitute prelinguistic concepts, and in this respect correspond to Mandler's (2004) characterization of "image schemas" but not to that of Johnson and Rohrer (this volume; cf. Zlatev 2005).
These properties of mimetic schemas, and particularly the last, can allow us to bridge (or at least minimize) the gap between language and embodiment, as discussed in the next section, which also retraces the argument presented in this chapter.
6. Embodiment regained? Mimetic schemas and language
I started by pointing out three essential properties of (the knowledge of) language - conventionality, representationality and conscious accessibility - and proceeded to see if, and if so how, they can be made compatible with the currently popular conception that the (human) mind is an "embodied mind". In one of the most influential accounts of "embodiment theory", especially within Cognitive Linguistics, that of Lakoff and Johnson (1999), we saw that these three properties were essentially absent. In what followed I subjected this version of "embodiment" to criticism, and in particular its central concept of the Cognitive Unconscious. While this criticism does not automatically generalize to other accounts, it gives us reason to worry whether embodiment and language can be made compatible, not least because of the lack of a coherent concept of representation. The quest for a more adequate notion of embodiment led us to the work of Donald (1991), and the concept of (bodily) mimesis, which was explicated and related to Piaget's developmental theory. In particular, I argued for the need to acknowledge the concept of mimetic schemas, which among other things: are structures of the "lived" (phenomenal, experiential) body, meaning that they are accessible to consciousness;
are representational structures: they are differentiated from what they stand for, and can be enacted overtly (as pantomime and gesture) or covertly (as mental images); can be pre-reflectively shared with others since they (usually) arise from imitation.
But notice that these three characteristics of mimetic schemas correspond to - without being identical to - the three properties of language under focus. Thus, the following hypothesis concerning the "embodiment" of language can be formulated: Public linguistic symbols are "embodied" in the sense that part of their meaning is constituted by underlying mimetic schemas.
If this hypothesis holds true, bodily mimesis can serve not only as a "missing link" between sensorimotor and linguistic cognition - in evolution as envisioned by Donald (1991), and in ontogenesis as argued by Piaget and in rather different ways proposed by Nelson (1996) and Zlatev (2001, 2002) - but also as a conceptual, meta-theoretical link between embodiment and language. Since language is a central aspect of human sociocultural situatedness, mimetic schemas can help integrate the two major factors that define the human mind - embodiment and situatedness - in a coherent framework. What else can we offer in support of this hypothesis? A proper treatment of this question would require a separate chapter, so here I only mention the following considerations, to be explored in more detail in the future (cf. also Zlatev 2005):
First, the existence of pre-linguistic but representational mimetic schemas can help solve the puzzle of how "socially shared symbolic systems" (Nelson and Shaw 2002) emerge in pre-linguistic children. Since young children lack the meta-linguistic capacity for establishing full-fledged conventions, it is still a mystery how they move from the sensorimotor to the symbolic (i.e. conventional and systematic) level. Mimetic schemas, with their implicit sharing, suggest a way out of this impasse.
Second, a particular difficulty in explaining language acquisition is to account for the learning of action terms ("verbs"). After having traditionally been considered to follow object terms ("nouns") in child language (Macnamara 1982), action words have in recent years been shown to arise simultaneously (Tomasello 1992; Nelson 1996), and if they are prominent in parental speech, even to precede nouns in some cases (Gopnik, Choi and Baumberger 1996). It is obvious how mimetic schemas
for concrete, imitable actions (e.g. RUN, EAT, SIT ...) can serve as a basis for the acquisition of the corresponding "verbs". Furthermore, the development of shared representations for objects that can be manipulated, such as cups, balls, toys, books, food etc., will also be facilitated, and thus underlie the acquisition of the corresponding "nouns".12 Notice that if mimetic schemas ground the acquisition of the first words in childhood, the prediction is that the child's early vocabulary will consist of terms such as run, sit, eat, cat and toy ... and this is indeed the case (Nelson 1996; Bloom 2000).
Third, and conversely to helping explain the ease with which children acquire language, and in line with Donald's (1991) original proposal, mimetic schemas may help explain why language acquisition is so difficult even for "enculturated" apes: evolution has given us an adaptation for triadic mimesis supporting advanced imitation and gesture that is beyond the capacities of our nearest relatives in the animal kingdom.
Fourth, mimetic schemas as a ground for public symbols can help explain how both "cognitive" (representational) and "affective" (experiential) meaning can be communicated through language, since both aspects can be - to various degrees - shared by communicators, even if the two can be decoupled in abnormal conditions.
Fifth, the close connection of linguistic symbols and mimetic schemas is consistent with the accumulating evidence from experimental psychology and neuroscience showing that language use engages motor representations, as well as the corresponding brain regions (Glenberg and Kaschak 2003; Svensson, Lindblom and Ziemke this volume). At the same time, neither this evidence nor the present proposal implies a stronger form of "language embodiment" in which (practically) all symbolic and inferential processing is carried out by sensorimotor categories and brain regions (Lakoff and Johnson 1999; Johnson and Rohrer this volume).
If that were the case it would be very hard to explain the qualitative difference between animal and human cognition, in particular with respect to language skills. To emphasize again, according to the present hypothesis, mimetic schemas ground, but do not constitute linguistic meaning - which as pointed out in
12. In the case of objects there is also another means to achieve shared reference, e.g. joint attention (Tomasello 1999), and this would serve to pick out shared perceptual attributes. But there are problems in explaining how this is done, conceptual (Quine 1960) as well as empirical (Bloom 2000); mimetic schemas for acting on the objects can help pick out the relevant properties.
Section 2 is conventional in the strong sense: not just shared, but mutually known to be shared.
Sixth, the hypothesis is consistent with the recent enthusiasm surrounding "mirror neurons", which are assumed to support action recognition and imitation, and their role in the evolution of language (Rizzolatti and Arbib 1998; Arbib 2003). Since there appears to be a homology between area F5 of the monkey brain, where mirror neurons for grasping were originally discovered, and Broca's area, it is reasonable to suppose that a developed mirror neuron system constitutes a (partial) "neural correlate" of the ability to form and entertain mimetic schemas.
Seventh, and finally, a long-lasting debate in the study of spontaneous co-speech gestures (e.g. McNeill 1992) is whether they are primarily "communicative" or "cognitive", i.e. whether they are performed for the benefit of the hearer, or for the speaker himself (given that even blind people gesture to each other, as do, more mundanely, people talking on the telephone). Considering gestures to be realizations of mimetic schemas allows them to be both. The work of Kita and Özyürek (2003), showing the existence of non-linguistic "spatio-motoric representations" that are to some extent influenced by the language of the speaker, fits in naturally with the present proposal.
7. Conclusion
In this chapter I have argued for the following set of interrelated theses: Language is fundamentally a socio-cultural phenomenon, based on grammatical and semantic conventions, and therefore it cannot be reduced to individual minds, and even less so to brains. However, apart from conventionality, language also presupposes representationality and conscious accessibility, and these imply subjectivity. Qualitative experience is a subjective, "first-person" phenomenon as well as an interpersonal one, involving emotion and affective tone. Thus a truly experiential theory of language needs to account for the ability to communicate through linguistic signs which are shared both representationally and phenomenologically. Theories of embodiment such as that of Lakoff and Johnson (1999), which ignore these characteristics, cannot satisfactorily account for language. Since language plays an important role in shaping the human mind, such theories are not capable of accounting for human cognition either.
The concept of bodily mimesis, and its derivative concept of mimetic schemas, can help resolve the contradiction between embodiment and language, and thus assist us in the long-term project of (re)integrating body, language and mind.
Acknowledgments
In writing this chapter, I have benefited from interactions with other members of the project Language, Gesture and Pictures in Semiotic Development at Lund University and the EU-project Stages in the Evolution and Development of Sign Use (SEDSU): Göran Sonesson, Peter Gärdenfors, Tomas Persson, Ingar Brinck, and Sara Lenninger. I would also like to thank Jörg Zinken, Görel Sandström, Roslyn Frank, Alex Kravchenko, Lars-Åke Henningsson and two anonymous reviewers for comments on various earlier drafts. Finally, I wish to dedicate this essay to my friend Esa Itkonen, for his brave fight for the true nature of language against varieties of bio-physical reductionism over the past 30 years, i.e. half his life.
References

Acredolo, Linda and Susan Goodwyn
1994 Sign language among hearing infants: The spontaneous development of symbolic gestures. In: Virginia Volterra and Carol J. Erting (eds.), From Gesture to Language in Hearing and Deaf Children, 68-78. Washington, DC: Gallaudet University Press.
Arbib, Michael
2003 The evolving mirror system: A neural basis for language readiness. In: Morten Christiansen and Simon Kirby (eds.), Language Evolution, 182-200. Oxford: Oxford University Press.
Armstrong, David M.
1968 A Materialist Theory of the Mind. London: Routledge and Kegan Paul.
Badcock, Christopher
2000 Evolutionary Psychology. A Critical Introduction. Cambridge: Polity Press.
Baker, Gordon P. and P.M.S. Hacker
1984 Language, Sense and Nonsense: A Critical Investigation into Modern Theories of Language. Oxford: Basil Blackwell.
Bates, Elizabeth, Laura Benigni, Inge Bretherton, Luigia Camaioni and Virginia Volterra
1979 The Emergence of Symbols. Cognition and Communication in Infancy. New York: Academic Press.
Blackmore, Susan
1999 The Meme Machine. Oxford: Oxford University Press.
Bloom, Paul
2000 How Children Learn the Meaning of Words. Cambridge, Mass.: MIT Press.
Brooks, Rodney
1999 Cambrian Intelligence: The Early History of the New AI. Cambridge, Mass.: MIT Press.
Churchland, Paul M.
1992 A Neurocomputational Perspective: The Nature of Mind and Structure of Science. Cambridge, Mass.: MIT Press.
Clark, Andy
1997 Being There: Putting Brain, Body, and World Together Again. Cambridge, Mass.: MIT Press.
Clark, Herbert
1996 Using Language. Cambridge: Cambridge University Press.
Clark, Herbert and Catherine R. Marshall
1981 Definite reference and mutual knowledge. In: Aravind K. Joshi, Bonnie L. Webber and Ivan A. Sag (eds.), Elements of Discourse Understanding, 10-63. Cambridge: Cambridge University Press.
Corballis, Michael C.
2002 From Hand to Mouth: The Origins of Language. Princeton, NJ: Princeton University Press.
Custance, Deborah M., Andrew Whiten and Kim A. Bard
1995 Can young chimpanzees (Pan troglodytes) imitate arbitrary actions? Hayes and Hayes (1952) revisited. Behaviour 132: 839-858.
Cuyckens, Hubert, Dominiek Sandra and Sally Rice
1997 Toward an empirical lexical semantics. In: Birgit Smieja and Meike Tasch (eds.), Human Contact Through Language and Linguistics, 35-54. Frankfurt: Peter Lang.
Damasio, Antonio
1994 Descartes' Error. Emotion, Reason and the Human Brain. New York: Grosset/Putnam.
2000 The Feeling of What Happens. Body, Emotion and the Making of Consciousness. New York: Harvester.
de Lafuente, Victor and Ranulfo Romo
2004 Language abilities of motor cortex. Neuron 41: 178-180.
Deacon, Terry
1997 The Symbolic Species: The Co-Evolution of Language and the Brain. New York: Norton.
Donald, Merlin
1991 Origins of the Modern Mind. Three Stages in the Evolution of Culture and Cognition. Cambridge, Mass.: Harvard University Press.
2001 A Mind So Rare. The Evolution of Human Consciousness. New York: Norton.
Dreyfus, Hubert
1972 What Computers (Still) Can't Do. A Critique of Artificial Reason. Third revised edition. Cambridge, Mass.: MIT Press. Reprint 1993.
Edelman, Gerald
1992 Bright Air, Brilliant Fire: On the Matter of the Mind. London: Basic Books.
Elman, Jeffrey L.
1990 Finding structure in time. Cognitive Science 14: 179-211.
Evans, Vyv
2003 The Structure of Time. Language, Meaning and Temporal Cognition. Amsterdam: Benjamins.
Fodor, Jerry A.
1981 Representations. Cambridge, Mass.: MIT Press.
Frege, Gottlob
1882 [1997] On Sinn and Bedeutung. In: Michael Beaney (ed.), The Frege Reader, 151-171. Oxford: Blackwell.
Freud, Sigmund
1949 An Outline of Psycho-analysis. London: Hogarth.
Gallagher, Shaun
1995 Body schema and intentionality. In: José Bermúdez, Naomi Eilan, and Anthony Marcel (eds.), The Body and the Self, 225-244. Cambridge: MIT/Bradford Press.
2005 How the Body Shapes the Mind. Oxford: Oxford University Press.
this vol. Phenomenological and experimental contributions to understanding embodied experience.
Gallup, Gordon G.
1982 Self-awareness and the emergence of mind in primates. American Journal of Primatology 2: 237-248.
Garrett, Brian J.
1995 Non-reductionism and John Searle's The Rediscovery of Mind. Philosophy and Phenomenological Research 55 (1): 209-215.
Givón, Tom
2001 Syntax, Vol. 1-2. Amsterdam: Benjamins.
Glenberg, Arthur M. and Michael P. Kaschak
2003 The body's contribution to language. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 43: 93-126. San Diego, CA: Academic Press.
Gopnik, Alison, Soonja Choi and Therese Baumberger
1996 Cross-linguistic differences in early semantic and cognitive development. Cognitive Development 11 (2): 197-227.
Gurwitsch, Aron
1964 The Field of Consciousness. Pittsburgh: Duquesne University Press.
Hampe, Beate (ed.)
2005 From Perception to Meaning: Image Schemas in Cognitive Linguistics. Berlin: Mouton de Gruyter.
Itkonen, Esa
1978 Grammatical Theory and Metascience. Amsterdam: Benjamins.
1983 Causality in Linguistic Theory. A Critical Investigation into the Philosophical and Methodological Foundations of "Nonautonomous" Linguistics. Bloomington: Indiana University Press.
1991 Universal History of Linguistics. Amsterdam: Benjamins.
2003 What is Language? Turku: University of Turku Press.
Jackendoff, Ray
1987 Consciousness and the Computational Mind. Cambridge, MA: MIT Press.
Johnson, Mark
1987 The Body in the Mind. Chicago: University of Chicago Press.
Johnson, Mark and George Lakoff
2002 Why cognitive linguistics requires embodied realism. Cognitive Linguistics 13 (3): 245-263.
Johnson, Mark and Tim Rohrer
this vol. We are live creatures: Embodiment, American Pragmatism and the cognitive organism.
Kita, Sotaro and Asli Özyürek
2003 What does cross-linguistic variation in semantic coordination of speech and gesture reveal? Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language 48: 16-32.
Lakoff, George
1987 Women, Fire and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press.
Lakoff, George and Mark Johnson
1980 Metaphors We Live By. Chicago: University of Chicago Press.
1999 Philosophy in the Flesh: The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Langacker, Ronald
1987 Foundations of Cognitive Grammar, Vol. 1. Stanford, CA: Stanford University Press.
Lewis, David K.
1969 Convention: A Philosophical Study. Cambridge, MA: Harvard University Press.
Lindblom, Jessica and Tom Ziemke
this vol. Embodiment and social interaction: A cognitive science perspective.
Macnamara, John
1982 Names for Things. Cambridge, MA: MIT Press.
MacWhinney, Brian
1999 The emergence of language from embodiment. In: Brian MacWhinney (ed.), The Emergence of Language, 213-256. Mahwah, NJ: Lawrence Erlbaum.
Mandler, Jean
2004 The Foundations of Mind: Origins of Conceptual Thought. Oxford: Oxford University Press.
Marcus, Gary F.
2001 The Algebraic Mind: Integrating Connectionism and Cognitive Science. Cambridge, Mass.: MIT Press.
Maslin, Keith
2001 An Introduction to the Philosophy of Mind. Malden, Mass.: Polity Press.
McNeill, David
1992 Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press.
Merleau-Ponty, Maurice
1945 Phenomenology of Perception. London: Routledge and Kegan Paul. Reprint 1962.
Nelson, Katherine and Lea Kessler Shaw
2002 Developing a socially shared symbolic system. In: James Byrnes and Eric Amsel (eds.), Language, Literacy and Cognitive Development, 27-57. Mahwah, NJ: Lawrence Erlbaum.
Nelson, Katherine
1996 Language in Cognitive Development. The Emergence of the Mediated Mind. Cambridge: Cambridge University Press.
Ogden, C.K. and I.A. Richards
1923 The Meaning of Meaning. London: Routledge and Kegan Paul.
Palmer, Gary
1996 Toward a Theory of Cultural Linguistics. Austin: The University of Texas Press.
Patterson, Francine
1980 Innovative use of language in a gorilla: A case study. In: Katherine Nelson (ed.), Children's Language, Vol. 2, 497-561. New York: Gardner Press.
Piaget, Jean
1945 La formation du symbole chez l'enfant. Neuchâtel-Paris: Delachaux et Niestlé; English translation: G. Gattegno and F. M. Hodgson. Play, Dreams, and Imitation in Childhood. New York: Norton, 1962.
1953 The Origin of Intelligence in the Child. London: Routledge and Kegan Paul.
1954 The Construction of Reality in the Child. New York: Basic Books.
Pinker, Steven
1994 The Language Instinct. New York: William Morrow.
Quine, Willard V. O.
1960 Word and Object. Cambridge, Mass.: MIT Press.
Regier, Terry
1996 The Human Semantic Potential: Spatial Language and Constrained Connectionism. Cambridge, Mass.: MIT Press.
Rizzolatti, Giacomo and Michael Arbib
1998 Language within our grasp. Trends in Neurosciences 21: 188-194.
Rohrer, Tim
2001 Pragmatism, Ideology and Embodiment: William James and the philosophical foundations of cognitive linguistics. In: René Dirven, Bruce Hawkins and Esra Sandikcioglu (eds.), Language and Ideology: Cognitive Theoretic Approaches: Volume 1, 49-81. Amsterdam: Benjamins.
this vol. The body in space: Dimensions of embodiment.
Russell, Bernard
1905 On denoting. Mind 14: 479-93.
Saussure, Ferdinand de
1916 Cours de Linguistique Générale [Course in General Linguistics]. Paris: Payot.
Schmidt, Richard
1990 The role of consciousness in second language learning. Applied Linguistics 11: 17-46.
Searle, John
1969 Speech Acts. Cambridge: Cambridge University Press.
1992 The Rediscovery of the Mind. Cambridge, Mass.: MIT Press.
1999 Mind, Language and Society. Philosophy in the Real World. London: Weidenfeld and Nicolson.
2002 Consciousness and Language. Cambridge: Cambridge University Press.
Senghas, Ann, Sotaro Kita and Asli Özyürek
2004 Children creating core properties of language: Evidence from an emerging sign language in Nicaragua. Science 305: 1779-1782.
Sinha, Chris
1988 Language and Representation. A Socio-naturalistic Approach to Human Development. New York: Harvester Press.
1999 Grounding, mapping and acts of meaning. In: Theo Janssen and Gisela Redeker (eds.), Cognitive Linguistics: Foundations, Scope and Methodology, 223-255. Berlin: Mouton de Gruyter.
2005 Blending out of the Background: Play, props and staging in the material world. Journal of Pragmatics (Special issue on Conceptual Blending Theory, guest eds. Seana Coulson and Todd Oakley), 37: 1537-1554.
Sonesson, Göran
1989 Pictorial Concepts. Lund: Lund University Press.
this vol. From the meaning of embodiment to the embodiment of meaning: A study in phenomenological semiotics.
Stokoe, William C.
2001 Language in Hand. Why Sign Came before Speech. Washington D.C.: Gallaudet University Press.
Strawson, Peter F.
1950 On referring. Mind 59: 320-344.
Svensson, Patrik
1999 Number and Countability in English Nouns: An Embodied Model. Uppsala: Swedish Science Press.
Svensson, Henrik, Jessica Lindblom and Tom Ziemke
this vol. Making sense of embodied cognition: Simulation theories of shared neural mechanisms for sensorimotor and cognitive processes.
Talmy, Len
2000 Toward a Cognitive Semantics, Vol. I and Vol. II. Cambridge, Mass.: MIT Press.
Tanner, Joanne E. and Richard W. Byrne
1999 The development of spontaneous gestural communication in a group of zoo-living lowland gorillas. In: Sue T. Parker, Robert W. Mitchell and H. Lyn Miles (eds.), The Mentalities of Gorillas and Orangutans - Comparative Perspectives, 211-239. Cambridge: Cambridge University Press.
Thompson, Evan
2001 Empathy and consciousness. Journal of Consciousness Studies 8 (5/7): 1-32.
Thompson, Evan and Francisco Varela
2001 Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences 5 (10): 418-425.
Tomasello, Michael
1992 First Verbs: A Case Study of Early Grammatical Development. Cambridge: Cambridge University Press.
1999 The Cultural Origins of Human Cognition. Cambridge, Mass.: Harvard University Press.
Tomasello, Michael, Joseph Call and Brian Hare
2003 Chimpanzees understand psychological states - the question is which ones and to what extent. Trends in Cognitive Sciences 7 (4): 153-156.
Trubetzkoy, Nikolay S.
1939 Grundzüge der Phonologie. Göttingen: Vandenhoeck and Ruprecht. Reprint 1958.
Tufvesson, Sylvia, Jordan Zlatev and Joost van de Weijer
2004 Idiomatic entrenchment and semantic priming. In: Augusto Soares da Silva, Amadeu Torres, Miguel Gonçalves (eds.), Linguagem, Cultura e Cognição: Estudos de Linguística Cognitiva, Vol. 1: 309-334. Coimbra: Almedina.
Varela, Francisco
1996 Neurophenomenology: A methodological remedy for the hard problem. Journal of Consciousness Studies 3 (4): 330-350.
Varela, Francisco, Evan Thompson and Eleanor Rosch
1991 The Embodied Mind. Cognitive Science and Human Experience. Cambridge, Mass.: MIT Press.
Violi, Patrizia
2001 Meaning and Experience. Translated by Jeremy Carden. Bloomington: Indiana University Press.
2003 Embodiment at the crossroads between cognition and semiosis. Recherches en communication 19: 199-217.
Vygotsky, Lev S.
1934 Thought and Language. Cambridge, Mass.: MIT Press. Reprint 1962.
1978 Mind in Society. The Development of Higher Psychological Processes. Cambridge, Mass.: Harvard University Press.
Wittgenstein, Ludwig
1923 Tractatus Logico-Philosophicus. London: Routledge. Reprint 1961.
1953 Philosophical Investigations. Oxford: Basil Blackwell.
Ziemke, Tom
2003 What's that thing called embodiment? In: Richard Alterman and David Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum.
Zlatev, Jordan
1997 Situated Embodiment. Studies in the Emergence of Spatial Meaning. Stockholm: Gotab Press.
2001 The epigenesis of meaning in human beings, and possibly in robots. Minds and Machines 11 (2): 155-195.
2002 Mimesis: The "missing link" between signals and symbols in phylogeny and ontogeny. In: Anneli Pajunen (ed.), Mimesis, Sign and the Evolution of Language, 93-122. Publications in General Linguistics 3. Turku: University of Turku Press.
2003 Meaning = Life (+ Culture). An outline of a unified biocultural theory of meaning. Evolution of Communication 4 (2): 253-296.
2005 What's in a schema? Bodily mimesis and the grounding of language. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 313-342. Berlin: Mouton de Gruyter.
Zlatev, Jordan, Tomas Persson and Peter Gärdenfors
2005 Bodily mimesis as "the missing link" in human cognitive evolution. LUCS 121. Lund: Lund University Cognitive Studies.
The body in space: Dimensions of embodiment
Tim Rohrer
Abstract

Research from a large number of fields has recently come together under the rubric of embodied cognitive science. Embodied cognitive science attempts to show specific ways in which the body shapes and constrains thought. I enumerate the standard variety of usages that the term "embodiment" currently receives in cognitive science and contrast notions of embodiment and experientialism at a variety of levels of investigation. The purpose is to develop a broad-based theoretic framework for embodiment which can serve as a bridge between different fields. I introduce the theoretic framework using examples that trace related research issues such as mental imagery, mental rotation, spatial language and conceptual metaphor across several levels of investigation. As a survey piece, this chapter covers numerous different conceptualizations of the body ranging from the physiological and developmental to the mental and philosophical; theoretically, it focuses on questions of whether and how all these different conceptualizations can form a cohesive research program.

Keywords: cognitive neuroscience, Cognitive Linguistics, Embodiment, frames of reference, mental rotation, space.
1. Introduction: Embodiment and experientialism

1.1. Embodiment: The return of the absent body to cognitive science
HUMAN BEINGS HAVE BODIES. Academics of every variety, so often caught up in the life of the mind, find that simple truth altogether too easy to forget. Imagine working late into the night, hotly pursuing another bit of perfect prose. But now let there be a power outage and, in the absence of electric light or the pale glow of the computer screen, imagine how we grope and fumble to find our briefcase, locate the door, and exit the building. In such circumstances, the body returns. Whenever we are unexpectedly forced to move about in the dark, we are forcibly reacquainted with our bodily sense of space. Problems ordinarily solved beneath the level of our conscious awareness become dominant in our cognition; we find ourselves noticing subtle changes in the floor texture underfoot, carefully reaching out for the next step in the stairwell. It is a most peculiar experience, one that may well remind us of being young and just learning to walk down stairs. Unfortunately for cognitive science, many academics of that particular variety haven't simply forgotten that human beings have bodies - cognitive scientists have deliberately theorized the body away. For most of its first fifty years, cognitive science was in the throes of a peculiarly devilish axis between information theory in computer science and functionalism in the philosophy of mind and psychology. Within computer science and information theory, the problem of building a thinking machine was identified with just one narrowly specified field of human cognition - computing mathematical functions (Turing 1950; Hodges 1983). Under the functionalist paradigm, the mind was treated as if it were a series of modular computer programs - or "black boxes" - whose inputs and outputs could be specified in symbolic terms. While no one would have argued that the physical architecture of vacuum tubes and transistors making up the early computers was identical to the physical architecture of the neural systems making up the brain, the functionalists did argue that the specific physical details of how such thinking systems computed were irrelevant to what they computed. In fact, they argued that as computation could take place not only in electrical and neural systems but also in mechanical systems such as a loom or Babbage's steam-powered analytical engine, cognition was independent of its physical medium.
From this perspective, the only thing that mattered to simulating cognition was getting the inputs of these black boxes to compute the correct outputs (Cummins 1977). The physical body - whose architecture was seen as largely irrelevant to cognition - was redefined as a series of black boxes that computed mathematical functions. The disembodied computer was the analogical origin of the disembodied mind. In recent years, however, another strain of cognitive scientists has begun to take its inspiration from the contrarian vision - embodied cognitive science. Unlike proponents of the computationalist-functionalist hypothesis, embodiment theorists working in various disciplines argue that the specific details of how the brain and body embody the mind do matter to cognition. This broad theoretical approach has been the result of many parallel developments in diverse fields ranging from neurobiology and linguistics to robotics and philosophy. While there are undoubtedly many touchstones and origins of this approach, Johnson and I (Johnson and Rohrer this volume) have given a detailed account of some of the neurobiological and philosophical roots of the embodiment hypothesis in cognitive science, with a particular emphasis on how the contributions of American Pragmatism anticipated modern cognitive neuroscience. By contrast, in this paper I intend only to survey the wide variety of manners in which embodied cognitive science is done, including these among others, in order to develop a general theoretic framework as a backdrop against which these research projects can be situated. One of the most central examples of how embodied cognitive science has revolutionized the field lies in the details of how the mind, brain and body interact to construct our experience of space. Tracing this example across the different disciplines of cognitive science will require the whole of this article, but as a beginning recall the basic finding of the work on mental rotation (Shepard and Metzler 1971). In their renowned experiment, participants were asked to determine whether one two-dimensional drawing of a three-dimensional object was identical to or a mirror image of another; Shepard and Metzler found that subjects mentally rotated the object at a linear rate - about 60 degrees per second. In other words, participants were manipulating such images as wholes, preserving their topologies while rotating them through a series of intermediate depictions. At the time of its publication, their finding was surprising because the then prevailing computationalist and functionalist view held that the mind operated in a symbolic rather than depictive fashion, and therefore argued that any such mental imagery would be merely epiphenomenal (Pylyshyn 1973).
Over the ensuing thirty years, a variety of convergent evidence has established not only the fact that mental images are rotated in the brain as perceptual wholes (Kosslyn et al. 1995), but has also specified how that fact impacts our understanding of exactly what our minds are "computing" (reviewed in Kosslyn 1994; Kosslyn, Ganis and Thompson 2002). Consider, for example, how the body - and not just the brain - plays a role in modifying the rate at which mental rotations take place. Extend your left arm in front of you and hold your left hand straight out, palm upwards. Now try to rotate your hand 180 degrees to the left and then to the right. Notice that the rightward (inward) rotation is relatively easy, while the leftward is quite difficult, requiring additional shoulder and arm joint movements. A series of experiments by Parsons (1987a, 1987b, 1994; Parsons et al. 1995) showed that when subjects were asked to perform mental rotations of images which consisted of line drawings of human hands instead of Shepard-Metzler 2D/3D block diagrams, subjects were quicker and better at identifying those rotations of the hand that were easier to perform, given the kinds of bodily constraints on joint movements we have as humans. Furthermore, Parsons found that subjects were quicker and better at judging which hand - left or right - was pictured when the imagined rotation of the hand did not require difficult bodily movements. Given the details of the way the body works, the motor imagery system actively constrains how fast mental imagery is performed.1 Even more dramatically, consider how patients with chronic arm pain in one limb perform similar mental hand rotation tasks. For their affected arm, as compared to their uninjured arm, patients are much slower to perform the necessary mental rotations in those conditions where the bodily movements that would be required for the actual hand rotation involve large arm movements (Schwoebel et al. 2001). A group of non-patient controls showed no such differences between their left and right arms. Not only does the body affect how our mind works, but the body in pain affects how the mind works. Of course, this last insight should come as no surprise to anyone except those cognitive scientists who believe that our minds work just like disembodied computers. As a fallback position, a committed computationalist could simply jettison the functionalist claim that our cognition is independent of our neurophysiological architecture. One could argue that embodiment means only that "computations" of a particular kind - analog and iconic, not symbolic; physiologically embodied and perspectivally situated, not abstractly universal - are being performed as the body and brain pass topology-preserving structures forward and backward between the visual and motor systems.
Admittedly, these topologies are not the "outputs" of the computations of a rigidly modular functionalist architecture, but rather dynamic activation patterns which imagistically map the perceptual contours of experience, rippling back and forth through multiple reentrant neuroanatomical connections within a web of functionally interrelated neural regions. These embodied neural "computations" compete to become the
1. Interestingly, if one adds a cylindrical "head" to Shepard-Metzler cube stimuli, one can produce similar embodied facilitation and inhibition effects to those produced by more naturalistically "embodied" stimuli such as Parsons' line drawings of hands (see Amorim, Isableu and Jarraya 2006).
most salient and pragmatically useful mental constructs to address the current problem for the organism, whatever that is. While such revisions to our conception of what counts as cognitive "computations" are certainly warranted by the evidence and are important steps toward an embodied cognitive science, we might also inquire whether the focus on embodiment leads to additional constraints that are not purely physiological. Suppose that the current problem for the organism is once again the mental rotation of images. Given the results concerning how the rate of the mental rotation varies when the stimulus is a hand, are there multiple strategies for rotating mental objects that could compete to solve such problems? Since one obvious difference between the Parsons stimuli and the Shepard and Metzler stimuli is that the former are line drawings of body parts while the latter are line drawings of 3D blocks, it might be possible that the motor imagery effects Parsons observed are limited to body-part images. While Kosslyn et al. (1998) had initially argued that there was this sort of stimulus-determined choice between two separate neural systems that could perform mental rotations, namely the motor imagery (hand stimulus) and visual imagery (object stimulus) systems, Kosslyn et al. (2001) now argue that there are two possible perspectives - or spatial frames of reference - that influence which strategy for mental rotation is chosen. In one such frame of reference - a viewer-centered perspective - it is possible to imagine oneself physically grasping and rotating a 3D object; while in the other frame of reference - an object-centered frame - it is possible to imagine viewing something else rotating the 3D object. Kosslyn and colleagues built wooden 3D constructs of Shepard-Metzler block figures, and just prior to the neuroimaging had the subjects either turn the wooden blocks by hand or observe the blocks rotating on a motor-driven spindle.
Participants were then instructed to imagine rotating the visual stimuli presented during the neuroimaging task in precisely the same manner. The neuroimaging results showed that the differences in the strength of activation in the motor imagery (or visual imagery) brain regions correlated with which perspective was obtained on the model via the participant's socially instructed interaction with it. Their results show that participants could voluntarily choose to adopt a particular strategy based on the frame of reference in which they were told to interact with the object and not based solely on the type of stimulus image, i.e., body parts or blocks. In other words, it is not the case that only the details of our physiology matter, such as the constraints of our joints as we imagine rotating
our hands. Instead, the socially instructed choice of perspective also matters to how the embodied mind works. The Kosslyn group's experiments demonstrate why embodiment in cognitive science should never be construed as an exclusively physiological phenomenon. Even when researchers are measuring physiological changes such as changes in the blood flow or glucose uptake within the brain, both socially and environmentally induced factors can play a theoretically significant role as to what brain activity is being measured. Instructing a participant in a neuroimaging experiment to imagine using one spatial frame of reference or the other - that is, to imagine manipulating the blocks themselves as opposed to imagining the blocks spinning on their own - demonstrates how the social context influences the physiological response. Similarly, constructing a 3D physical version of a heretofore visually presented 2D stimulus predisposes the participant to interact with the stimulus using a slightly different mix of sensory modalities - resulting in a different physiological response. Note that it is not the case that the "body" enters into the measured response only in the condition where the participant physically manipulates the 3D object. In each case the body interacts with the stimuli in different ways (visually or motorically), and the resulting environmental predispositions to imagine using either the visual or the motor system are carried into the PET scanner. Unfortunately, the Kosslyn group has not yet investigated whether the social and environmental influences are separable, but it is reasonable to predict that they are. 
One could test this hypothesis using an experimental variation derived from semantic priming: if some participants were instructed to imagine operating during the scan in the frame of reference opposite to the one induced by their pre-scan bodily interaction with the stimulus, one would expect that their responses would be weaker in activation (inhibited) when compared to those of participants for whom the social instructions coincided with the embodied environmental interaction.
1.2.
Experientialism: "The body" of cognitive science expands
The Kosslyn et al. (2001) experiment is particularly revealing because it shows that even for those of us who use methodologies dedicated to measuring the body, embodiment means not just the physiological body - or worse yet, just the physiological brain - but the body-in-space, the body as it interacts with the physical and social environment. Many of the objects
we interact with every day are in fact cognitive artifacts we have designed with our bodies in mind. Consider one last set of experiments on mental rotation, one which compares the mental rotation of hands with the mental rotation of tools. Vingerhoets et al. (2002) compared the fMRI activation patterns of right-handed male subjects who were asked to decide whether a pair of pictures were different or identical (except for being rotated) when presented with either two pictures of hands (either right or left) or two pictures of hand tools (a monkey wrench, a pencil sharpener, a can opener and a soup ladle). While they found pre-motor and motor activation in both experimental conditions, their key finding was that tools, unlike hands, activated only the left hemisphere premotor and motor hand cortices contralateral to the subject's dominant hand. In other words, when we think about rotating tools - as opposed to Kosslyn's abstract shapes or Parsons' pictures of hands - we are mentally "grasping" those tools and rotating them with the same hand that we would ordinarily use to rotate them in the physical world. Thus, the body-in-the-brain is not just shaped by the body, but by the habitual interactions of the body with the environment. The point is not just that the body shapes the embodied mind, but that the experiences of the body-in-the-world also shape the embodied mind. But the experiential worlds with which we interact are more than simply physical; we are born into social and cultural milieus which transcend our individual bodies in time. Tools are an excellent example of the elements of our physical world that come to us already shaped by socio-cultural forces which predate each individual's body, if not the human body in general - for there has certainly been a long process of cultural refinement in the design of hand tools. Like tools, language is another part of the sociocultural milieu within which we exist. 
Can we investigate how sociocultural factors (such as the language into which we are born) shape our cognition? Let us begin by considering matters of prepositional structure, perspective and frames of reference in a linguistic context. In English we can speak metaphorically about features of the landscape in terms of the body, such as the face of a mountain, the mouth of a river, the foothills, and so on. Peninsulas can be construed as fingers of land, or as heads (as in Heceta Head). In other words, we understand features of the landscape metaphorically, using our bodies as the grounding frame of reference. Lakoff and Johnson (1999, 1980) have called such systematic patterns of metaphoric projection "conceptual metaphors", and have argued that they exhibit a general tendency to conceptualize more abstract entities in terms of the
more bodily ones. We now know that both literal and metaphorical uses of body-part terms exhibit mental imagery effects similar to those described in the experimental lines already discussed. In an fMRI study which included instances of the LANDSCAPE IS A BODY metaphors, participants' primary and secondary hand sensorimotor cortices were active during the comprehension of both literal and metaphoric hand sentences (Rohrer 2001b, 2005). However, languages vary in how they construct space; is it possible that what is a metaphoric usage in English is the basic frame of reference habitually used by members of another culture? Linguists have documented a number of Mesoamerican languages such as Mixtec, Tzeltal and Zapotec whose prepositional structure is entirely composed of body-part morphemes. For example, saying the stone is under the table requires saying the stone is proximal to the table's belly (yuu wa hiyaa cii-mesa / stone the be-located table-belly) (Lakoff 1987: 313). Within Cognitive Linguistics, Brugman (1985) and Lakoff (1987; see also related work in MacLaury 1989) have claimed that such languages require projecting the names for body parts onto objects in the world. They argue that Mixtec speakers start off with a viewer-centered frame of reference2 and then take up the perspective of the object. This change in perspective yields an object-centered frame of reference for Mixtec spatial relation terminology, where tables metaphorically acquire bellies located where a human belly would be.
From a purely neurophysiological conception of the body, and one strongly influenced by an overly narrow conceptualization of the brain in terms of just the visual system (and not the sensorimotor system), one could conclude that this order of events was inevitable, given that in the visual system we first construe the world in viewer-centered neural maps, and only later in object-centered maps.3 Given the cognitive neuroscience available at that time, Lakoff (1987) plausibly argued that speakers of such languages were metaphorically projecting the viewer-centered frame of reference to form another, object-centered frame of reference.
2. Even though the topic has somewhat shifted to language, I am still using the term "frame of reference" primarily in its spatial sense, as would be found in cognitive psychology and cognitive neuroscience. I discuss the complex relation between linguistic and non-linguistic frames of reference in Section 4.
3. Current studies of the sensorimotor system reveal that there are separable frame of reference maps for body-centered (i.e., viewer-centered) mental rotation and object-centered mental rotation (see review in Parsons 2003).
However, related evidence gathered in cross-cultural language acquisition studies reveals that the embodied mind is being shaped here not simply by the neurophysiology but by the particular socio-cultural practices that accompany language acquisition. The metaphoric projection hypothesis predicts that such terms would be learned first as names for the body parts, and only later extended to spatial relations terms. In a cross-cultural study of Danish- and English-speaking children on one hand and Zapotec-speaking children on the other, Jensen de Lopez and Sinha (1998; Sinha and Jensen de Lopez 2000; Jensen de Lopez 2002) investigated whether each culture's children acquire body-part morphemes first as body-part terms and then only later metaphorically project them as spatial relations terms. Their results show that Zapotec-speaking children acquire the body-part morphemes first as spatial relations terms and only later - and seemingly independently - as names for the body parts, while Danish and English children acquire them first as body-part names and only later use them to indicate spatial relations. Furthermore, Jensen de Lopez and Sinha hypothesize that the difference derives from differing cultural practices of child-rearing. They note that Zapotec infants spend most of their first two years in a sling on the mother's back, sharing her spatial perspective, while Danish and English infants are placed in cribs and encouraged more to move about on their own. Consequently, joint attentional episodes during which the child's body parts are named may be less frequent in Zapotec child-rearing practices than in Danish or English child-rearing practices.
In short, Jensen de Lopez and Sinha suggest that what might have looked like a projection of viewer-centered body-part terms in order to form an object-centered frame of reference is instead simply the acquisition of an object-centered frame of reference through joint attentional episodes focused on the spatial characteristics of such objects. The work of Jensen de Lopez and Sinha, along with other cross-cultural language acquisition work (Bowerman and Choi 2003), is an example of why embodied cognitive science must include the socio-cultural milieu as one dimension of variability. Note how similar the findings of Jensen de Lopez and Sinha are to the Kosslyn et al. (2001) finding concerning how the actions that directly precede the mental rotation experiment influence which neural system is chosen to perform the task. In both cases, the social context of joint attentional episodes, whether between caregiver and child or experimenter and participant, influences what frame of reference is chosen. To be embodied as a human being means in part that we are born into a socio-cultural milieu
within which we have particular problem-solving strategies reinforced through experience. Deeply habitual experiences or an immediately prior attentional experience can alter which strategy we choose to employ. The body of embodied cognitive science is not limited to physiological and neurophysiological influences on mind, nor to those plus the physical body's interactions with the physical world, but also incorporates the experiences of the social and cultural body. In other words, it has to take account of the socio-cultural context within which a particular body is situated. At this point it is clear that there are a number of different, if interrelated, senses of the term "embodiment" at play in the literature. I began this chapter with the image of how the phenomenological body intrudes upon the mind when the lights unexpectedly go out and one must fumble to find the way out of a building, and then traced some examples of how such experiential considerations motivate experiments that investigate the physiological and neurophysiological responses to the experienced body (as in imagining rotating the hand for both normal participants and participants with chronic arm pain). Continuing to examine the literature on imagination and mental rotation, I showed how even a focus on measuring the neurophysiology leads to the realization that both the physical environment and the socio-cultural context are factors which impact the embodied mind. Now it is time to begin addressing the meta-theoretic picture explicitly. How can all these senses of the term hang together as a framework for research on embodied cognition? What are the dimensions of embodiment that different theorists think it is important to measure and address? How many dimensions are there, and how do they interact to form different research clusters?
2.
Surveying the dimensions of embodiment
Like most scientists, linguists usually acknowledge that it is a difficult but admirable goal to begin as descriptively as possible before proceeding prescriptively. By my latest count the term "embodiment" can be used in at least twelve different important senses with respect to our cognition. Because theorists often (and sometimes appropriately, given their specific purposes) conflate two or more of these dimensions, it is important to get a clear picture of as many of the different dimensions of variability as possible. This list is not intended to be entirely exhaustive of the term's current
usage, nor are the dimensions necessarily entirely independent of each other nor even entirely distinct from one another. Thus it is important to note that, unlike the argumentative analyses given in Anderson (2003) or Wilson (2002), this initial survey is not intended to be a prescriptive definition of the term, but instead is intended only to catalogue some of the contemporary usage of the term in a way that reveals the most relevant dimensions to which one must be responsive in order to develop a general theoretic framework for embodiment theory in cognitive science. However, I do note where some theorists have used the term in several of these senses simultaneously, and in several cases I argue for making finer distinctions and recognizing more dimensions than do the original theorists themselves.
2.1.
Dimension 1: Philosophy
First - and perhaps most broadly - "embodiment" is used as a shorthand term for a counter-Cartesian philosophical account of mind, cognition and language. Descartes took problems within geometric and mathematical reasoning (such as the meaning of the term "triangle") as model problems for study, and concluded that knowledge par excellence is "disembodied" - that is, fundamentally independent of any particular bodily sensation, experience, or perspective - all of which are roots of uncertainty. In arguing that the meaning of the term "triangle" consists in the reference relationship between the word and an object that exists not in the physical, embodied world but in thought alone, Descartes' thought experiments set the stage for many thinkers within analytic philosophy, formal semantics, and early cognitive science. Broadly speaking, such philosophers of language typically construe the two central problems of meaning to be (i) mapping the reference relations between idealised objects of knowledge, their counterpart symbolic expressions in language and the objects or "states of affairs" in the real world (as in Fregean semantics), and (ii) explaining the internal logical structure of the relations which hold between these idealised objects or their corresponding linguistic symbols (as in theories of "autonomous syntax"). While Descartes was by no means unique or alone within Western philosophy in holding this position, his extraordinary clarity has garnered him the laurel of becoming metonymic for this package of philosophical assumptions (Lakoff and Johnson 1980, 1999; Geeraerts 1985; Johnson 1987; Rohrer 1998). Most such embodiment theorists, while perhaps somewhat favouring the empiricist side of the rationalist-empiricist split, generally try to dissolve such philosophical problems as hangovers from a bad metaphysics by recasting the problems on a different metaphysical basis. Many, although not all, of the theorists objecting to such Cartesian treatments of language, meaning and representation use "embodiment" in this broadly philosophical sense even as they work explicitly in one or more of the somewhat narrower dimensions of the term that follow.
2.2.
Dimension 2: The socio-cultural situation
"Embodiment" is also used to refer to the social and cultural practices within which the body, cognition and language are perpetually situated. In this sense, "embodiment" is often used to emphasize the particularistic, rather than the universalistic, tendencies of human cognition; e.g., how a particular mind in a particular body is shaped by the particular culture within which it is embedded. One example of a cognitive cross-cultural language acquisition study would be the previously discussed research by Sinha and Jensen de Lopez (2000). The cultural variation in child-rearing practices might well account for differing acquisition sequences of spatial language terms in English-, Danish-, and Zapotec-speaking children. Such socio-cultural practices can be given material form in the material artifacts that aid and manifest cognition - many of which are extensions of the body (Hutchins 1995, 1999; Fauconnier and Turner 2002). In assessing the differences between Micronesian and Western traditions of navigation, Hutchins observes that the [...] physical artifacts became repositories of knowledge, and they were constructed in durable media so that a single artifact might come to represent more than any individual could know. Furthermore, through the combination and superimposition of task-relevant structure, artifacts came to embody kinds of knowledge that would be extremely difficult to represent mentally. (1995: 96)
Hutchins cites the example of the medieval astrolabe, a set of rotating disks that embody the spatial relationships of the celestial bodies at different latitudes with much greater precision than would be possible from only the individual navigator's memory. These are set into a frame that represents the horizon, which is itself inset with a scale marking out the 24-hour day and/or the 360 degrees of the compass. He notes that the astrolabe embodies socio-cultural practices in two important ways. First, the astrolabe is an extension of the body in that a skilled navigator physically manipulates it by rotating its disks in order to predict celestial movements. Second, in its design the astrolabe is "a physical residuum of generations of astronomical practice" (Hutchins 1995: 96-97). Any particular navigator using the astrolabe is the intellectual heir of a wide set of social practices which have been designed into the instrument. In contrast, a Micronesian navigator eschews such material artifacts, relying successfully instead on cultural artifacts such as chants that encode the relevant celestial relations for voyages between particular islands (Hutchins 1995: 65-92, 111). However, such cultural artifacts perform a similar function in that they also embody generations of knowledge gleaned from navigational practices.
2.3.
Dimension 3: Phenomenology
"Embodiment" has a phenomenological sense in which it can refer to the things we consciously notice about the role of our bodies in shaping our self-identities and our culture through acts of conscious introspection and deliberate reflection on the lived structures of our experience (Brandt 2000, 1999). The conscious phenomenology of cognitive semiotics can be profitably contrasted with the cognitive unconscious of cognitive psychology (see dimension 7). For example, Gallagher (this volume) traces how the work of phenomenological philosophers such as Husserl and Merleau-Ponty has contributed to the distinction between the conscious body image and the largely automatic body schema now emerging in cognitive neuroscience. For Husserl and Merleau-Ponty, embodiment refers not only to the lived experience of our own bodies but also to the ways in which our experience of other animate bodies moving differs from our experience of other moving objects in the physical world. This has found theoretic support from cognitive neuroscience in the discovery of the "mirror neuron" system in the premotor cortex (Gallese et al. 1996; Rizzolatti and Craighero 2004), in which primates have been shown to have neural systems which are activated not only by their own motor actions but also by witnessing another's motor action. Gallagher suggests that the emergent sense of "intercorporeality" from mirror neuron activity could be a basis of human intersubjectivity.
2.4.
Dimension 4: Perspective
"Embodiment" can also refer to the particular subjective vantage point from which a particular perspective is taken, as opposed to the tradition of the all-seeing, all-knowing, objective and panoptic vantage point. While this sense of the term can be seen as at least partly philosophical (as in Nagel 1979: 196-213; Geeraerts 1985; Johnson 1987; Rohrer 1998), the idea of considering the embodied viewpoint of the speaker has linguistic implications for the role of perspective in subjective construal (Langacker 1990; MacWhinney 2003), as well as myriad psychological implications (e.g., Carlson-Radvansky and Irwin 1993; Kosslyn et al. 2002). For example, consider how the embodied perspective of the subject can interact with the canonical orientations implicit in construing spatial situations. When we give directions, we ordinarily assume that one is facing in the direction of travel. However, in many subway trains the seats face in both directions. Should we give directions such as "after the subway goes above ground, look to your left. When you pass the automobile dealership, exit at the next stop...", imagine the confusion if the addressee should choose a seat facing opposite to the direction of travel and not make the adjustment to look to the right. Similarly, not only our bodies but also many of the objects of our world have canonical orientations that are determined by the ways in which we - that is, our bodies - interact with them. Cups and trash-cans stand upright, while mattresses lie flat. When we say "the fly is over the trash can", but the trash can is lying on its side, is the fly above the side of the trash-can or adjacent to its mouth?
Buildings such as cathedrals or ski lodges also have canonical orientations as to their fronts and backs; one can say "I'll meet you in the restaurant to the right of the cathedral at noon", and inadvertently fail to specify if that is the perspective of the tourist facing the cathedral or the perspective of the cathedral as it faces the city square. We routinely project the canonical orientation of our embodiment onto the objects in the world; sometimes we take up the perspective of inanimate objects, sometimes we take up the perspective of animate bodies. Of course, problems of co-aligning frames of reference are of practical import in areas such as ship navigation practices and the internal maps built up by robots; as such, this dimension frequently interacts with the other practical senses of the term as well as the metatheoretic ones.
2.5.
Dimension 5: Development
In yet another important sense "embodiment" is used to refer to the developmental changes that the organism goes through as it transforms from zygote to fetus, or from infant to adult. There are at least three ways in which the developmental sense interacts with the other dimensions of "embodiment". First, certain events in the development of an organism open windows for the acquisition of a particular skill. Babies are not born speaking, nor can they handle objects or self-locomote at birth. As the infant acquires additional sensorimotor skills, additional patterns become available to be incorporated into its cognitive functioning. Second, and perhaps counterintuitively, such events may not expand but instead constrain the mappings between the possible patterns of embodied perceptual structures and the resulting conceptual structures of later developmental stages. For example, Bowerman and Choi (2003) have shown that while nine-month-old Korean- and English-speaking infants can make the same spatial discriminations, at eighteen months their acquisition of language has solidified their spatial categories enough so that they are no longer able to make the discriminations which their language does not. Note that such developmental changes are not purely physiological, but take place within the relevant socio-cultural and linguistic contexts.
2.6.
Dimension 6: Evolution
An equally important temporal sense of the term "embodiment" refers to the evolutionary course the species of organism has undergone throughout its genetic history. For example, an account of the gradual differentiation of the cortex into separate neural maps, each representing a different frame of reference in the visual and tactile systems of mammals, might provide an evolutionary explanation for why multiple frames of spatial reference were universally found by the typological studies of spatial language and cognition (Majid et al. 2004). Or on an even grander scale: humans have not always had the capacity for language, and so evidence from studies on the evolutionary dimension of embodiment may often prove crucial to understanding why, for example, language processing in the brain does not appear to be exclusively concentrated in an autonomous module but instead draws on numerous subsystems from the
perceptual modalities (for treatments see Edelman 1992; Donald 1991; Deacon 1997; MacWhinney 1999).
2.7.
Dimension 7: The cognitive unconscious
Additionally, "embodiment" can mean those routine cognitive activities that ordinarily operate too quickly and too automatically for the conscious mind to focus on them. Lakoff and Johnson (1999: 9-15) have recently called these the cognitive unconscious. In this sense "embodiment" refers to the ways in which our conceptual thought is shaped by many processes below the threshold of our ordinary conscious awareness. As such they are generally inaccessible to introspection, though they may be measured indirectly using methods from cognitive and social psychology. Lakoff and Johnson cite examples ranging from mental imagery to semantic processing to processing sound into phonemes. They intend that this sense of the term "describes all unconscious mental operations concerned with conceptual systems, meaning, inference and language" (1999: 12). While their primary source of evidence for the cognitive unconscious is cognitive psychology, Lakoff and Johnson also intend for this sense of embodiment to include neural modelling and at least some aspects of our neurophysiological embodiment, noting that these "are obviously not independent of one another" (1999: 104). However, I would argue that these two aspects can be profitably separated out from the cognitive unconscious of experimental cognitive and social psychology as two additional dimensions of embodiment.
2.8. Dimension 8: Neurophysiology
In a neurophysiological sense, the term "embodiment" can refer to measuring the activity of the particular neural structures and cortical regions that accomplish feats like object-centered versus viewer-centered frames of reference in the visual system, metaphoric projection, and so on (Rohrer 2001b, 2005; Coulson and Van Petten 2002). Such methods would include single-neuron recording, electroencephalography (EEG) and derivative measures (EMG and ERP), positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and magneto-encephalography (MEG), as well as the neuroanatomical organization of the brain and nervous system. This dimension would comprise a portion of, but not be synonymous with, Lakoff and Johnson's use of their term "neural embodiment" (1999: 102-103), in which they lump neurophysiologically-based methods together with the neurocomputational modelling of both high-level cognitive tasks (such as temporal aspect in language) and low-level cognitive tasks (spatial perception). Together with observations on human physiology, the relevant neurophysiology is sometimes advanced as explaining certain constraints on the patterns exhibited in linguistic systems, such as in the regularities in the cross-cultural typology of color words (Lakoff 1987).
2.9. Dimension 9: Neurocomputational modelling
"Embodiment" can sometimes also refer to research using neurocomputational models. Such neural networks may be said to be "embodied" in at least four different ways. First, they may more or less closely model the actual neurophysiology of the neural circuitry whose function they seek to emulate. Second, some kinds of neural networks build on better-understood neurocomputational models of the actual neurophysiology to provide "existence proofs" that a series of neural nets could in principle account for some kind of cognitive behaviour - as in "structured connectionism". Neurocomputational models such as those in Lakoff and collaborators' "neural theory of language" thus are not explicit models of the underlying neurophysiology, but instead (and by using as their input structures the output from better understood perceptual neural structures) they seek to demonstrate how the known computational facts about the neurophysiology could produce certain kinds of observable linguistic behaviours (such as the metaphoric structuring of more abstract experience in terms of perceptual experience) (Lakoff and Johnson 1999: 569-83; see also Regier 1992, 1995; Feldman and Narayanan 2003). Third, and most often without any explicit reference to any intermediate structures in the underlying neuroanatomy, connectionist neural networks are taken to be models of experiential activity at the conceptual and/or psychological levels of processing, as in the stochastically-based arguments that there is no "poverty of stimulus" but instead plenty of experience to account for the acquisition of syntax by children (Elman et al. 1996; MacWhinney 2003). Fourth, neural networks can be seen as models of how socio-cultural norms can be internalized within a specific "mind" (Zlatev 1997, 2003; Howard 2001). For example, by developing neural models of how the age/gender contrasts are marked in English and Spanish, Howard argues that biased socio-cultural norms of age and gender are partly the result of predispositions of how the human brain and nervous system learn. It is important to note that the neurocomputational use of this sense of "embodiment" is partly motivated by the fact that some other physiological, experiential and/or socio-cultural dimension of that term is explicitly being modeled by the neural network. Yet such models seek to ground their models not only in what they seek to model, but also in the fact that "neurocomputational embodiment" is explicitly anti-functionalist. All neural networks are anti-functionalist in that the particular shape of the neural model is at least partly determined by analogizing some of its computational properties to the underlying neurophysiology, rather than presuming that the cognition or behavior to be modeled is computationally independent of any such bodily constraints (as in functionalist models).
2.10. Dimension 10: Morphology
The terms "embodiment" and "embodied cognition" are now also widely used in robotics (Chrisley and Ziemke 2002), where any computational modelling necessarily requires a body of some type for interaction with the world. While in robotics the term is perhaps most saliently associated with humanoid robot projects (Brooks and Stein 1994), it can also refer to cases where the work done by the robot depends on the particular morphological characteristics of the robot body (Pfeifer and Scheier 1999). For example, Cornell University's Passive Dynamic Walker uses no motors and no centralized computation but instead relies on gravity, mechanical springs and cleverly designed limb morphology to "walk". By exploiting the capacities of the morphology, cognition is offloaded onto the body - a design principle that is consonant with both evolutionary theory and embodied cognitive science (Collins, Wisse and Ruina 2001; Bertram and Ruina 2001). The morphology of the physiological body also yields important constraints for measuring cognition in cognitive psychology or cognitive neurophysiology, as in the already discussed studies by Parsons (1987ab, 1994; Parsons et al. 1995) on the mental rotation of line drawings of the hand and in the Vingerhoets et al. (2002) fMRI study of the activation courses in response to images of tools.
2.11. Dimension 11: Directionality of metaphor
Within Cognitive Linguistics, the term "embodiment" has two often conflated senses that stem from Lakoff and Johnson's (1980: 112) initial formulation of the embodiment hypothesis as a constraint on the directionality of metaphoric structuring. More accurately, this sense of "embodiment" could be termed the directionality of metaphor mappings. In this strong directionality constraint Lakoff and Johnson claim that we normally project image-schematic patterns of knowledge unidirectionally from a more embodied source domain to understand a less well-understood target domain. In other words, they claim that each and every mapping between the elements of the source and the elements of the target is unidirectional; the logic of the image-schema is projected from the source to the target, and not from target to source. For example, in their analysis of the metaphors shaping various theories of visual attention in cognitive psychology, Fernandez-Duque and Johnson claim that:
[...] each submapping is directional, going from source to target. We understand aspects of the target domain via the source domain structures and not the reverse. Such unidirectionality shows itself clearly in the reasoning we do based upon conceptual metaphors. (Fernandez-Duque and Johnson 1999: 85)
This constraint has been the source of much controversy within Cognitive Linguistics. Fauconnier and Turner (1995, 2002) and others have argued that there is a much greater role for feedback between target and source to the extent that they have proposed an alternate theory in cognitive semantics, conceptual blending, which is in part a response to the unidirectionality constraint. Non-unidirectional and blending-like phenomena can be observed with respect to the same theories of visual attention analyzed by Fernandez-Duque and Johnson. For example, they correctly argue that a VISUAL ATTENTION IS A SPOTLIGHT metaphor shapes research questions in cognitive psychology, such as for experiments designed to measure the speed at which the attentional spotlight moves across the visuo-spatial field when experimental participants shift the focus of their attention (Shulman, Remington and McLean 1979) and whether the subject would attend to intermediary objects in the path of the attentional spotlight (Tsal 1983). However, the cognitive psychologist Müsseler (1994) observed that although relatively proximal shifts in the focus of visual attention seemed to behave as if the attentional spotlight did follow an analog path across the visual field illuminating everything in its path, larger shifts in the focus of visual attention did not follow an analog path. This observation about the target domain (visual attention) initiated a re-examination of the source domain. Müsseler initially proposed that the attentional spotlight was "reset" during large attentional shifts. However, convergent evidence from both other psychological studies of attention and from cognitive neuropsychology on colour, motion, shape and other visual subsystems caused an even more radical shift in the source domain of the metaphor - visual attention began to be understood as an array of multiple spotlights, as would be found in a theatre (Rohrer 1998). In no case was this process of feedback, revision, and accommodation strictly unidirectional; it was always motivated by observed changes in the target domain of visual attention that required making changes to the source domain of the spotlight(s).
2.12. Dimension 12: Grounding
Finally, "embodiment" can be used to refer to a particular hypothesis as to how we might explain how abstract symbolic behaviour is grounded in experience. Within Cognitive Linguistics even Lakoff and Johnson's original formulation (1980: 112) of the embodiment hypothesis contained the germ of a broad generalisation about the kinds of basic conceptual domains which were typically serving as source domains for conceptual metaphors, rather than as explicitly referring to the directionality of projection for each and every element mapped within a particular metaphor. We might call this additional sense of embodiment the directionality of explanation in order to distinguish it from the directionality of metaphor mappings. Lakoff and Turner specifically acknowledge this sense of embodiment in their "grounding hypothesis", in which they argued that meaning is grounded in the sense that we must choose from a finite number of semantically autonomous source domains to understand more abstract experiences (Lakoff and Turner 1990: 113-120). This sense of the term is related to the symbol-grounding problem in cognitive science generally (Harnad 1990), though it is important to note that many embodiment theorists would want to address that issue without conceding any sort of Cartesian-like split between words, thoughts or symbols and the worldly things to which they refer.
This descriptive list is meant to illustrate that embodied cognitive science requires thinking through evidence drawn from a multiplicity of perspectives on embodiment, and therefore drawn from multiple methodologies. Of course almost no researcher or research project can attend to all these different senses of the term at once and still produce scientific findings, but research projects that build bridges or perform parallel experiments across these differing dimensions are of particular interest. However, once the descriptive work has been done it can be seen that many of these senses cluster about at least two poles of attraction. Critiques of embodied cognitive science from within have often given voice to two broad senses in which the term "embodiment" is used. These two could be well described as "embodiment as broadly experiential" and "embodiment as the physical substrate". In one cluster the term refers to dimensions that focus on the specifically subjective, contextual, social, cultural and historical experiences of language speakers. Dimensions (2) through (4) of my enumeration of the term's usage would typically cluster in this realm, while dimensions (7) through (10) would often cluster about the pole which emphasizes the physiological and neurophysiological bodily substrate that is typically associated with supposedly more "objective" methodologies. Such a division is at best rough and provisional, however. Clearly not all the dimensions of the term can be so clustered, given that the attention to temporal character which characterizes the developmental (5) and evolutionary (6) dimensions can place them about either pole. Similarly, there are many interesting studies which bridge the gap between the experiential and the physiological poles even while largely measuring a dimension typically construed as mostly one or the other, such as the Kosslyn group's neuroimaging research into alternate strategies of mental rotation (2001). Depending on the behaviour modelled, embodiment as neurocomputational modelling (9) can also cross the line from the physical substrate to more experiential matters.
Finally, the more explicitly meta-theoretical dimensions of the term, (1), (11) and (12), have much traffic with both the experiential and physical substrate poles and also do not lend themselves easily to such a rough and ready distinction. Given such considerations, at least two more poles of attraction emerge - temporal and metatheoretical studies. In the end, an adequate theoretic framework for embodiment theory in cognitive science will have to acknowledge all of the wide variety of senses in which the term "embodiment" is being used and provide a nonreductionistic framework for reconciling research across all these different dimensions.
3. The levels of investigation theoretic framework
The rough and ready distinction between experiential and physiological embodiment does have the virtue, however, of illuminating how we might assess the utility of different approaches to embodied cognition. Two other recent attempts to clarify the uses of the term "embodiment" have come to diametrically opposed conclusions as to how embodiment theory can contribute to future work. On one tack, the computer scientist Anderson (2003) proposes evaluating embodiment theory in terms of its practical utility in reconceiving how our work and our lives can be enhanced. His objectives are fundamentally technological, inquiring whether embodiment theory has any import for efforts to offload difficult cognitive tasks onto the social and cultural environment, such as teamwork or the design of intelligent material artifacts such as embedded computers. On another tack, theorists such as the psychologist Wilson (2002) propose an experimentally-based evaluation of embodiment theory, arguing that we need more investigations of how offline, subconscious bodily processes structure real-time cognition while explicitly rejecting efforts to explain cognition as being situated in and distributed across socio-cultural practices. Clearly, these approaches differ not only as to what direction future research should take but also in terms of the physical scale of their research scope. Wilson's concerns are primarily with how the individual physiological organism interacts with its environment, arguing that distributed social and cultural patterns of cognition are too impermanent to constitute a unit of analysis having explanatory force (2002: 630-631). 
While Anderson also agrees that insights from how the embodied individual organism interacts with the physical world are relevant, he argues that embodied cognition has "social significance, for the construction of meaning, of the terms through which we encounter the world, is not generally private, but is rather a shared and social practice" (2003: 125). In considering whether and how a material artifact such as a patient's medical chart might be extended by embedded computers, he notes that such efforts should not seek to obliterate important facets of how the material artifact encodes the fact that cognition is distributed across the social group caring for the patient, such as how the handwriting indicates what member of the group made a particular observation or where the chart is kept indicates what member of the medical team has responsibility for the patient. On Wilson's argument the scope of the research would be limited to the size of space within which the individual interacts with the artifact, but for those cognitive scientists interested in improving the functioning of such social groups, the scope of the physical scale expands to encompass the entire cultural and communicative space of the team. However, from my own vantage as a cognitive scientist originally trained in the philosophical tradition of American Pragmatism, both of these theorists are sailing in the right direction - given that the relevant operational scale of the phenomena they are studying has changed.
The first challenge for developing a theoretical framework in which we can address such differing approaches is to propose the adoption of a simple and well-understood organising criterion. Unfortunately, most previous proposals have generally accorded an ontological status, rather than an epistemological or methodological status, to the organisation of their theoretic framework. Thus most such frameworks postulate "higher" and "lower" levels of cognition in ways which imply that the higher levels may be reduced to operations at the lower levels, ultimately arguing for the elimination of higher levels of description in favour of lower levels of description (Churchland 1981, 1989). One exception is Posner and Raichle's (1994) schematisation of the levels of investigation in cognitive neuroscience, in which the primary emphasis is given to the methodologies used to investigate the phenomena rather than their ontological status. Similarly, Edelman (1992) points out that in the physical sciences, the phenomena are operationally grouped in levels according to the physical scale of the methodology with which the phenomena are being studied. Thus the most basic organising criterion of this theoretic framework is the scale of the relative physical sizes of the embodied phenomena which produce the different kinds of socio-cultural, cognitive or neural events to be studied. In Figure 1, physical size is mapped on the y-axis, providing a relative distribution of the "higher to lower" methodological levels of cognitive processes.
| Size | Physiological Structures | Level of Investigation | Typical Cognitive Science Tasks | Sample Operative Theoretic Constructs | Sample Methods of Study and Measurement |
| --- | --- | --- | --- | --- | --- |
| 1 m and up | Multiple Central Nervous Systems | **Communicative and cultural systems** in anthropology, language, science and philosophy | Cross-cultural investigations of mental rotation and frames of reference; language acquisition; child-rearing practices; norms as to which spatial frame used | Viewer-centered, object-centered, geo-centered frames of reference in language; conceptual metaphor; gesture | Linguistic analysis, cross-linguistic typology, videotaped interview, cognitive ethnography |
| .5 m to 2 m | Central Nervous Systems | **Performance domain**; cognitive, conceptual, gestural and linguistic systems as performed by individual subjects | Individual performance on frames of reference and mental rotation tasks; measuring ability to gesture in direction-giving situations contrived to inhibit it | Spatial frames of reference, speed of mental rotation; morphological constraints | Verbal report, observational neurology, discourse analysis, cognitive and developmental studies examining reaction time (RT) |
| 10^-1 m to 10^-2 m | Gross to medium size neural regions (anterior cingulate, parietal lobe, etc.) | **Neural systems** | Activation course in somatosensory, auditory, and visual processing areas when processing spatial frames of reference tasks or mental rotation tasks | Body-image, motor and visual cortices, what-where pathway | Lesion analysis, neurological dissociations, neuroimaging using fMRI and PET, ERP methods, neurocomputational simulations |
| 10^-2 m to 10^-4 m | Neural networks, maps and pathways | **Neuroanatomy**; neural circuitry in maps, pathways, sheets | Neuroanatomical connections from visual, auditory, somatosensory regions to language areas | Motor and visual cortices, parietal topographic neural maps | Electrocellular recording, anatomical dyes, neurocomputational simulations |
| 10^-3 m to 10^-6 m | Individual neurons, cortical columns | **Neurocellular systems**; cellular and very small intercellular structures | Fine neuroanatomical organisation of particular structures recruited in language processing | Orientation-tuning cells; ocular dominance columns | Electrocellular recording, anatomical dyes, neurocomputational simulations |
| Less than 10^-6 m | Neurotransmitters, ion channels, synapses | **Subcellular systems**; subcellular, molecular and electrophysical | None - beyond theoretical scope | Neurotransmitter, synapse, ion channels | Neuropharmacological, neurochemical, and neurophysical methods |

Figure 1. Theoretic framework for embodied cognitive science.

A general name for each level is indicated by boldface type in the first column. To provide clarification, the next column provides examples of what the relevant physiological structures are at a given physical scale. For example, at the communicative, cultural and social level we study spatial language as it is used between people, and hence multiple central nervous systems; alternatively, it is possible to measure one individual's (and hence one central nervous system's) performance on a similar set of linguistic tasks. Similarly we can examine, with even more granularity, relative changes in cerebral blood flow to regions of the brain in response to spatial linguistic tasks; or we can construct neurocomputational models of those brain regions. However, Posner and Raichle's key insight is that it is important to consider how the basic inquiry changes given the different tasks and methods at various levels of investigation. All methodologies have constraints and freedoms which limit or enhance their scope of investigation and define the theoretical constructs that they develop, and these are a product of the physical scale at which the measurement is taking place. The final two columns acknowledge this by specifying some of the relevant theoretic constructs and the various methodologies operative at each level of investigation. This framework can be used to structure studies of various topics of interest to cognitive scientists, such as mental imagery, frames of reference, metaphor and so on. While this type of theoretic framework is becoming more common within much of cognitive neuroscience, most embodiment theorists have been slow to give explicit attention to the problem of how we are to theoretically situate and reconcile these different levels of investigation, perhaps due to a fear of appearing to favor reductionism. I have included just a single level of cultural and communicative analysis, but by no means should this be taken as indicative of its importance relative to the other levels. Of course, one could argue for a multiplicity of levels embedded within this one, though they might not be clearly differentiated from one another in terms of physical scale. In choosing to include a general level situated at a meter and up on the physical size axis I mean to emphasize only that human beings should be considered not simply in terms of physiological size, but also in terms of the standard scale of their interactional distance in speaking and interacting with one another.
At this level of the chart the "physiological structures" column reads "Multiple Central Nervous Systems", but that awkward term is intentionally inadequate so as to emphasize that the physiology is less relevant here - what primarily matters on this level are the social and cultural interactions between human beings. Investigations at the cultural level are occasionally given short shrift by some versions of embodied cognitive science, but generally this has been and should remain a strong thrust of future research in the field. Note also that difficult phenomena such as cultural and linguistic norms, or individual consciousness and awareness, are situated at the physical scale at which they are measured and observed, rather than attempting to place them on (or reduce them to) a lower level of investigation. Nonetheless, it is certainly possible and sometimes useful to ask, for example, how long the neural processing of a visual experience takes before impacting conscious decision-making, or how the linguistic norm of forming the English past tense might be performed in a neurocomputational model. However, research in embodied cognitive science should not seek to reduce such phenomena to another level but should instead bridge across these levels in important ways - for example, the linguistic corpora used to train the neurocomputational model should be based on naturalistic recordings of an actual child's utterances rather than text harvested from internet newsgroups, and so on. While the chart depicting the theoretic framework is designed to give an overview of the relationship between body, brain and culture, this representation is not as illustrative for issues pertaining to evolutionary and developmental time scales, which may be considered at any of these levels. However, this failing is more a limitation of the imagery of a two-dimensional chart than of the theoretic framework itself. If we were to add another axis for time perpendicular to the surface plane of the chart, we could imagine this framework as a rectangular solid. I have omitted representing this dimension because such an illustration would make it difficult to label the levels, but I make the point explicit here because both the developmental and evolutionary time courses of these phenomena are a central dimension to understanding them, and their bearing on the embodied mind.
4. Applications of the theoretic framework
This theoretic framework can help link related research from one level of investigation to another, providing opportunities to test similar hypotheses and incorporate insights originally developed at one level of investigation at another. As in the introduction I have already given several examples of embodied cognition linking physiological and neurophysiological experiments on cognition, consider three examples from the cultural and performative levels of investigation involving mental rotation and frames of reference. First, in a series of cross-cultural and cross-linguistic typological studies on spatial cognition, Pederson et al. (1998) have found that the linguistic frame of reference [4] which predominates in a language strongly influences the spatial cognition problem-solving strategy chosen. In one of several tasks they use, experimenters place three animal figurines in a row on a table and ask participants to memorize the scene. The participants are then rotated 180 degrees and asked to reconstruct the scene on a second table. Speakers of languages which predominantly use a relative frame of reference normally recreate the scene relative to their own body position, i.e., the animal to their left in the original rotation remains the animal on their left in the new 180-degree rotation. However, speakers of languages in which an absolute (geo-centric) frame of reference predominates recreate the same scene relative to the position of the animals with respect to invariant features of the landscape, i.e., the animal to the north is still placed on the north side even though the participant's body position has been rotated 180 degrees from the original scene. Despite numerous challenges, they have successfully replicated their findings in a variety of experimental environments and across a wide variety of languages (Majid et al. 2004). A typological finding, gathered at the cultural level of investigation, has been transposed onto the performative level of investigation.
[4] As I summarize their typological work I use Levinson's (2003) terminology for linguistic frames of reference, though I try to indicate a rough alternative term from the broader literature on spatial frames of reference in the cognitive sciences where their nomenclature may be uninformative to the naive reader. However, such indications are intended only as clarifying approximations, and readers should refer to Levinson's own work for detailed definitions of his terms for the linguistic frames of reference.
Their experimental research has been extended by a second source of evidence concerning spatial frames of reference which is of substantial interest to researchers working on embodiment - studies of gesture. By cleverly manipulating the body position of his experimental subjects relative to the directions the experimenter requests, Kita (2003) has shown that in giving directions speakers of languages in which the relative frame of reference predominates will often make difficult, torso-twisting gestures across their bodies in an attempt to co-align their current bodily perspective with the perspective that they would need to have in order to tell if a landmark at that point in the directions will be on the right or left. Just as with the work on mental rotation and mental imagery, in direction-giving the embodied mind simulates following the path of the directions being given. This process of mental imagery - the mental gymnastics done by the speaker's mind as one visualizes how the hearer will need to follow the directions - surfaces in the form of physical gymnastics - that is, the gestures involving the rotation of the torso. Once again, these experiments link the cultural and performative levels of investigation. Third, evidence of the ways in which language and culture embody spatial frames of reference has also been found in studies linking gesture
and conceptual metaphor. In Indo-European languages such as English and Spanish, time is typically conceptualized using two basic TIME IS SPACE metaphors - one in which the observer stands still while times pass by (e.g., "The end of the year is coming up on us soon"), and another in which the observer is moving through a landscape of times (e.g., "We're coming up on the end of the year") (Lakoff and Johnson 1980, 1999). However, in both metaphor systems the observer faces the future, while the past is behind the observer. Núñez and Sweetser (2006) have shown that for speakers of Aymara, who almost exclusively use a stationary version of the TIME IS SPACE metaphor, the past is in front and the future comes from behind. In Aymara, the orientation of the spatial frame of reference is reversed. Using videotaped interviews with bilingual Spanish and Aymara speakers both recounting Aymara legends and talking about their own communities' immediate future, they demonstrate how speakers gesture to areas in front of them when referring to the past, while gesturing to future events with over-the-shoulder motions. Furthermore, the gestures reveal that more recent past events are closer to the speaker's point of view than events in the more distant past. For example, as he contrasts ancient times with current events, an informant gestures by pointing outward and upward as opposed to pointing four times closer to his body. Together with their linguistic research on the Aymara conceptual metaphors for time, the research of Núñez and Sweetser shows that the Aymara map the temporal frame of reference onto a viewer-centered frame of reference in an inverse direction to that found in English or Spanish. These three studies, largely situated at the "top" levels of the theoretic framework, are excellent examples of how the cultural level of embodiment is expressed in the performances of individual experimental participants.
Not only that, but they also provide clues of what researchers might ask next at other levels of investigation. For example, one might ask what sorts of analogous linguistic stimuli would show neural responses similar to spatial stimuli. One could design neuroimaging experiments investigating whether navigating directions on a visual display required mental imagery rotation in either the visual or motor imagery system. Analogously, one could also investigate whether or not sentence stimuli concerning a similar navigational task requiring either a viewer-centered, object-centered, or geo-centered set of coordinates would activate distinct brain regions. There are several convergent neural studies which indicate that this is a hypothesis worth pursuing. In a recent review article, Parsons (2003) argues that we can already distinguish between superior parietal cortical regions active when mentally rotating objects about their own axis (i.e., in an object-centered frame) and superior parietal regions active when mentally rotating one's own body (a viewer-centered frame). Several neuroimaging studies using spatial relations terms as linguistic stimuli also show activation of either left (Damasio et al. 2001) or both (Emmorey et al. 2005; Carpenter et al. 1999) superior parietal cortical areas, though none of those tasks were designed to elicit distinctions between the frames of reference. Similarly, there is also convergent evidence from the literature on attention in visual hemi-neglect (a syndrome typically resulting from damage to the right superior parietal cortex) suggesting that viewer-centered and object-centered neglect are dissociable phenomena in experimental and observational neurology (Behrmann and Tipper 1999; see Rohrer 2001a for discussion). Halligan et al. (2003) review the neurological evidence that neglect of near and far space occurs independently, suggesting that they are likely to be mapped in separable parietal areas - evidence which suggests that the geocentric frame of reference in language could be correlated with the map of far space. Furthermore, while most studies of neglect do not report language deficits on standard naming or visuo-auditory picture matching tasks, a recent study by Eden et al. (2003) shows that when dyslexic children were asked to draw a clock face they neglected to fill in the numbers on the left side of the clock face in much the same manner as observed for the standard clock drawing task given to visual neglect patients. Their results suggest that there may be a common underpinning for spatial representation and language in right superior parietal cortical areas.
Yet more evidence of parietal involvement in spatial language can be found in Coslett and Lie (2000), who have reported on parietal patients who exhibit spoken language and comprehension deficits correlated with the spatial orientation of their body (see also Chatterjee 2001). In sum, while there are not yet any definitive studies showing that linguistic stimuli can drive distinct regions of the parietal cortex thought to be responsible for maintaining viewer-centered, object-centered and geo-centered frame of reference maps, studies of both parietal involvement in language and the parietal processing of different spatial frames of reference provide good reason to investigate the possibility. Although the mapping between the frames of reference observed in linguistic typologies (Majid et al. 2004) and the frames of reference used in neural processing is not likely to be isomorphic, experiments bridging the two lines of investigation could prove fruitful.
One could also ask about the role of the parietal regions in understanding metaphoric expressions of spatial relations. If linguistic stimuli can drive the brain regions demonstrably involved in tasks involving the different spatial frames of reference, then would metaphoric linguistic stimuli drive those same brain areas when presented in tasks involving a temporal frame of reference? Similar to the Aymara gestural results, other studies of how temporal reasoning and spatial reasoning interact (see Matlock, Ramscar and Boroditsky 2005) have shown that temporal problem-solving uses the resources of spatial problem-solving. One could design linguistic comprehension and linguistic problem-solving tasks using the TIME IS SPACE metaphor that might differentially activate those parietal areas. It is also important to see that the dialogue between the levels of investigation in this theoretic framework is not a one-way street. Similar questions can also be taken to the cultural and communicative level from the neurophysiological level. For example, with respect to multimodal user interface design: since we know that altering the viewer-centered versus the object-centered perspective influences which neural system performs mental rotations (Kosslyn et al. 2001), can we take advantage of that fact to design better hand controls and visual displays for real-world tasks that involve real-time mental rotation, such as airplane piloting or air traffic control? The appropriate display and control interface should prompt for the appropriate frame of reference. How might such controls help mitigate errors in frame-of-reference problems that result from the finding that people find it more difficult to rotate object-centered frames than ego-centered frames (Wraga, Creem and Proffitt 1999)?
If the speakers of some languages prefer to use a certain frame of reference, how does that alter how one might teach technical skills (such as navigation) that involve mental rotations in a cross-cultural setting? The list of such questions is endless, for the details of embodiment play an enormous role in every cognitive activity.
5. Conclusions
Human beings have bodies, and those bodies shape and constrain how we think. At the outset of this chapter, I asked the reader to imagine how an event as mundane as a power outage can reawaken an awareness of the body. However, as I have argued throughout this chapter, the body is not some other thing to which the mind returns when thinking is interrupted - thinking itself is shot through and through with the body. In the words of the early American Pragmatist philosopher-psychologist William James:
[...] our own bodily position, attitude, condition, is one of the things of which some awareness, however inattentive, invariably accompanies the knowledge of whatever else we know. We think; and as we think we feel our bodily selves as the seat of the thinking. If the thinking be our thinking, it must be suffused through all of its parts with that peculiar warmth and intimacy that make it come as ours. (James 1900, 1: 241-242, emphases original)
When we think about rotating a hand in space, we feel that it is more difficult to rotate in ways which conflict with the natural motions of our joints; when we have pain in the appropriate joints, we rotate it slower. As James notes, we often may be inattentive to this sort of awareness, especially when we are physically fit and habitually lost in thought; at such times we may need experiments to reacquaint ourselves with the body. But for those of us who feel pain in their wrists as they imagine gripping and rotating a hammer, no argumentative reintroduction is necessary - for them, the awareness that our own thinking is embodied is inescapable. James' felt sense of the experienced body, or what I have called the phenomenological dimension of embodiment, is only one prominent dimension among the many dimensions of the term discussed in this article. Over the first fifty or so years of cognitive science the field deliberately theorized away the many contributions of the body to thinking, a theoretical failing that is only now beginning to be corrected. Interdisciplinary research, ranging from linguistics and neuroscience to philosophy and psychology, using methods ranging from the experimental and physiological to the phenomenological and sociological, has begun to show that there are multiple dimensions along which the human body shapes human thought. For example, in classical cognitive science our mental representations were assumed to be logical and symbolic; however, cognitive neuroscience has shown them to be embodied and image-like. Like the Pragmatist philosophers, the new approaches see human cognition as action situated within a practical context, and mental representation as instrumental rather than absolute. 
Our ordinary experience of space is not one in which we seek to discover some absolute external frame of reference, but one in which we are obsessed with coordinating multiple frames of reference - body-, head-, or other-centered - in order to solve the practical problem at hand. That practical problem may be getting directions to an unfamiliar place, helping
a patient recover from and cope with visual neglect, a team of navigators piloting a ship, file management within a computer's graphical user interface, or tracking a lost child through the forest; but all of these examples share a common thread of understanding how the body establishes frames of reference and moves in space. As the neuroscientist Antonio Damasio puts it:
The body, as represented in the brain, may constitute the indispensable frame of reference for neural processes that we experience as the mind; that our very organism rather than some absolute external reality is used as the ground reference for the constructions we make of the world around us and for the construction of the ever-present sense of subjectivity that is part and parcel of experiences; that our most refined thoughts and best actions, our greatest joys and deepest sorrows, use the body as a yardstick. (Damasio 1994: xvi)
Embodied cognitive science begins with the realization that the body, along all of the dimensions I have outlined in this chapter, grounds and shapes human cognition. Wakened from its long dormant slumber, the body has returned to cognitive science. In this survey chapter I have traced how our physiological, neurophysiological, interactional and sociocultural embodiment impinges on how we think. The embodied mind is not something which should be narrowly identified with any one of these levels of investigation, nor with any one of the dimensions of variability that I have noted. The embodied mind cannot be reduced only to the brain any more than it can be reduced to culture. Nor is the embodied mind merely a computer, in any traditional sense of that term; it can be said to perform "computations", but the substance and structure of these computations are imagistic due to the particular kind of bodies we have and the environments we inhabit. We are just at the beginning phase of understanding the myriad ways in which the body is in the mind.
References
Amorim, Michel-Ange, Brice Isableu and Mohammed Jarraya 2006 Embodied spatial transformations: "Body analogy" for the mental rotation. Journal of Experimental Psychology: General 135: 247-327.
Anderson, Michael L. 2003 Embodied cognition: A field guide. Artificial Intelligence 149: 91-130.
Behrmann, Marlene and Steven P. Tipper 1999 Attention accesses multiple reference frames: Evidence from visual neglect. Journal of Experimental Psychology 25: 83-101.
Bertram, John E. A. and Andrew Ruina 2001 Multiple walking speed-frequency relations are predicted by constrained optimization. Journal of Theoretical Biology 209: 445-453.
Bowerman, Melissa and Soonja Choi 2003 Space under construction: Language-specific spatial categorization in first language acquisition. In: Dedre Gentner and Susan Goldin-Meadow (eds.), Language and Mind, 387-427. Cambridge, MA: MIT Press.
Brandt, Per Aage 1999 Domains and the grounding of meaning. In: Jose Luis Cifuentes Honrubia (ed.), Estudios de Lingüística Cognitiva, 467-478. Alicante: Department de Filologia Española, Lingüística General y Teoria de la Literatura, Universidad de Alicante.
2000 Metaphor, catachresis, simile: A cognitive and semiotic approach and an ontological perspective. Ms., Aarhus, Denmark: Center for Semiotic Research, University of Aarhus.
Brooks, Rodney A. and Lynn A. Stein 1994 Building brains for bodies. Autonomous Robots 1 (1): 7-25.
Brugman, Claudia 1985 The use of body-part terms as locatives in Chalcatongo Mixtec. In: Report No. 4 of the Survey of Californian and Other Indian Languages, 235-290. Berkeley: University of California.
Carlson-Radvansky, Laura A. and David E. Irwin 1993 Frames of reference in vision and language: Where is above? Cognition 46: 223-244.
Carpenter, Patricia A., Marcel Adam Just, Timothy A. Keller, William F. Eddy and Keith R. Thulborn 1999 Time course of fMRI activation in language and spatial networks during sentence comprehension. Neuroimage 10: 216-224.
Chatterjee, Anjan 2001 Language and space: Some interactions. Trends in Cognitive Sciences 5: 55-61.
Chrisley, Ronald and Tom Ziemke 2002 Embodiment. Encyclopedia of Cognitive Science, 1102-1108. Macmillan Publishers.
Churchland, Paul 1981 Eliminative materialism and the propositional attitudes. Journal of Philosophy 78: 67-90.
1989 A Neurocomputational Perspective: The Nature of Mind and the Structure of Science. Cambridge, MA: MIT Press.
Collins, Steven H., Martijn Wisse and Andrew Ruina 2001 A 3-D passive-dynamic walking robot with two legs and knees. International Journal of Robotics Research 20: 607-615.
Coslett, H. Branch and Eunhui Lie 2000 Spatial Influences on Language Performance. J. Cognitive Neuroscience (Suppl.): 76.
Coulson, Seana and Cynthia Van Petten 2002 Conceptual integration and metaphor: An event-related potential study. Memory and Cognition 30: 958-968.
Cummins, Robert 1977 Programs in the explanation of behavior. Philosophy of Science 44: 269-287.
Damasio, Antonio R. 1994 The Feeling of What Happens: Emotion, Reason and the Human Brain. Cambridge, MA: MIT Press.
Damasio, Hanna, Thomas J. Grabowski, Daniel Tranel, Laura L. B. Ponto, Richard D. Hichwa and Antonio R. Damasio 2001 Neural correlates of naming actions and of naming spatial relations. NeuroImage 13: 1053-1064.
Deacon, Terrence 1997 The Symbolic Species: The Co-evolution of Language and the Brain. New York: W. W. Norton.
Donald, Merlin 1991 Origin of the Modern Mind: Three Stages in the Evolution of Culture and Cognition. Cambridge, MA: Harvard University Press.
Edelman, Gerald M. 1992 Bright Air, Brilliant Fire: On the Matter of Mind. New York: Basic Books.
Eden, Guinevere F., Frank B. Wood and John F. Stein 2003 Clock drawing in developmental dyslexia. Journal of Learning Disability 36: 216-228.
Elman, Jeffrey L., Elizabeth A. Bates, Mark H. Johnson, Annette Karmiloff-Smith, Dominico Parisi and Kim Plunkett 1996 Rethinking Innateness. A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Emmorey, Karen, Hanna Damasio, Stephen McCullough, Thomas J. Grabowski, Laura L. B. Ponto, Richard D. Hichwa and Ursula Bellugi 2005 Neural systems underlying spatial language in American Sign Language. Neuroimage 17: 812-824.
Fauconnier, Gilles and Mark Turner 1995 Conceptual integration and formal expression. Metaphor and Symbolic Activity 10: 183-204.
2002 The Way We Think. New York: Basic Books.
Feldman, Jerome and Srini Narayanan 2004 Embodied meaning in a neural theory of language. Brain and Language 89: 385-392.
Fernandez-Duque, Diego and Mark L. Johnson 1999 Attention metaphors: How metaphors guide the cognitive psychology of attention. Cognitive Science 23: 83-116.
Gallagher, Shaun this vol. Phenomenological and experimental contributions to understanding embodied experience.
Gallese, Vittorio, Luciano Fadiga, Leonardo Fogassi and Giacomo Rizzolatti 1996 Action recognition in the premotor cortex. Brain 119: 593-609.
Geeraerts, Dirk 1985 Paradigm and Paradox: Explorations into a Paradigmatic Theory of Meaning and its Epistemological Background. Leuven, Belgium: Leuven University Press.
Halligan, Peter, Gereon R. Fink, John C. Marshall and Giuseppe Vallar 2003 Spatial cognition: Evidence from visual neglect. Trends in Cognitive Sciences 7: 55-61.
Harnad, Steven 1990 The symbol grounding problem. Physica D 42: 335-346.
Hodges, Andrew 1983 Alan Turing: The Enigma. New York: Walker and Co.
Howard, Harry 2001 Age/gender morphemes inherit the biases of their underlying dimensions. In: Esra Sandikcioglu and René Dirven (eds.), Language and Ideology, 165-196. Amsterdam/New York: John Benjamins.
Hutchins, Edwin 1995 Cognition in the Wild. Cambridge, MA: MIT Press.
1999 Blending and material anchors. Ms. San Diego, CA: Cognitive Science Department, University of California at San Diego.
James, William 1900 Psychology (American Science Series, Briefer Course). New York: Henry Holt and Co.
Jensen de Lopez, Kristine 2002 Language-specific patterns in Danish and Zapotec children's comprehension of spatial grams. In: Eve Clark (ed.), The Proceedings of the 31st Stanford Child Language Forum: Space in Language Location, Motion, Path and Manner, 50-59. Stanford University: Center for the Study of Language and Information.
Jensen de Lopez, Kristine and Chris Sinha 1998 Corn stomach basket. Spatial language and cognitive development in Danish- and Zapotec-acquiring children. Paper presented at the Child Language Seminar, Sheffield University England, September 1998.
Johnson, Mark L. 1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.
Johnson, Mark L. and Tim Rohrer this vol. We are live creatures: Embodiment, American Pragmatism and the cognitive organism.
Kita, Sotaro 2003 Interplay of gaze, hand, torso orientation and language in pointing. In: Sotaro Kita (ed.), Pointing: Where Language, Cognition and Culture Meet, 307-328. Mahwah, NJ: Lawrence Erlbaum.
Kosslyn, Stephen M. 1994 Image and Brain: The Resolution of the Imagery Debate. Cambridge, MA: MIT Press.
Kosslyn, Stephen M., William L. Thompson, Irene J. Kim and Nathaniel M. Alpert 1995 Topographic representations of mental images in primary visual cortex. Nature 378: 496-498.
Kosslyn, Stephen M., Gregory J. DiGirolamo, William L. Thompson and Nathaniel M. Alpert 1998 Mental rotation of objects versus hands: Neural mechanisms revealed by positron emission tomography. Psychophysiology 35: 151-161.
Kosslyn, Stephen M., William L. Thompson, Mary J. Wraga, and Nathaniel M. Alpert 2001 Imagining rotation by endogenous versus exogenous forces: Distinct neural mechanisms. NeuroReport 12: 2519-2525.
Kosslyn, Stephen M., Giorgio Ganis and William L. Thompson 2002 Neural foundations of imagery. Nature Reviews Neuroscience 2: 635-642.
Lakoff, George 1987 Women, Fire and Dangerous Things. Chicago: University of Chicago Press.
Lakoff, George and Mark L. Johnson 1980 Metaphors We Live By. Chicago: University of Chicago Press.
1999 Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books.
Lakoff, George and Mark Turner 1990 More than Cool Reason: A Field Guide to Poetic Metaphor. Chicago: University of Chicago Press.
Langacker, Ronald 1990 Concept, Image and Symbol: The Cognitive Basis of Grammar. New York, NY: Mouton de Gruyter.
Levinson, Stephen C. 2003 Space in Language and Cognition: Explorations in Cognitive Diversity. Cambridge: Cambridge University Press.
MacLaury, Robert 1989 Zapotec body-part locatives: Prototypes and metaphoric extensions. International Journal of American Linguistics 55: 119-154.
MacWhinney, Brian 1999 The emergence of language from embodiment. In: Brian MacWhinney (ed.), The Emergence of Language, 213-256. Mahwah, NJ: Lawrence Erlbaum.
2003 The emergence of grammar from perspective-taking. Ms., Pittsburgh, PA: Carnegie Mellon University.
Majid, Asifa, Melissa Bowerman, Sotaro Kita, Daniel B. M. Haun and Stephen C. Levinson 2004 Can language restructure cognition? The case for space. Trends in Cognitive Sciences 8: 108-114.
Matlock, Teenie, Michael Ramscar and Lera Boroditsky 2005 The experiential link between spatial and temporal language. Cognitive Science 29: 655-664.
Müsseler, Jochen 1994 Position-dependent and position-independent attention shifts: Evidence against the spotlight and premotor assumption of visual focusing. Psychological Research 56: 251-260.
Nagel, Thomas 1979 Mortal Questions. New York: Cambridge University Press.
Núñez, Rafael and Eve Sweetser 2006 In Aymara, Next Week Is Behind You: Convergent Evidence from Language and Gesture in the Crosslinguistic Comparison of Spatial Construals of Time. Cognitive Science 30: 1-49.
Parsons, Lawrence M. 1987a Imagined spatial transformations of one's hands and feet. Cognitive Psychology 19: 178-241.
1987b Imagined spatial transformation of one's body. Journal of Experimental Psychology 116: 172-191.
1994 Temporal and kinematic properties of motor behavior reflected in mentally simulated action. Journal of Experimental Psychology: Human Perception and Performance 20: 709-730.
2003 Superior parietal cortices and varieties of mental rotation. Trends in Cognitive Science 7: 515-551.
Parsons, Lawrence M., Peter T. Fox, J. Hunter Downs, Thomas Glass, Traci B. Hirsch, Charles C. Martin, Paul A. Jerabek and Jack L. Lancaster 1995 Use of implicit motor imagery for visual shape discrimination as revealed by PET. Nature 375: 54-58.
Pederson, Eric, Eve Danzinger, David Wilkins, Stephen Levinson, Sotaro Kita and Gunter Senft 1998 Semantic typology and spatial conceptualization. Language 74: 557-589.
Pfeifer, Rolf and Christian Scheier 1999 Understanding Intelligence. Cambridge, MA: MIT Press.
Posner, Michael and Marcus Raichle 1994 Images of Mind. New York: Scientific American.
Pylyshyn, Zenon W. 1973 What the mind's eye tells the mind's brain: a critique of mental imagery. Psychological Bulletin 80: 1-24.
Regier, Terry 1992 The acquisition of lexical semantics for spatial terms: A connectionist model of perceptual categorization. PhD. Dissertation, University of California at Berkeley.
1995 The Human Semantic Potential. Chicago: University of Chicago Press.
Rizzolatti, Giacomo and Laila Craighero 2004 The mirror neuron system. Annual Review of Neuroscience 27: 169-192.
Rohrer, Tim 1998 When metaphors bewitch, analogies illustrate and logic fails: Controversies over the use of metaphoric reasoning in philosophy and science. PhD. dissertation, University of Oregon.
2001a Pragmatism, Ideology and Embodiment: William James and the philosophical foundations of cognitive linguistics. In: René Dirven, Bruce Hawkins and Esra Sandikcioglu (eds.), Language and Ideology: Cognitive Theoretic Approaches. Volume 1, 49-81. Amsterdam: John Benjamins.
2001b Understanding through the Body: fMRI and ERP studies of metaphoric and literal language.
Paper presented at the 7th International Cognitive Linguistics Association Conference, July 2001.
2005 Image schemata in the brain. In: Beate Hampe (ed.), From Perception to Meaning: Image Schemas in Cognitive Linguistics, 165-196. Berlin: Mouton de Gruyter.
Schwoebel, John, Robert Friedman, Nanci Duda and H. Branch Coslett 2001 Pain and the body schema: Evidence for peripheral effects on mental representations of movement. Brain 124: 2098-2104.
Shepard, Roger N. and Jacqueline Metzler 1971 Mental rotation of three-dimensional objects. Science 171: 701-703.
Shulman, Gordon L., Roger W. Remington and John P. McLean 1979 Moving Attention Through Visual Space. Journal of Experimental Psychology: Human Perception and Performance 5: 522-526.
Sinha, Chris and Kristine Jensen de Lopez 2000 Culture and the embodiment of spatial cognition. Cognitive Linguistics 11: 17-41.
Tsal, Yehoshua 1983 Movements of attention across the visual field. Journal of Experimental Psychology: Human Perception and Performance 9: 523-530.
Turing, Alan M. 1950 Computing machinery and intelligence. Mind 59: 433-460.
Vingerhoets, Guy, Floris P. de Lange, Pieter Vandamaele, Karel Deblaere and Erik Achten 2002 Motor Imagery in Mental Rotation: An fMRI study. Neuroimage 17: 1623-1633.
Wilson, Margaret 2002 Six views of embodied cognition. The Psychological Bulletin and Review 9: 625-636.
Wraga, Maryjane, Sarah H. Creem, and Dennis R. Proffitt 1999 The influence of spatial reference frames on imagined object- and viewer rotations. Acta Psychologica 102: 247-264.
Zlatev, Jordan 1997 Situated Embodiment: Studies in the Emergence of Spatial Meaning. Stockholm: Gotab Press.
2003 Polysemy or generality? Mu. In: Hubert Cuyckens, René Dirven and John Taylor (eds.), Cognitive approaches to lexical semantics, 447-494. Berlin: Mouton de Gruyter.
On the biosemiotics of embodiment and our human cyborg nature*
Claus Emmeche
"my faculty of discussion is equally localized in my inkstand" Peirce¹
Abstract
This chapter briefly introduces biosemiotics, a new perspective on living systems based upon standard contemporary biology reinterpreted through a qualitative organicist tradition in biology inspired by Jakob von Uexküll and Charles S. Peirce. To be emphasized is the difference between a living organism as a general semiotic system with vegetative and self-reproductive capacities ("the body of biology") and an animal body also with sentience and phenomenal states. The "representationalism" invoked by critiques of cognitive science tends to focus only on simplistic notions of representations. However, these must be distinguished from a Peircean notion of representation as an embedded ongoing process. The implications of this distinction for theorizing about the physical, biological, animate, phenomenal and social body will be drawn out. It will be argued that these very dimensions of embodiment must be accounted for coherently and seen within an evolutionary emergentist perspective. Such a perspective requires a "Non-standard Neo-Aristotelian Pluralist notion of Causation" (NNPC), and I will argue that biosemiotics offers an approach to animate embodiment compatible with NNPC. However, biosemiotics is not enough to account for the characteristics of human embodiment because of what shall be called our intrinsic cyborg nature as techno-culturally embedded beings within a space of meanings that are not only symbolic, but argumentative and socially empowered by different kinds of sociocultural systems.
* The author is indebted to Frederik Stjernfelt, Simo Køppe, Jesper Hoffmeyer, Charbel Niño El-Hani and João Queiroz for valuable discussions, and to Tom Ziemke and an anonymous reviewer for critical comments on earlier versions of this paper.
1. Peirce ([1931-1958] 1902, CP 7.366)
Keywords: animal, biosemiotics, causality, cyborg, embodiment, emergence, organicism, representation, society.
1. Introduction
The desire we have as persons for knowledge about the body is often redirected to science. If the body is a living organism, the science of the body should be biology, as one might envision this "science of life" as being able to account for the nature of human bodies and the embodiment of human existence. It was only some time after having started training as a biologist that it became clear to me that biology cannot really be a science of living bodies. It is a science of organisms and their parts and relations within systems of organisms and environments, and these two domains of inquiry, living bodies and organisms, are not co-extensive. Intuitively I had held the idea that biology because of its generality simply subsumed humans as a species, just as physics subsumed biological processes in the sense that any biological theory of these processes has to conform to physical laws. Embodiment had to be seen as a biological phenomenon, I thought. This stance was naïve for two reasons, an epistemic one and an ontological one. The epistemic one is that each science in an important sense actively constitutes its own paradigmatic objects of investigation, and thus body and embodiment in biology is different from body and embodiment in anthropology. In fact, embodiment is usually considered neither a technical term nor a theoretical issue within biology. The ontological reason why embodiment (how to be a body, have a body, use and express one's body²) is not simply a biological phenomenon is that the human body in every specific concrete context is a far more complex and multi-level³ phenomenon than any single scientific perspective can account for.
2. Often expressions in everyday language exemplifying the non-identity between the notions of self, person and body (like "I moved my body"), appear as if presupposing a metaphysical mind/body dualism, leading to questions like "Do you use your body to express yourself, or is your body and your self the same expressive entity?". As Wittgenstein remarked, we are often bewitched by language.
3. As we shall see later, "multi-level" means including at least a physical, a general biological-organismic, a specific animate-zoological, and a human-sociocultural level. There is a one-way relation of inclusion between the levels (cf. the notion of a "specification hierarchy" in Salthe 1989). The biological level includes the zoological - being an animal implies being an organism - but not the other way around. However, even though humans are animals, zoology as a science is not adequate and too unspecific to account for all there is to say about the cultural life of Homo sapiens, a being socialized in a human society.
4. Some AI researchers, cognitive psychologists and philosophers might consider that bio-chauvinism if the intended implication is the impossibility of "intrinsic" machine intentionality. However, machines should not be reified as something outside a human sociocultural context. Machine intentionality might be possible, but the action of signs and interpretation (i.e., semiosis) in machines must be considered as tightly bound to their sociocultural context. Machines are extensions of human capacities and intrinsic parts of human sociocultural systems; they do not originate ex nihilo or fall readymade from heaven (Marx 1867; Ziemke and Sharkey 2001).
5. "Social" can be said to mean concerning human actions within a community (as studied by sociology, the science of social processes). "Semiotic" is here taken to mean concerning sign production, information transmission, communication and interpretation within some system (as studied by semiotics, the general science of sign action and interpretation, as founded by the philosopher and scientist C. S. Peirce). The word nature has several meanings but here I rely on my reader as a competent language user.
To understand the full implications of embodiment, one has to grasp at least the distinction between the concept of an organism and the concept of a body. This distinction matches the distinction between biology and sociology (roughly speaking, leaving out animals for a moment). Of course, being an organism is a precondition for having a living body which is neither a thing nor a corpse. We could not, as humans, be embodied - that is, be material creatures co-evolving with an existential-phenomenal world of situated activity involving emotional experiences, feelings, cognitive processes, perception and action, a specific perspective, placed in historical and biographic time regimens - if we were not rooted in an organic world by our very being, that is, if we were not biological creatures.⁴ Yet in significant ways we transcend our biological form of existence by producing, through culture, language and social institutions, specific dynamic modes or patterns in which our organic and animate existence is realized. Here we shall try to bring together some perspectives upon the body that may help increase theoretical sensitivity to our (minimally double) existence as belonging both to a first biological nature bound by animate and metabolic processes, and a second social and semiotic⁵ nature in which the significance of the organism's forms of movement can only be comprehended if
seen as formed by the norms, social roles and institutions of a human society. However, we shall circumvent the all too common dualism of seeing these modes as exclusive, or seeing, as often expressed within the humanities, meaning and significance as an exclusively human social phenomenon, implying non-human nature to be either merely an abstraction or a boundary concept denoting forms of materiality devoid of any intrinsic meaning or significance. First, the modes of existence of embodiment are inclusive; briefly, the human body includes the biological organism.⁶ Secondly, the organism as our "first" biological nature is not something nonsemiotic. On the contrary, the roots of processes involving communication, information, signs and interpretation (in one word: semiosis) - and more specifically, human forms of sign action and interpretation (anthroposemiosis) - must be found within an evolutionary framework. This should not, however, simply be standard neo-Darwinism (as this paradigm during the history of twentieth century biology unfortunately has become dominated by a physicalist metaphysics), but supplemented with a biosemiotic perspective. In Section 2 I shall briefly introduce biosemiotics as one useful resource to sketch a framework for an overall understanding of embodiment in its material, processual, externalist as well as internalist aspects. A brief comment on the current interest in the phenomena of embodiment may be adequate to discern the specific message of biosemiotics.
The concern with embodiment has several sources: (1) the study of human cognition, especially criticisms of the classical functionalism of traditional cognitive science and AI, alternatives like "enactive cognitive science" (Varela, Thompson and Rosch 1991) or ideas of "situated and embodied robots" (Brook 2002), which have led to a radical questioning of the need for representations in cognitive science (Riegler, Peschl and Stein 1999); (2) the study of human language, especially critiques of modeling language as a disembodied abstract module, and the alternative movement in cognitive semantics that sees basic metaphoric forms of language as rooted in the organization of human bodily action, and claims these unconscious metaphors are based on common bodily experiences (Lakoff and Johnson 1999); and (3) a widespread interest, within the human sciences and interdisciplinary cultural studies, in the transformation of the body, in which
6. See also note 3. This notion of inclusivity (see below) was developed in relation to emergent levels of reality in Emmeche, Køppe and Stjernfelt (1997).
On the biosemiotics of embodiment and our human cyborg nature
"the new body" is seen no longer as a "fixed, material entity subject to the empirical rules of biological science" (Csordas 1996). Instead, the "new body" is seen as a body with a history, behaving "in new ways at particular historical moments" (ibid.), characterized by indeterminacy, flux, self-creating processes. This has led to a deep questioning of a set of supposedly basic boundaries, namely the boundaries of corporeality itself, and the boundaries between physical and non-physical, animal and human, and between organic bodies and machines (cf. Hayles 1999). Let me state preliminarily and in simple terms what biosemiotics is about: It is an attempt to study life not only by chemical approaches seeing cells and organisms as assemblies of molecules, but from the perspective of semiotics (cf. footnote 5) seeing those same molecules (correctly described by chemistry and molecular biology) as sign vehicles for information and interpretation processes. One main concern of biosemiotics is thus to see the phenomena of significance, meaning and interpretation in the human sphere as rooted in and to some extent continuous with the same very general kind of phenomena in the non-human sphere. We can take the words of the Peirce scholar T. L. Short (1982: 298) to be emblematic of this evolutionary yet nonreductionist perspective of biosemiotics: "Of course human speech is unique: it consists of the ability to replicate legisigns in ever new patterns, as well as in the ability to create new legisigns, not by the slow process of evolution, but within the lifetime of individuals, and by their own volition. The point, rather, is that the distinctive power of human speech is not a supernatural gift, but is a remarkable development of basic principles found elsewhere in nature."
From a biosemiotic point of view, more than one approach to embodiment is needed, for the intellectual movements (1) and (2) above may lead too hastily to radical anti-representationalist stances, while (3), within the posthuman or postmodern discourse, may tend simply to dissolve real boundaries between the categories of body/machine, human/animal, or nature/culture, instead of analyzing their dynamics, inclusivity relations, or how they are being contested. Biosemiotics is critical of these implications; it builds on a more adequate notion of representation (namely that of Peirce; see below) than classical cognitive science; and it allows for keeping fundamental distinctions while seeing these in a dynamic evolutionary perspective which involves a modest dialectics of continuity and emergence. Yet I will also claim that as a scientific perspective, biosemiotics has a limited area of validity, corresponding more or less to the subject
matter of contemporary biology, and it cannot fully capture the specificities of human embodiment. The idea to be presented is in some respects a very simple one of emergent levels of embodiment: A human body - e.g., the body of a child, a soccer player, or a diplomat - includes (in the sense specified in footnote 3) an organism, but is also something more, transcending the mere set of organismic properties (like metabolism, growth, homeostasis, reproduction), just as, if we go down one step to a lower level of integration, an organism is a physical system, yet it transcends the basic physics of that system. "Transcending" here does not imply some transcendentalism or metaphysical dualism. The expression "Z transcends Y" has two aspects. (a) One aspect is epistemic, i.e., "Z's description cannot adequately be given in terms of a theory generally accounting for Y, even though this Z-description in no way contradicts a description of the Y-aspects of Z". (b) The other aspect is ontological, i.e., "crucial properties and processes of Z are of a completely different category than the ones of Y, even though they may presuppose and depend on Y". 7 The organism is a material, physical processual entity with a form of movement so specific that physics alone (as a science) cannot account for that entity. The organism is a very special type of physical being, as it includes certain purposeful (functional) part-whole relations, based upon genuine sign systems of which the genetic code is the best known but not the only example. It is time to insert a one-paragraph crash-course in semiotics (cf. Peirce [1931-1958] 1902): A sign is anything that can stand for something (an object) to some interpreting system (e.g., a cell, an animal, a legal court), where "standing for" means "mediating a significant effect" (called the interpretant) upon that system. Thus, semiosis, or sign action, always involves an irreducibly triadic process between sign, object and interpretant.
Just as in chemistry we see the world from the perspective of molecules, in semiotics (as a general logic of sign action) we see the world from the perspective of sign action, process, mediation, purposefulness, interpretation, generality. Those are not reducible to a dyadic mode of mechanical action-reaction, or merely efficient causality. 8 Thus, organisms are certainly composed of molecules, but these should be seen as sign vehicles having functional roles in mediating sign action of, e.g., genotype and environment, upon the phenotype. Biological organization is emergent upon physical order, i.e., the organism includes its physical processes (for instance, one can break a leg) but from the point of view of physics their processual forms are radically new, unpredictable and realizing processes (e.g., self-reproduction) that are not strictly deducible from properties found in the non-biological realm. However, in discussing embodiment, one cannot limit oneself to the gross primary levels of reality, including the physical, the biological and the social level of embodiment; one has to do a more fine-grained analysis of various biological forms of organismic existence before one can make sense of a distinction between biological and human embodiment. To this end biosemiotics is valuable, although it cannot alone account for human embodiment.
7. Both emergence and variants of supervenience have been suggested as candidates for an ontological dependence relation, but we need not enter the debate on these technicalities.
8. The details of this argument are more complex in the sense that the distinctions final/efficient causation and organismic/physical processes are not co-extensive, and thus, the irreducibility of final (teleologic) causation to efficient causation (sensu Peirce) is not necessarily the same as the irreducibility of organismic processes to physical processes, as for Peirce, every lawful process has an element of finality (see Hulswitt 2002).
2.
Biosemiotics - a qualitative organicist account of embodiment
Biosemiotics is a growing field that studies the production, action and interpretation of very different types of signs (such as sounds, objects, smells, movements, but also signs on molecular scales normally not perceived by an organism) in the physical and biological realm, in an attempt to integrate the findings of biology and semiotics. One goal of biosemiotics is to form a new view of life and meaning as immanent features of the natural world, rather than as epiphenomena, however complex they might be. Early pioneers of biosemiotics include Charles S. Peirce (1839-1914), Jakob von Uexküll (1864-1944), Charles Morris (1901-1979), Thure von Uexküll (b. 1908), Heini Hediger (1908-1992), Thomas A. Sebeok (1920-2001), and Giorgio Prodi (1928-1987). Contemporary scholars who have contributed to the field or commented extensively include, among others, biologists Jesper Hoffmeyer, Kalevi Kull, Alexei Sharov, Anton Markoš, Marcello Barbieri and Søren Brier, and semioticians Walther A. Koch, Floyd Merrell, John Deely, Frederik Stjernfelt, Winfried Nöth and Lucia Santaella (for a historical introduction, see Kull 1999).
One of the central characteristics of living systems is the highly organized nature of their physical and chemical processes. These processes are based, in part, on the informational and molecular properties of what came to be known in the 1960s as the genetic code. Some biologists, such as Ernst Mayr (1982), have viewed these program-like properties of the genetic code as distinguishing life from anything else in the physical world, except computers. However, whereas the informational teleology (or goal-directedness based upon a stored informational code) of a computer program is not an original form of teleology because the program is designed by humans to achieve specific goals, the teleology and informational characteristics of organisms are intrinsic, as they evolved naturally through self-organized evolutionary processes. Traditional biology, at least in popular and textbook versions, and a large part of mainstream philosophy of biology have regarded such informational processes as in the end purely physical, unfortunately also adopting a restricted notion of the physical as having to do with only efficient causation. Thus these traditions have often aimed at a "naturalization" of teleology (of, e.g., genetic information) and intentionality (of cognitive information). The admirable purpose of this is to make these phenomena continuous with the rest of nature, and to account for them in a manner acceptable to physical science. Contemporary science, however, is often prejudged to be the only legitimate basis of judgments on the nature of Nature. Taking the animal or human mind and its teleological and intentional processes to be embodied is often perceived, within biology and cognitive science, as implying a search for reductive explanations of mental phenomena in terms of, e.g., neural information processing, or dynamical systems approaches continuous with the kind of explanations given for non-linear complex systems within physics.
This is the case even for some of the search for other notions of embodiment (e.g., "radical embodiment", Clark 1999). In many programmatic naturalization strategies, the "natural" seems to be reduced beforehand to restricted metaphysical versions of nature as only involving certain kinds of properties or causal processes, like, as said, classical efficient causation. 9 Drawing upon the insights of Peirce, who founded semiotics as a logic and scientific study of dynamic sign action in human and non-human nature, biosemiotics attempts to use semiotic concepts to investigate questions about the biological and evolutionary emergence of meaning, intentionality and a psychic 10 world which are difficult if not impossible to answer within a mechanicist or physicalist framework. 11 Biosemiotics sees the evolution of life and the evolution of semiotic systems as two aspects of the same process. The main tenet of biosemiotics is the belief that the scientific approach to the origin and evolution of life has given us highly valuable accounts of the external aspects of these processes, but has overlooked the "inner", qualitative and significant aspects of sign action, thus leading to a reduced picture of causality (cf. Santaella Braga 1999). Complex self-organized living systems are governed by formal and final causality (Van de Vijver, Salthe and Delpos 1998). They are mereologically 12 governed by formal causality in the sense of a non-temporal "downward causation" (Andersen et al. 2000) "from" a whole structure (such as the organism) "to" its individual molecules, constraining their action but also endowing them with functional meanings in relation to the whole metabolism. They are governed by final causality in the sense of their tendency to take habits (find new attractors in phase space) and to generate future interpretants of present sign actions. In this sense, biosemiotics draws upon insights of fields like systems theory, theoretical biology and the physics of complex self-organized systems. 13 Particular scientific fields like molecular biology, cognitive ethology, cognitive science, robotics and neurobiology deal with information processes at various levels and thus - in that minimal sense of biosemiosis as informational processes in living organisms - spontaneously contribute to knowledge about biosemiosis and living sign action, even though these findings are not framed within a theory of biosemiotics. However, biosemiotics is not so much a specific disciplinary research programme as a general scientific perspective on living systems. It attempts to integrate and re-interpret empirical findings, and to build a new foundation for biology that acknowledges the genetic "information talk" of codes, messenger molecules, translation etc., as being not simply problematic metaphors but a symptom of real organic semiosis, emphasizing that a notion of "information processing" can be provided that is theoretically richer than standard notions of information processing in contemporary cognitive science (cf. Lindblom and Ziemke, this volume). What is this alternative notion of information and representation? According to biosemiotics, the cognitive system is not a dual system of hardware (a physical brain) and software (an algorithmic symbolic system), and similarly the organism is not a dual system of "information and flesh", with a DNA code as governor of all bodily processes. Sign processes are active components of the overall physiology of the organism, and accordingly everything which may become "a difference that makes a difference" to the organism (to use Bateson's cybernetic definition of information; Bateson 1972) is acting as a sign. Just as the mind is not a computer program, DNA is not the central governor of the body, and genes are not preformed descriptions of phenotypic characters with a privileged causal status in every cell, as often assumed in standard genetics and molecular biology. It is possible to use Peircean semiotics to analyse sign processes not only in protein synthesis but in any significant process within and between cells.
9. Partly based on mathematical arguments, Scott (2004) convincingly shows that a reductionist notion of causality is not even defensible for many non-linear complex systems in physics.
10. Allow me to use "psychic" instead of "mental", as the former word has less Cartesian connotations than the latter and is rooted in a realist, neo-Aristotelian and Peircean philosophical tradition. Intentionality is used here not according to Brentano's dualist metaphysics but in a way inspired by Peirce to designate both the representational "aboutness"-aspect of animate sign processes, and their qualitative, psychic, phenomenal aspects.
11. This applies equally to non-reductive physicalism within philosophy of mind.
12. Mereology, the study of parts and wholes, usually refers to a mathematical or at least formal theory thereof, such as that of Lesniewski or Goodman (cf. Varzi 2004), developed by the former in the hope of forming an alternative to set theory as a foundation for mathematics.
13. See Hoffmeyer (1996), Emmeche, Kull and Stjernfelt (2002), Hulswitt (2002), Emmeche (2004a), El-Hani, Queiroz and Emmeche (2005). (For more on causality, see also section 5 below.)
Though not identical, the notions of sign action sensu Peirce and of information sensu Bateson are intersecting and not conflicting, if interpreted correctly: Both notions are generally denoting processes of production, transmission and interpretation of something of significance for some interpreting system (like a body, an organism, a cell), and the important aspects of both notions are relational and semiotic. If information is a difference that makes a difference (implied is "to an organism" or "to some interpreting system"), then by Peircean terminology this can be restated more precisely in a semiotic definition of information as a triadic process we may formulate as follows: Information is a process in which a sign (representamen), simple or composite, makes a difference (interpretant) to some system (the interpreter) by making the interpretant (the effect of the sign as a difference) stand in a similar relation to something else (the object of the sign) as that to which the representamen stands or refers, thus mediating object and
interpretant, i.e., conveying to the interpretant a dynamic form that signifies the object. 14 This definition can be exemplified on the level of both molecular biology and cognitive ethology: Fine-grained analysis of genes and protein synthesis as semiotic processes can be made (El-Hani, Queiroz and Emmeche 2005) as well as analyses of, for instance, the alarm system in vervet monkeys (Queiroz and Ribeiro 2002). Here one has to remember the very general character of the formal sign definition in Peirce, of which the proposed definition is a variant that emphasizes the processual character. That the meaning of the sign is its causal effects (i.e., the interpretant) implies that these effects have to be physically, biologically, or culturally embodied in an interpreting system (or in the dialogic process between such systems; see Petrilli 1999) to the extent that we are dealing with material systems of a physical, biological or cultural kind. In philosophical jargon, meaning is supervenient on the physics of the system, not reducible to physics, yet dependent upon a material basis. Furthermore, biosemiotics helps to resolve some remnants of Cartesian dualism that are still haunting philosophers and scientists. It provides an alternative to the implicit Cartesianist metaphysics in evolutionary biology that tends to reduce organisms to non-sentient mechanical beings, a reduction that makes it almost impossible to conceive of a satisfying evolutionary account of the qualitative aspects of consciousness (Emmeche 2004a). By subscribing to a process rather than substance metaphysics, and by describing the continuity between matter and mind, biosemiotics may also help to understand higher embodied forms of mind and the evolutionary roots of cultural phenomena.
14. Cf. one of Peirce's own definitions: "A Sign, or Representamen, is a First which stands in such a genuine triadic relation to a Second, called its Object, as to be capable of determining a Third, called its Interpretant, to assume the same triadic relation to its Object in which it stands itself to the same Object." ([1931-1958] 1902, CP 2.274), emphasis in the original. Peirce distinguished between several kinds of signs, objects, and interpretants, which may be used to develop further the informational variant of the sign definition.
In a historical account of "embodiment", Keller (2003) comments upon the tendency of science writers and journalists to articulate a quasi-religious awe when facing the structure of the genetic code or the human genome, seeing it as "the word becoming flesh", while also repeating, by such expressions, a dualist framework that has been operative since the foundation of modern science: "The body as a machine, as a closed system, to which spirit, mind and God, if they exist, were posited as external agents, dominated the western imagination until the twentieth century" (ibid., 255). Even though Keller locates this new "gospel of genetics" in a culture-historical trend towards more inclusive senses of embodiment (deriving from not only the process theology of Alfred North Whitehead, but also from feminist and ecological movements) and thus sees it as contributing to re-opening the closed system of the body as a machine, these very rhetorical figures of popular genetics tend to reinforce the dualisms between mind and body, information and flesh. Here, biosemiotics may function as one of the resources for critical study of mind/body dualism in Western science and religion. The biosemiotic account of embodiment is a form of organicism, an important tradition in biological thought. The resolution of the debate between vitalism and mechanicism in the history of biology looks on the surface like a victory for mechanicism, but was really a compromise, a sort of mainstream organicism, exemplified by the writings of such well-known biologists as J. Needham, P. Weiss, C. H. Waddington, J. Woodger, E. Mayr, R. C. Lewontin, R. Levins and S. J. Gould (see also Gilbert and Sarkar 2000), and functioning more or less tacitly as a background philosophy of biology (Emmeche 2001). Organicism takes the complexity and physical uniqueness of the organism as a symptom of the distinctiveness of biology as a natural science sui generis. As a compromise, although often framed within a naturalist evolutionary perspective, it was anticipated by Kant's more critical (non-naturalist) notion of a living organism as an organized product of nature in which every part is reciprocally purpose (end) and means (cf. Van de Vijver, Van Speybroeck and Vandevyvere 2003).
However, within mainstream organicism this teleology is interpreted as a more or less "mechanical" teleonomy being the result of the forces of blind variation and natural selection, plus eventually some additional "order for free" or physical self-organization (Kauffman 2000). Mainstream organicism is thus non-vitalist, ontologically non-reductionist (methodologically allowing for reductionist research strategies though) and emergentist. What is studied as emergent properties are common material structures and processes within several levels of living systems (developmental systems, evolution, self-organizing properties etc.), all of which are treated in the usual way as objects with no intrinsic experiential properties. In contrast, qualitative organicism represents a more colored view on living beings; it emphasizes not only the ontological reality of biological higher level properties or entities (such as systems of self-reproducing
organisms being parts of the species' historical lineage) but also the existence of phenomenological or qualitative aspects of at least some higher level properties. When sensing light or colors, an organism is not merely performing a detection of some external signals which then are processed internally (described in terms of neurochemistry or "information processing"); something additional is part of the story, at least if we want the full story, namely the organism's own experience of the light, and this experience is seen as something very real. 15 As a scientific position qualitative organicism is concerned with qualities which are not only of the famous category of "primary" qualities (roughly corresponding to the scientifically measurable quanta) including shape, magnitude and number; but also concerned with the "secondary" qualities of color, taste, sound, feeling, etc. Neither qualitative nor mainstream organicism are fully coherent stances, theories or paradigms; many authors cannot be consistently categorized as belonging to either one or the other position exclusively; the important thing is to recognize that in fact two different conceptions of life are at stake. To the extent that biosemiotics is based in a Peircean logic of signs of varying levels of complexity, it is a form of qualitative organicism. We leave it to others to form a taxonomy of the variants of qualitative organicism. Here we will concentrate on its biosemiotic brand instead. Biosemiotics recognizes not only the external effects of signs upon their interpreters, forming new interpretants (remember that the interpretant, being an effect of the sign, need not be the whole interpreter organism), but also the internalist and phenomenal aspects of this process. The phenomenology of basic biological ("vegetative") processes of life is not yet developed, and we shall primarily focus on the animal level of biosemiosis. 
A living being that is not simply an organism but also an animal is emphatically "animated"; that is, it realizes a specific form of movement which is not reducible to pure physical change or basic biologic processes (like metabolism). Movement, understood as living change and as semiotic, when realized by self-moving organisms (i.e., animals), crucially depends on mediating all the elements of quality, physical force and interpretation - three general ontological aspects of every semiotic process that correspond to the triadic structure of categories in Peirce. Single cells, plants and animals are all, so to speak, semiotic machines (the machine metaphor is not
15. This is the case in Jakob von Uexküll's biology (von Uexküll 1982) as well as in Peirce's synechism, i.e., his philosophy of mind/matter continuity (Santaella Braga 1999).
implying any mechanicism here, cf. Nöth 2003), nodes in complex webs of sign interpretation processes. Every process, 16 even in non-living nature, has an aspect of (i) "tone" (i.e., possibility, chance and phenomenal quality), (ii) "token" (existence, force, here-and-now), and (iii) "type" (generality, being governed by a general law). This is true for sign processes as well. The characteristic level of sign interpretation processes in plants is intra- and inter-cellular, with molecular signs representing the immediate environment and long-term morphogenetic changes (processes like metabolism, growth, differentiation and reproduction). A plant's body is the result of an organism's vegetative interpreting systems, and irritability is a crucial component of vegetative embodiment. 17 Animate embodiment is different. Although it includes the vegetative form of embodiment (thus an animal's skin may indeed get irritated, its growth may be disturbed, etc.), it is a form of embodiment where self-movement is a crucial new development of the general biological self-control one finds in plants and single cells. In self-movement, animal agency emerges as a temporary structure of bodily sign action processes. Animate agency is semiotic, meaning that it mediates the three dimensions of (i) a qualitative phenomenal feeling, i.e., the "inner" side of the sign considered not in relation to anything else (thus, a form of Firstness in Peirce); (ii) the here-and-nowness of signs or stimuli (the "haecceity" of a stimulus being an example of Secondness) that can have an environmental or bodily proprioceptive origin, but whose being is existent and thus simply consists in their immediate effects, action-reaction, their "actually acting and being acted on" (Peirce [1931-1958] 1902, CP 6.318); and (iii) the cybernetically organized cycles of organism-environment interaction, the "functional cycle" of Jakob von Uexküll's biology, describing how the environmental stimuli are integrated and interpreted in the Umwelt 18 of the organism - this integrative mediating process involves emotional and proprioceptive interpretation systems that mediate the past and present Umwelt into anticipatory schemes for further action and habit-formation (i.e. the mediative and future-directed category of Thirdness). Animal embodiment, accordingly, is the graceful integration, accomplished by a living animal body, of physical change, biologic-metabolic processes (neural as well as general physiological) and the emerging interpretative Umwelt relation (of a body within its specific perceived and acted upon environment) representing the animal's own detailed movements in an ever-changing flux of environmental and bodily stimuli.
16. Peirce was not satisfied with a standard type/token distinction which only operates with Thirdness (type, a form of generality) and Secondness (token, existence), so he added "tone" (quality) to represent Firstness. It is not possible in brief space to explain his three basic categories, so I can only offer my reader these hints (see also Hausman 1993; Hulswitt 2002; Peirce [1931-1958] 1902).
17. Irritability is used here in the sense of a general semiotic capacity to feel and interpret stimuli so as to adapt to changes in the environment or in other parts of the organism. Compare the term's original denotation, viz. the capacity of certain parts of the body to contract when stimulated, as introduced by the English physician Francis Glisson (circa 1597-1677) who saw it as a property of all the body's fibres independent of consciousness and the nervous system (cf. Lawrence 1981). It has played an important role in the debate between mechanicists and vitalists over the basic definition of life.
3.
Triadic representations vs. representationalism
It is important to note that a biosemiotic notion of animal embodied cognition is not committed to the same kind of functionalist representationalism often criticized by "new AI" or cognitive semantics, i.e., the notion that cognition and language consist of rule-governed manipulation of a distinct set of symbolic representations inside an organism referring to an outside world (cf. Riegler, Peschl and Stein 1999; Ziemke and Sharkey 2001). The representationalism of classical cognitive science, like the physical symbol system hypothesis of functionalism, has a dyadic notion of representation (tending to disregard the open-ended and processual character of interpretation) and an atomistic concept of symbol, in strong contrast to the triadic, general and evolutionary nature of sign processes and causality involved in a Peircean conception of representation. On the level of general semiotics, signs are forms of mediation (i.e., generals embodied as relational tokens) that are embedded within a network of local processes of ongoing generation of new interpretants; this applies to cellular "vegetal" signs as well as cognitive "animate" ones. Embodiment in this basic sense is a sine qua non for all sign action in nature and culture, and here, the term "embody" designates a general metaphysical aspect of semiosis: A sign must, so to speak, materially incarnate its meaning. 19 It is on this level of generality - we might call it semiotic embodiment - that one must answer the question whether embodiment is necessary or sufficient for meaning or mind: From the point of view of semiotics, meaning and mind are all-pervasive phenomena; meaning always has to be embodied in signs, but need not be confined to organic bodies. 20 Although Peirce's own semiotics is embedded within his broader views on the nature of reality, evolution and thought, contemporary Peircean semiotics and biosemiotics need not subscribe to the whole of Peirce's fascinating blend of pragmaticist philosophy of science, evolutionary cosmology and metaphysics, even though these inform our understanding of his notion of logic as tightly connected to semiotics. What is real is not merely what is existing, because the existing "here and now" of individual events (their Secondness) is always, on the one hand, embedded within the generalities of law, tendencies, habit formation and thought, that is, the general logic of things (the category of mediation, Thirdness), and on the other hand, influenced by genuine chance, random deviations from lawfulness, spontaneous generation of new possibilities (belonging to the category of Firstness). Thus, any existing sign, such as this word "hand", is only understandable as a sign of something to some interpreter, by being a token (Peirce called it a sinsign) of some general type of sign (a legisign), namely the kind of word just mentioned. As Santaella states, "The legisign depends on individual cases to be actualized. The symbolic legisign" - here, the word hand - "is embodied in individual cases. In this same act of embodiment the individual cases are conformed to the symbol's domain. The symbol functions as a rule for the formation of a certain subclass of sinsigns which are called replicas. The rules for the formation of the replicas also involve the interpretative rules of these replicas. Hence, the replica of a symbol is a special kind of index which acts to apply the general rule or habit of action or expectance associated to the symbol to a something particular" (Santaella 2003; cf. Short 1988). Symbolic representations, too, are embedded in ongoing processes, and thus we see that the notion of symbol in semiotics is very different from the notion of symbol within classical cognitive science. Regarding the specific human form of embodiment, the symbol attains a specific role accounting for the social nature of human consciousness.
18. The Umwelt is the species-specific subjective universe, or phenomenal world, of an organism; the part of the environment of a subject that it selects with its species-specific sense organs according to its organization and biological needs. Everything in the Umwelt is labeled with perceptual cues and effector cues of the subject. According to J. von Uexküll, every subject is the constructor of its Umwelt; cf. J. von Uexküll 1940 [1982], with a good introduction by his son Thure von Uexküll (1982). See also the volume edited by Kull (2001).
19. A physical precondition for signification, so to say. Sign action is mediation, and as such requires a material medium for this process (as well as energy, at least under normal thermodynamic conditions; it is well known that information processing has physical limits and costs energy, cf. Zurek 1989).
20. "Thought is not necessarily connected with a brain. It appears in the work of bees, of crystals, and throughout the purely physical world; and one can no more deny that it is really there, than that the colors, the shapes, etc., of objects are really there. Not only is thought in the organic world, but it develops there. But as there cannot be a General without Instances embodying it, so there cannot be thought without Signs. We must here give 'Sign' a very wide sense, no doubt, but not too wide a sense to come within our definition." (Peirce [1931-1958] 1902, CP 4.551).
4. The body within evolutionary emergent levels
As we have seen, the animate body as a self-moving organism transcends the simpler category of an organism realizing the vegetative functions of growth and reproduction. Already at this stage biosemiotics provides not only an alternative (non-physicalist) "naturalization" of semiotic processes in general, but also a framework for distinguishing between levels of embodiment corresponding to different degrees of complexity involved in sign interpretation processes. To clarify the structure of the present argument for a multiplicity of phenomena of embodiment, a simple schematic list of levels of embodiment may help. The full meaning of the scheme's concepts will be explained in the following. Here is the scheme (adapted from Emmeche 2002a):
1) The body of physics: Dissipative self-organizing structures. Thermo-teleology.
2) The body of biology: The organism as a vegetative, physiologic-homeostatic self-organizing structure. Bio-functionality and irritability.
3) The body of zoology: The animal as an autonomous, self-moving organism. Intentionality and consciousness.
4) (a) The body of anthropology: The human body as a signifying animal, incarnating a socio-cultural specific life-world. Desire, histrionics and conscience. (b) The body of sociology: A "cybody", i.e., a societal body dependent upon technology, embedded in a civilization. Cosmopolitics, hybridicity, posthumanity.
The scheme indicates ordering relations between some forms of embodiment. Their epistemic dimension is mentioned first, by organizing those forms according to different domains of science each constituting its own objects; their ontic dimension is implied by an underlying ontology of levels of organization in Nature. Four such levels are mentioned. The point is not the exact number of levels (these are contingent upon a historically relative state of science) but the fact that irreducible levels do exist. E.g., the animal as a phenomenon includes an organism, yet the animal form of embodiment transcends (in the sense explained above) the vegetative form of embodiment, that of a cell or a plant. No distinction is made between embeddedness and situatedness, but between their vegetative, animate and sociocultural forms: Organismic embodiment: The organism is situated in its ecological niche. Animate embodiment: The animal is additionally embedded in a complementary subjective Umwelt (see below). Anthropic embodiment: A sociocultural Umwelt constitutes a lifeworld enacting social roles and expectations (histrionics) as well as conscience. Thus, different notions of embodiment are partly constituted by different areas of science, yet from the point of view of "semiotic realism", science can indeed capture true aspects of "the joints of nature". Those joints are inclusive, in the sense that being a blackbird includes being a bird, which in turn includes being a vertebrate (and so forth), and more generally, being an animal includes being an organism, and this in turn includes being physical (cf. footnote 3).
The modes of human embodiment are also inclusive: The point of calling non-living entities "bodies" and talking about "the body of physics" is that we only get a more complete comprehension of even non-living matter when we realize that it can evolve into higher forms of organization whose properties transcend mere physical properties.21 Such properties are emergent, though still depending on physical processes.
21. This point is reminiscent of Marx' use of Hegel's philosophy. Cf. Marx' aphorism "The anatomy of man is a key to the anatomy of the ape" (from the Introduction to Grundrisse), which should be understood within the frame of a radically reconstructive historical approach as opposed to naively constructive chronological historiography (see Stahl 1975: 59-62).
Physical embodiment. Classical as well as modern physics deal with three kinds of objects: first, general forces in nature, particles, general bodies (matter in bulk), and the principles ("laws") governing their action; second, more specifically, the structural dynamics of self-organized bodies (galaxies, planets, solid matter clusters, etc.); third, physical aspects of machines (artefacts produced by human societies and thus only fully explainable with the additional use of social sciences, like history of technology). In the history of science one has often seen attempts to reduce all of physics to a formalism equivalent to some formal model of a machine, but there are strong arguments against the completeness of this program (Rosen 1991), i.e., mechanical aspects of the physical world are only in some respects analogous to a machine. Some of the general properties of bodies studied in physics have a teleonomic character (a kind of directedness or finality), in the scheme called "thermo-teleology", as this phenomenon of directedness is best known from the second law of thermodynamics (a directedness towards disorder), as well as from the opposing self-organizing tendencies in far-from-equilibrium dissipative systems. "Final causation" indeed is a pervasive aspect of purely physical processes (Hulswitt 2002).
Organismic embodiment. A biological notion of function22 is not yet present in physics, while it is crucial for all biology. Biofunctionality is not possible unless the living system is self-organizing in a specific way, based upon a memory of how to make components of the system that meet the requirement of a functional metabolism of a high specificity. For Earthly creatures this principle is instantiated as a code-plurality between a "digital" code of DNA, a dynamic regulatory code of RNA (and other factors as well, partly digital, partly analogic), and a dynamic mode of metabolism involving molecular recognition networks of proteins and other components.
This establishes a basic form of living embodiment, the single cell (as a simple organism) in its ecological niche, which presupposes the workings of "the physical body" as a thermodynamic system in nonequilibrium, yet transcends that form by its systematic "memory"23 of organism components and organism-environment relations. Biosemiotics posits that organismic embodiment24 is the first genuine form of embodiment in which a system becomes an autonomous agent "acting on its own behalf" (cf. Kauffman 2000), i.e., taking action to secure access to available resources necessary for continued living. It is often overlooked that the subject-object structure of this active agent is mediated not only energetically by a structured entropy difference between organism and environment, but also by signs of this difference: signs of food, signs of the niche, signs of where to be, what to eat, and how to trigger the right internal processes of production of organismic components at the right time. It is often forgotten that the active responsivity of the agent organism (based upon observable molecular signs) has, as an "inner" dimension, a quality of feeling,25 implied here by what is called irritability at the level of a single cell. Irritability is probably real, logically in accordance with a basic matter-mind evolutionary continuity, and rationally conceivable, though it is impossible for humans to sense or perceive it "from within" or to know empathetically "what it feels like", say, for an amoeba or an E. coli.
22. Or that for which a part works in order to sustain a living whole, cf. Emmeche (2002b).
23. This memory is often described as "genetic information" but it includes other forms of memory as well, i.e., stable inheritance systems located within the organism (but outside DNA), or within the ecological habitat of the organism (Jablonka 2001).
24. A concept very similar to the same term in Ziemke 2003, who bases his term "organismic embodiment" on J. von Uexküll and the notion of autopoiesis. However, Ziemke's "organismic embodiment of autopoietic, living systems" (2003: 1306) does not distinguish, as we do here, between the body of biology and the body of zoology and thus lacks a specific notion of animate embodiment, which is characterized not only by (vegetative) autopoiesis but also by (animate, proprioceptive, perception-motor-coupling based) movement.
25. The concept of "feeling" is not quite the same as "qualia" in contemporary philosophy of mind, but closer to its use in Peirce's phaneroscopy (e.g., [1931-1958] 1902, CP 1.306-311). "Inner" does not refer to a separation between brain states and bodily or environmental states (which is problematic, cf. Thompson and Varela 2001: 422) but to an ontological category of possibility and quality. As Thompson and Varela do not distinguish organismic from animate embodiment, the first of their three dimensions of embodiment, "organismic regulation" (including "sentience" or "the feeling of being alive", ibid., 424), is a mixture of feeling on this basic organismic level and "primal" or "core" consciousness at the level of animate embodiment.
Animate embodiment. As the reader might have guessed, the biosemiotic idea is that when we consider the realm of animal mind, the intentionality of an animal presupposes the simpler forms of feelings and irritability we stipulate in single cells (including the "primitive" free-living animals, such as protozoa, lacking, e.g., a nervous system), yet transcends these forms by the phenomenal qualities of the perceptual spaces that emerge in functional perception-action cycles as the animal's Umwelt. As already mentioned, proprioceptive semiosis is a crucial element of phenomenal as well as functional properties (cf. Sheets-Johnstone 1998). More generally (and less controversially), the animal body is a highly complex and specific kind of multicellular organism, a kind that builds upon the simpler systems of embodiment on the level of biology, such as physiological and embryogenetic regulation of the growth of specific organ systems, including the nervous system. These regulatory systems are semiotic in nature, and rely on several levels of coded communications within the body and their dynamic interpretations (for details, see Hoffmeyer 1996; Barbieri 2001; Markos 2002). The expression "the body of zoology" in the schematic list is used to emphasize both its distinctness as a level of embodiment and that zoology, instead of being simply part of an old-fashioned division of the sciences, should be the study of animated movement, including its phenomenal ("mental") qualities.
Anthropic embodiment. Two forms of human (anthropic) embodiment should be distinguished: (a) social embodiment, universally characteristic of humans as beings forming distinct language-dependent sociocultural groups; (b) societal embodiment, forming "cyborg bodies" (Emmeche 2004b), or cybodies. The latter is a special form of (a) in which a sociocultural body is embedded in a society with several social subsystems (e.g., work, education, politics, law, economy, love/reproduction/consumption, health care, art, science, media), i.e., a civilization, and where the body's animate mode of being is intrinsically connected to technology (e.g., medico-technology), dissolving sharp body/machine boundaries; see section 6 below. First let us concentrate on the anthropic body of a person as a societal being, that is, not simply social in the sense of being a social animal, but emphatically "human-social", being part of a society with division of labor, institutional subsystems, social roles, culture, etc.; as Aristotle said, a political animal.
It is beyond the scope of this chapter to develop the specific senses in which humans (i) are embedded within a society, (ii) embody distinct social systems with all their peculiarities, (iii) have their minds constituted by institutional habits, which they (iv) internalize during socialization. The "habitus" sensu Bourdieu (Jenkins 1992) of a human being in a social field is the implicit way his or her process of socialization has enacted a semiotic system of embodied habits, some of which have even physically formed the very animate body of the person; juxtaposing people from different times and classes (an attritioned third world coal miner, a North American teenager, or a 17th century French noblewoman) may serve as illustration. No doubt, other theoretical notions in sociology could express culturally specific forms of human embodiment, but Bourdieu's concept of habitus suffices here, as it represents a set of class-specific dispositions generating specific practices and patterns of perception; it is embodied in individuals, and at the same time, it is a collective and homogeneous phenomenon, mutually adjusted for and by a social group or class.26 Thus, on this genuinely anthroposemiotic level, biosemiotics as an approach does not suffice to capture the complexities and specificities of human embodiment. Animals have various needs and appetites, including sexual ones, whereas human desire, though developmentally presupposing appetites, cannot be reduced to them. Furthermore, consciousness in animals like apes and monkeys can exhibit quite complex forms of social cognition (Bekoff, Allen and Burghardt 2002), but without a linguistic system (for symbolic representation of abstract structures) and a societal consciousness (of different language games, social roles, institutions and normative systems) even animal social consciousness is not that form of societal consciousness we find among humans, together with self-consciousness in its more complex forms. One of those forms is conscience, internalized culture-specific schemata of right and wrong relations between self and other. Histrionics, normally a name for dramatic art, is used in the schematic list to signify the intimate relation of human embodiment and socio-cultural situatedness: "The naked animal" is not really human unless it realizes itself as a histrion, an actress on a social scene, a person mediating - as if by a social "mask"27 - a general status in the social field "in front of the mask" and a narratively structured unique self-consciousness "behind the mask" (cf. Welker 2000).
26. Using the concept of habitus should make the distinction between animate and anthropic forms of embodiment clear. However, one could ask (a) what is embodied in what; (b) what is not embodied; and (c) if "embodied" could simply be replaced by "physical", "situated" or "embedded". As for (a), sociocultural history is embodied in social institutions, and the norms and rules of an institution are again embodied in a person functioning within such an institution. Thus, a person in some role incarnates a part of sociocultural history. Sometimes, but far from always, forms of anthropic embodiment imprint their marks on the animate body of a person. Question (b) is like asking "what is not cultural?" when dealing with sociocultural systems. The answer depends on the purpose, focus and conceptual tools of any specific analysis (are we interested in the pervasiveness of cultural processes in a human lifeworld, or, e.g., in delimiting borders between a civil society and a state as distinct spheres of influence?). Regarding (c), similar remarks apply (cf. the schematic list). The distinct levels of organismic, animate, and sociocultural embodiment imply distinct forms of embedment and situatedness.
5. Causality and emergent levels of interpretation
The above levels of embodiment can be located within an evolutionary macro-history and further differentiated in detail. An attempt to characterize more precisely what notions of causality are necessary to account for the whole series of levels would lead into technicalities we will eschew here (see, e.g., El-Hani and Emmeche 2000). Instead, let us briefly describe the intuitions and demands that such a notion of causality should conform to. The idea from emergence theory (cf. the review by Pihlström 2002) is that once a new emergent level is formed, it not only implies new properties and other principles than those found on the previous level, but also constrains the possibilities of processes and actions on the level below. This "downward" influence (Andersen et al. 2000; Thompson and Varela 2001) may be seen as a formative, structuring cause, not to be mixed up with the efficient cause of classical physics. Thus, a biosemiotic perspective demands a "Non-standard Neo-Aristotelian Pluralist notion of Causation" (NNPC) that allows for dealing with a complex system of several levels. Within an NNPC, the classical notion of a temporal dyadic cause-effect relationship is just one element (often to be used for within-level explanations) of a more complete set of causes also including material causes (as answers to questions about composition), formal causes (corresponding to structural constraints of a higher level upon its components), and final causes (that may take a level-specific form, and thus may be interpreted either as the enactment of anticipatory purposeful behavior within a cybernetic system, or more generally (and Peircean) as a pervasive form of structural causation of which the physical laws of nature are just an example).28 For the notion of inclusive levels of embodiment this means, (I) for the physics-biology border, that for instance the metabolic regularities of organismic embodiment of a grass constrain the physics of protein action within its leaves; (II) for the biology-zoology border, that the cognitive action-reaction cycles of an animal's perception and movement constrain the formation of interpretants within its nervous system; and (III) for the zoology-anthropology border, that even societal structures like a market economy in specific ways may have constraining effects upon the embodiment of not just human freedom and action, but also the very animal existence of human beings.
27. Cf. the Latin persona, a mask used by an actor.
28. There is an unsolved question here about the relation between a Peircean conception of causality operating with efficient and formal causation and an even more pluralist notion such as the NNPC. See also Hulswitt 2002; Emmeche 2004a; Andersen et al. 2000.
Another important distinction is the one between, on the one hand, a general ontological principle of embodiment stating that general forms have to be incarnated or realized in particular instances and can hardly exist disembodied as platonic forms (the general type needs its individual tokens to exist), and on the other hand, system-specific embodiment, i.e., embodiment holding for particular phenomena of mind, consciousness, meaning and significance, as these phenomena are embodied in particular system types (therefore the general principle that the mind is materially embodied must be specified). We have already seen the first principle exemplified by what we called semiotic embodiment: A particular general sign, such as a symbol, needs particular instances to embody it. The second principle was illustrated by the explanation of the scheme of levels of embodiment. Seeing different forms of embodiment both in a semiotic perspective and as evolutionarily emergent means seeing them as involving emergent levels of interpretation and understanding. One could learn all physiological facts about, say, the effects of iron nails against the skin of a person, perhaps useful for learning surgery, yet without getting to the essence of some phenomenon. Also the biology of wound healing and the neurophysiology of pain experiences may not satisfy our inquiry. The question is, what is the phenomenon?
Closer to revealing fragments of that mystery may be developmental psychology, but at this stage we also have to ask questions about the specific sociocultural context - are we talking about skin piercing as a fashion of youth culture, or crucifixion as a tool for torturing political prisoners, or an alleged messiah?
6. The cyborg nature of human embodiment
From the reflections on system-specific forms of embodiment thus far it transpires that biosemiotics is not enough to account for the characteristics of human embodiment. Some of these, such as the capacity to handle language and language-dependent societal processes, have often been seen as disembodied, almost Platonic forms. But what may seem to be disembodied societal structures, like a state's constitution, a moral principle, or the very spoken language, are always specific historical products that indeed have to be realized in specific social fields and systems to be real. Thus the general metaphysical principle of embodiment holds also for the societal level. The human animal was from its beginning societal, language-based, technical and political, but not all these determinations were equally developed. Let us just briefly look at the technological dimension that only became fully realized in the industrial phases of civilization, when we discover what we can call the intrinsic cyborg nature of humans, as technoculturally embedded beings within a space of meanings that are not only symbolic, but argumentative and culturally empowered by different kinds of social systems. Often, in cyber-punk and science fiction literature, the cyborg is seen either as the idea of a "melding of human and machine" and a "new era of participant evolution", or as people modified mechanically to perform specific tasks or redesigned to work in an alien environment (Stableford 1999). By a strangely familiar kind of reification, the machine is seen as something alien to the human body, and the melding of body and machine is then perceived as threatening human dignity or the 'essence' of being a human subject.29 The anti-essentialist threat of the possibility, afforded by biotechnology, of a complete "hybridization" of a human organism and a machine has been used critically within feminist thinking (Haraway [1991]; Hayles 1999; Flanagan and Booth 2002) which in turn has inspired phenomenological approaches to technology (Ihde 2002). We will only use these sources to emphasize that the machine is not any "alien" entity. Though usually made of inorganic materials, any machine is a product of human work and its societal network of skills, ideas, thought, institutions, etc., and thus, we reify or blackbox the machine when we see it as a non-human entity.30 Marx was probably the first historian to make this point, which is also a point about embodiment. On the level of human embodiment, tools, technical artefacts, machines, all embody human co-operative division of labor, and of course, in our society, knowledge-intensive machines embody societal forms specific for a "post-industrial," "late capitalist," "risk" or "knowledge" society. Today the use of computers or "thinking machines" has become widespread, and we should be aware of the ease of blackboxing or reifying the semiotic processes involved in their interpretation. In current Western society, it is an aspect of the cyborg nature of human existence that machines both embody and hide complex social relations, and somehow make us forget the distributedness of processes like meaning and cognition, as we are led to commit fallacies of misplaced concreteness when we localize thought either in a single machine or a single brain. Our cyborg nature implies as a concept that technical artefacts, as one of the conditions of existence of the human form of life, embody human thought processes. This applies also to simple artefacts as part of the anthroposemiotic network of tools, as Peirce noticed:
A psychologist cuts out a lobe of my brain (nihil animale me alienum puto) and then, when I find I cannot express myself, he says, "You see your faculty of language was localized in that lobe." No doubt it was; and so, if he had filched my inkstand, I should not have been able to continue my discussion until I had got another. Yea, the very thoughts would not come to me. So my faculty of discussion is equally localized in my inkstand. It is localization in a sense in which a thing may be in two places at once. On the theory that the distinction between psychical and physical phenomena is the distinction between final and efficient causation, it is plain enough that the inkstand and the brain-lobe have the same general relation to the functions of the mind. (Peirce [1931-1958] 1902, CP 7.366; see also Nöth 2003; Skagestad 1999).
29. Thus, one tends to forget that "The body in a contemporary society is already a cyborg body (a partly scientific-technologically governed body) as we are dependent, from birth to death, on the blessings of present-day medical science" (Emmeche 2002a: 155; see also Clark 2003).
30. This does not mean that there are no crucial differences between organismic and robotic forms of embodiment, as the work of Tom Ziemke (2003, see also this volume) clearly shows.
When we have arrived at a phase of civilization in which the societal form of human embodiment31 within the health care system has made us all existentially connected to that system (and thus, to the system of technoscience that health care relies on), our cyborg nature takes on a double form. In a basic dimension, we continue to be that kind of animal that can only survive in its present form by being interconnected to a global technosocietal system. On a supervening dimension, historically specific for a modern risk society, this interconnected existence changes our very animal bodily appearance by the possibility of, e.g., prolonging life, exchanging body parts because of ageing and desires to be forever young, tendentially transforming us into the cyborgs of science fiction - what some have called "the posthuman condition".32 Due to the inclusivity of the forms of embodiment, the term posthuman is misleading though, as it is only now, when we come to consider ourselves to be beyond human nature, that our nature is revealed to us as at once human and cyborgian: To paraphrase Marx, the anatomy of the cyborg is the key to the anatomy of humans. We are already cyborgs, hooked up to our fellow beings by tools and technology, and in late modernity, also via the technoscientific health care system. Human cyborg embodiment can take many forms, not all equally desirable, and will continue to be a contested issue in the biopolitics of the present century.
31. Levels of embodiment (cf. the schematic list above) are partly dependent upon the epistemic interest and analytical perspective; this applies also to the above-mentioned (a) social and (b) societal forms of anthropic embodiment. The kind of "social embodiment" reviewed in Barsalou et al. (2003) corresponds to (a), although some of the research reviewed tends to assume that, e.g., facial and bodily responses (exemplifying social embodiment) are cross-cultural universals. The theory of Barsalou (cf. Lindblom and Ziemke this volume), especially the notion of simulators, is interesting (in spite of its metaphorical load of the brain as a computer) but raises this question: To what extent are simulators (e.g., "the simulator for the social category face", ibid., 67) specific for social embodiment in humans? Could the theory also apply to, say, social embodiment in a pack of wolves? And if so, in which sense is it a theory specifically explaining social embodiment in humans, in contrast to more general animal-based cognitive capacities?
32. Most famous in the posthuman discourse is Fukuyama (2002); see also Hayles (1999).
References
Andersen, Peter B., Niels O. Finnemann, Peder V. Christiansen and Claus Emmeche (eds.)
2000 Downward Causation. Minds, Bodies and Matter. Aarhus: Aarhus University Press.
Barbieri, Marcello
2001 The Organic Codes. Cambridge: Cambridge University Press.
Barsalou, Lawrence W., Paula M. Niedenthal, Aron K. Barbey and Jennifer A. Ruppert
2003 Social embodiment. In: Brian H. Ross (ed.), The Psychology of Learning and Motivation 43: 43-92. San Diego, CA: Academic Press.
Bekoff, Marc, Colin Allen and Gordon M. Burghardt
2002 The Cognitive Animal. Empirical and Theoretical Perspectives on Animal Cognition. Cambridge, Mass.: MIT Press.
Brooks, Rodney A.
2002 Flesh and Machines. New York: Vintage Books.
Clark, Andy
1999 An embodied cognitive science? Trends in Cognitive Sciences 3 (9): 345-351.
2003 Natural-Born Cyborgs. Oxford: Oxford University Press.
Csordas, Thomas J.
1996 Body. In: Adam Kuper and Jessica Kuper (eds.), The Social Science Encyclopedia. 2nd ed., 55-56. London: Routledge.
El-Hani, Charbel Niño and Claus Emmeche
2000 On some theoretical grounds for an organism-centered biology: Property emergence, supervenience and downward causation. Theory in Biosciences 119 (3/4): 234-275.
El-Hani, Charbel Niño, João Queiroz and Claus Emmeche
2005 A semiotic analysis of the genetic information system. (unpublished manus., in prep.)
Emmeche, Claus
2001 Does a robot have an Umwelt? Reflections on the qualitative biosemiotics of Jakob von Uexküll. Semiotica 134 (1/4): 653-693.
2002a Kroppens kaput som organisme. In: Gert Balling (ed.), Homo sapiens 2.0. Når teknologien kryber ind under huden, 121-156, 238-239. København: Gads Forlag.
2002b The chicken and the Orphean egg: On the function of meaning and the meaning of function. Sign Systems Studies 30 (1): 15-32.
2004a Causal processes, semiosis and consciousness. In: Johanna Seibt (ed.), Process Theories: Crossdisciplinary Studies in Dynamic Categories, 313-336. Dordrecht: Kluwer.
2004b Alife, organism and body: The semiotics of emergent levels. In: Mark Bedau, Phil Husbands, Tim Hutton, Sanjev Kumar and Hideaki Suzuki (eds.), Workshop and Tutorial Proceedings. Ninth International Conference on the Simulation and Synthesis of Living Systems (Alife IX), 117-124. Boston, Massachusetts.
Emmeche, Claus, Simo Køppe and Frederik Stjernfelt
1997 Explaining emergence - towards an ontology of levels. Journal for General Philosophy of Science 28: 83-119.
Emmeche, Claus, Kalevi Kull and Frederik Stjernfelt
2002 Reading Hoffmeyer, Rethinking Biology. (Tartu Semiotics Library 3). Tartu: Tartu University Press.
Flanagan, Mary and Austin Booth (eds.)
2002 reload: rethinking woman + cyberculture. Cambridge, Mass.: MIT Press.
Fukuyama, Francis
2002 Our Posthuman Future: Consequences of the Biotechnology Revolution. New York: Farrar, Straus and Giroux.
Gilbert, Scott F. and Sahotra Sarkar
2000 Embracing complexity: Organicism for the 21st Century. Developmental Dynamics 219: 1-9.
Haraway, Donna
1985 A cyborg manifesto: science, technology and socialist-feminism in the late twentieth century. In: Donna Haraway (ed.), Simians, Cyborgs and Women: The Reinvention of Nature, 149-181. New York: Routledge. Reprint 1991.
Hausman, Carl R.
1993 Charles S. Peirce's Evolutionary Philosophy. Cambridge: Cambridge University Press.
Hayles, N. Katherine
1999 How We Became Posthuman. Chicago: The University of Chicago Press.
Hoffmeyer, Jesper
1996 Signs of Meaning in the Universe. Bloomington: Indiana University Press.
Hulswitt, Menno
2002 From Causation to Cause. A Peircean Perspective. Dordrecht: Kluwer.
Ihde, Don
2002 Bodies in Technology. Minneapolis: University of Minnesota Press.
Jablonka, Eva
2001 The systems of inheritance. In: Susan Oyama, Paul E. Griffiths and Russell D. Gray (eds.), Cycles of Contingency, 99-116. Cambridge, MA: MIT Press.
Jenkins, Richard
1992 Pierre Bourdieu. New York: Routledge.
Kauffman, Stuart
2000 Investigations. Oxford: Oxford University Press.
Keller, Catherine
2003 Embodiment. In: J. Wentzel Vrede van Huyssteen (ed.), Encyclopedia of Science and Religion. Vol. I, 254-256. New York: Macmillan Reference.
Kull, Kalevi (ed.)
2001 Jakob von Uexküll: A paradigm for biology and semiotics. Berlin and New York: Mouton de Gruyter (= Semiotica 134 (1/4): 1-828).
Kull, Kalevi
1999 Biosemiotics in the twentieth century: A view from biology. Semiotica 127 (1/4): 385-414.
Lakoff, George and Mark Johnson
1999 Philosophy in the Flesh. The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Lawrence, Christopher J.
1981 Irritability/sensibility. In: W.F. Bynum, E.J. Browne and Roy Porter (eds.), Macmillan Dictionary of the History of Science, 214. London: Macmillan Press.
Markoš, Anton
2002 Readers of the Book of Life. Oxford: Oxford University Press.
Marx, Karl
1867 The Capital. Vol. I. London: Penguin Classics. Reprint 1990.
Mayr, Ernst
1982 The Growth of Biological Thought. Harvard, Cambridge: Belknap.
Nöth, Winfried
2003 Semiotic machines. S.E.E.D. Journal (Semiotics, Evolution, Energy and Development) 3 (3): 81-99.
Peirce, Charles Sanders
1902 Collected Papers of Charles Sanders Peirce. Vols. 1-6, Charles Hartshorne and Paul Weiss (eds.); Vols. 7-8, Arthur W. Burks (ed.). Cambridge, Mass.: Harvard University Press. [References: CP, followed by vol. and paragraph number]. Published 1931-1958.
Petrilli, Susan
1999 About and beyond Peirce. Semiotica 124 (3/4): 299-376.
Pihlström, Sami
2002 The re-emergence of the emergence debate. Principia 6 (1): 133-181. [Universidade Federal de Santa Catarina, Florianópolis, Brasil]
Queiroz, João and Sidarta Ribeiro
2002 The biological substrate of icons, indexes and symbols in animal communication: a neurosemiotic analysis of Vervet monkey alarm calls. In: Michael Shapiro (ed.), The Peirce Seminar Papers: Essays in Semiotic Analysis, Vol. 5, 69-78. New York: Berghahn Books.
Riegler, Alexander, Markus Peschl and Astrid von Stein (eds.)
1999 Understanding Representations in Cognitive Science. Does representation need reality? Dordrecht: Kluwer.
Rosen, Robert
1991 Life Itself. A Comprehensive Inquiry Into the Nature, Origin and Fabrication of Life. New York: Columbia University Press.
Salthe, Stanley N.
1989 Self-organization of/in Hierarchically Structured Systems. Systems Research 6 (3): 199-208.
Santaella Braga, Lucia
1999 A new causality for understanding the living. Semiotica 127 (1/4): 497-519.
Santaella, Lucia
2003 What is a symbol. S.E.E.D. Journal (Semiotics, Evolution, Energy and Development) 3 (3): 54-60.
Scott, Alwyn
2004 Reductionism revisited. Journal of Consciousness Studies 11 (2): 51-68.
Sheets-Johnstone, Maxine
1998 Consciousness: A natural history. Journal of Consciousness Studies 5 (3): 260-294.
Short, Thomas L.
1982 Life among the legisigns. Transactions of the Charles S. Peirce Society 18 (4): 285-310.
1988 The growth of symbols. Cruzeiro Semiótico 8 (Janeiro): 81-87. (Porto, Portugal).
Skagestad, Peter
1999 Peirce's inkstand as an external embodiment of mind. Transactions of the Charles S. Peirce Society 35: 551-561.
Stableford, Brian
1999 Cyborg. In: John Clute and Peter Nicholls (eds.), The Encyclopedia of Science Fiction, 290-291. London: Orbit.
Stahl, Gerry
1975 Marxian Hermeneutics and Heideggerian Social Theory: Interpreting and Transforming our World. Ph.D. thesis, Northwestern University, Evanston, Illinois. (also available at http://www.cs.colorado.edu/~gerry/publications/dissertations/philosophy)
Thompson, Evan and Francisco J. Varela
2001 Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences 5 (10): 418-425.
Van de Vijver, Gertrudis, Stanley Salthe and Manuela Delpos (eds.)
1998 Evolutionary Systems: Biological and Epistemological Perspectives on Selection and Self-Organization. Dordrecht: Kluwer.
Van de Vijver, Gertrudis, Linda Van Speybroeck and Windy Vandevyvere
2003 Reflecting on complexity of biological systems: Kant and beyond? Acta Biotheoretica 51 (2): 101-140.
Varela, Francisco J., Evan Thompson and Eleanor Rosch
1991 The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.
Varzi, Achille
2004 "Mereology", The Stanford Encyclopedia of Philosophy (Fall 2004 Edition). In: Edward N. Zalta (ed.), URL = http://plato.stanford.edu/archives/fall2004/entries/mereology/
von Uexküll, Jakob
1940 Bedeutungslehre. (Bios 10. Johann Ambrosius Barth, Leipzig), [English translation: The theory of meaning, Semiotica 42 (1): 25-82, 1982].
von Uexküll, Thure
1982 Introduction: Meaning and science in Jakob von Uexküll's concept of biology. Semiotica 42 (1): 1-24.
Welker, Michael
2000 Is the autonomous person of European modernity a sustainable model of human personhood? In: Niels Henrik Gregersen, Willem B. Drees and Ulf Görman (eds.), The Human Person in Science and Theology, 95-114. Edinburgh: T&T Clark.
Ziemke, Tom
2003 What's that thing called embodiment? In: Richard Alterman and David Kirsh (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society, 1305-1310. Mahwah, NJ: Lawrence Erlbaum.
Ziemke, Tom and Noel E. Sharkey
2001 A stroll through the worlds of robots and animals: Applying Jakob von Uexküll's theory of meaning to adaptive robots and artificial life. Semiotica 134 (1/4): 701-746.
Zurek, Wojciech H.
1989 Thermodynamic cost of computation, algorithmic complexity and information metric. Nature 341: 119-124.
Embodiment and self-organization of human categories: A case study for speech
Luc Steels and Bart de Boer
Abstract

The paper considers explanations for the kinds of categories that have been found to be involved in human behavior. It insists that embodiment not only plays an important role in shaping these categories, but also that the collective dynamics generated by social interaction is of equal importance. A case study is developed for speech sounds. We show through computer simulations how a group of autonomous agents equipped with a (sufficiently) realistic perceptual and auditory apparatus can arrive at a shared repertoire of vowels and that these vowel systems exhibit the same universal trends that are found in human vowel systems. It is significant that this happens without innate a priori categories.

Keywords: Embodiment, categorization, self-organization, speech.
1. Introduction
The term "embodied" is being adopted by a growing number of researchers interested in cognition (Ziemke 2003), but often with different meanings. At least three senses of embodiment are in use. The first emphasizes that cognitive processes, such as those required for vision or language, are to be implemented in terms of networks that have neural plausibility, both in their computational microstructure and in the general architecture of the brain. While there is of course a wide consensus that cognitive processes must somehow be mapped onto brain hardware, the "strong embodiment" hypothesis claims more, namely, that without this neural realism many cognitive phenomena cannot be understood. This line of reasoning finds its most enthusiastic defenders among philosophers like Searle and those who argue that the brain should not be viewed as an information processor at all (e.g. Edelman 1990).

The second type of embodiment research first manifested itself in "behavior-based AI" (Steels and Brooks 1995). It argues that part of our interaction with the world is not explicitly controlled by any kind of information processing. Instead, the physical properties of the body, the physical characteristics of sensors and actuators, the structure of the task, and the physical properties of the environment all play an important role in shaping actual behavior. This line of research has proven to be very fruitful for designing and building autonomous robots that interact with real-world environments in real time (Pfeifer and Iida 2004). A neat example is the robot Stumpy, which exhibits locomotion without any kind of explicit control or sensing (Iida, Dravid and Paul 2002).

The third sense of embodiment refers to the grounding and expression of human concepts. It argues that many concepts, even abstract mathematical concepts, find their ultimate foundation in grounded interaction with the world (Lakoff 2000) and that this is still visible in the metaphors and expressions used to communicate these concepts. For example, the temporal preposition "back" (as in "back in time") is historically derived from a body part (the back), which first came to be used to indicate a spatial relation or area (as in "the back of the car") and then a temporal relation (see also Núñez 1999). It follows from this position that conceptualization and concept acquisition must be embedded in a developmental history of embodied interaction with the world.

This paper explores the second view on embodiment, as manifested in "embodied AI", and tries to demonstrate what kind of explanatory power can be drawn from it.
We take the domain of speech sounds as a case study because we believe that without studying concrete cognitive tasks and showing what embodiment can bring, the discussion may become too divorced from reality. Section 2 contrasts classical theories of speech that do not take embodiment into account with theories grounded in the function of speech that do take embodiment into account. Section 3 presents theoretical considerations about what constitutes an explanation in cognitive science as well as an overview of the phonological phenomena under consideration. Section 4 presents a case study in which a computer model is used to investigate the role of embodiment and situatedness in explaining universals of speech sounds. Section 5 presents a summary and the conclusion.
2. Embodied versus disembodied speech research
The domain of speech is known to be extraordinarily difficult to understand and model because it involves not only complex fine-grained motor control of the vocal articulators for producing speech sounds, but also complex acoustic signal processing and categorization for recognizing speech sounds. Therefore, there is at least the potential for a strong role of embodiment. At the same time speech also seems to imply abstract categories (in the sense that some distinctions are relevant in a language and others are not) and abstract rules (such as "a voiced consonant becomes unvoiced at the end of a word", a rule which holds for Dutch but not for English). There is a long tradition in linguistics, shaped by the Prague structuralists, with Roman Jakobson (Jakobson and Halle 1956) as its most prominent member, and by Chomsky and Halle's influential work on abstract phonology (Chomsky and Halle 1968), that completely ignores the embodied aspects of speech. Instead, these linguists move immediately to an abstract level in terms of distinctive features (such as +/- voiced, +/- labial, +/- aspirated, etc.). These distinctive features are assumed to be grounded both in articulation and in acoustic recognition and categorization. For example, voiced means that the vocal cords are vibrating while a sound is being produced, as in the final consonant of "bad" (voiced) versus "bat" (voiceless). Yet it is never specified in detail how exactly this grounding is realized. Abstract phonologists simply use the abstract features to formulate phonological constraints (e.g. which combinations of consonants and vowels are allowed in a language). But it turns out that it is very hard to recognize these features in the acoustic data and that they are not sufficiently precise to realistically drive motor control.
Moreover, this branch of phonology often assumes that the features are genetically predetermined as part of the innate language acquisition device (LAD) and argues that systematicity is genetically encoded in terms of a set of principles with parameters to be set by cues from the environment (Dresher 1992).

Abstract phonology is therefore a prime example of disembodied cognitive science, and it contrasts with work by another group of phoneticians and phonologists who employ what can be considered an embodied approach to speech (see e.g. Lindblom and Maddieson 1988; Browman and Goldstein 1992). They start from the actual acoustic signals and from sophisticated articulatory models, taking constraints coming from the physics of the world and embodiment into account. For example, many coarticulation phenomena are no longer assumed to be governed by rules but simply follow from the dynamics of motor control. Grounding now plays a role in explaining the acquisition of speech sounds or why certain sounds or sound combinations occur (MacNeilage 1998). Perception is always coupled tightly to production (Plaut and Kello 1999). The research discussed in this paper fits within this embodied view of speech.
3. Explanatory theories
Before delving into the concrete case study presented in this paper, it is worthwhile to make some general methodological remarks about the nature of theories in cognitive science. It is well known that categorization is a central topic in cognition; indeed we can say that a behavior becomes cognitive only when some form of categorization is implied. Traditionally, linguistics, psychology and anthropology have collected a large number of facts about human categorization. For example, Berlin and Kay (1969) collected data about the kinds of color categories found in human languages by doing naming and memory experiments, as first pioneered by Lenneberg and Roberts (1956). Subjects are shown a series of color samples (the Munsell chips) and are asked to show the best representative for a specific color like "red". Berlin and Kay found that there are some surprising universal trends in focal colors, and their work has generated a wide body of literature that seeks to identify additional evidence which confirms (Kay and Regier 2003) or refutes (Davidoff 2001) these basic human color categories.

Another domain that has been studied extensively is space. Here again research has shown that there are universal tendencies in spatial categories. For example, IN, ON, LEFT OF, etc. are very common categories in human spatial reasoning and language (Herskovits 1986), and so it has been argued that they are largely innate. On the other hand, comparative studies of the expression of spatial categories have shown significant intercultural differences. Also, children only gradually acquire spatial categories under the influence of their native language (Bowerman and Choi 2001).

Similarly, the categories (and hence the articulatory gestures) used in the sound systems of human languages show remarkable regularities. Languages do not use random subsets of the many possible speech sounds; rather, certain sounds occur more often than others. Although humans are able to produce and distinguish an amazing number of different speech sounds, the average number of distinctive speech sounds that is used in the languages of the world lies between 20 and 37 (Maddieson 1984). Some sounds, such as [a], [m] or [k], occur almost universally, while others occur only rarely. For example, English does not use the rather rare vowel [y] (pronounced as in French "rue") whereas French and Dutch do. Phoneme inventories tend to exhibit symmetries as well. If, for example, a repertoire contains [o] it is very likely to also contain [e]. Such symmetries are also found in consonant systems (Lindblom and Maddieson 1988). Languages furthermore constrain the set of possible sound combinations. For example, in English [mb] can occur at the end of a word as in "lamb" but not at the beginning, whereas in some African languages this is possible (as in Swahili "mbali" (far)). So sounds fall into classes and the classes form a combinatorial system. Again there appear to be universal tendencies in these sound combinations, although they have been less adequately surveyed.

Regularities like these (and many others can be demonstrated, including universal tendencies in grammar) demand an explanation. Linguists and psychologists in the disembodied tradition often propose that these universal tendencies are based on innate mechanisms (e.g. the distinctive phonological features and their markedness are assumed to be part of the innate language faculty), and so they are said to be universally shared by all human beings, the same way each of us (normally) has five fingers on each hand.
But that means that these theories simply pass the buck to evolutionary biologists. However, biologists are often not so keen on ascribing complex human behavior to innate features, partly because it seems rather implausible that the micro circuitry of the brain is under detailed genetic control (Edelman 1990), and partly because we would still need an evolutionary scenario to explain, for instance, why [y] (as in French "rue") is not common and [a] (as in English "bar") is, or why a language with four color terms usually has terms for either green or yellow. So simply saying that human categories are (to a large extent) innate is not a sufficient explanation; we also need to show which neural structures can perform the categorization, how these neural structures could be genetically determined in development and how the relevant genes could have become universally present in human populations. All of these are non-trivial tasks that remain to be carried out.

Another difficulty for the genetic position is that, without exception, human categorization is not absolutely universal across all human populations and cultures. Rather, it shows significant variation. Although this variation is supposedly captured in parameters set during development, it is unclear how such a parameterized system could evolve under realistic selection pressures based on language communication.

If genetics is not very satisfactory as a source of explanations, we should perhaps investigate other ways in which certain categories like colors, spatial categories or speech sounds can become shared (universally or locally). Three possible determining factors have been discussed:

(1) Human embodiment may be so constraining that it is not necessary to make the categories innate, because other categories could never be perceived or the motor movements could never be made. For example, the human vision system restricts perception to a certain range of the spectrum, the human vocal tract is limited to a certain number of sounds, the human body suggests a division into front and back, etc. This enables and limits the set of possible solutions for a specific task.

(2) Human beings interact socially and this creates a collective dynamics that may also constrain individual choices, pushing them in the direction of a shared system. For example, in Britain cars drive on the left side of the road and in continental Europe they drive on the right side. This is not because British drivers have "left driving genes" so that they are born to drive on the left, or because their embodiment makes driving on the right impossible, but rather because a cultural choice was made and this choice is enforced by the subsequent behavior of all involved. Even if somebody would like to drive on the right in England, they would be quickly sanctioned or might even lose their life.

(3) It could be that the regularities in the environment are such that a statistical learning process picks them out easily, and so there would be no need to encode the category in the genome. For example, the chromatic distribution of colors in the world might show enough statistical regularity that statistical clustering algorithms could detect them.
In the work being carried out by our research group, we have been exploring explanations for human categorization in which these three factors play a role. A good example is our research on color categories, in which we have attempted to explain the universal tendencies in color categories using embodiment, the collective dynamics generated by social interaction, and the statistical structures inherent in the world (Steels and Belpaeme 2005). In the domain of language in general and speech sounds in particular, the statistical regularity is not present in the world independently of the agents. So only factors (1) and (2) are relevant, particularly when a totally new system is being bootstrapped. In the case study that follows, originally described in de Boer (1999, 2000), embodiment and the dynamics of language use in a population are investigated as factors explaining regularities in systems of human speech sounds. As vowels are the simplest speech sounds to model, this paper focuses on vowels, but there also exists work on syllables and syllable structures (Steels and Oudeyer 2000; Redford, Chen and Miikkulainen 2001).

Our main point is that embodiment in itself is not sufficient to explain many universal tendencies or culturally accepted norms. Earlier embodiment approaches have shown that embodiment constrains the possibilities, e.g. that there are limitations of actuators and sensors which prevent or discourage humans from using certain sounds. In addition, these approaches demonstrate that typical speech categories are in some sense optimal (see e.g. Lindblom, MacNeilage and Studdert-Kennedy 1984). Specifically, the sounds in a language are claimed to have a particular distribution and structure that is optimized for the task of reliably producing and perceiving sounds, and hence the sounds are maximally contrastive.
If that is the case, then embodiment and the constraints of the task are sufficient to explain why human languages show particular sounds and not others. Nonetheless, observations of real phonetic data show that suboptimal phonetic systems also exist in many languages. For example, the UPSID database of human sound systems lists two examples with suboptimal 3-vowel systems, namely with the sounds [e] [a] [o] instead of the more common optimal [i] [a] [u]. They come from a North-American Indian language, Alabaman (Rand 1968), and a Peruvian language, Amuesha (Fast 1953). These 3-vowel systems are suboptimal because they do not provide the maximal contrast in the acoustic space. Therefore embodiment and the constraints of the task are in themselves not sufficient to explain the vowel systems one finds in human languages. One also has to take the collective dynamics of language use in a population into account. All users have to conform to the system of speech sounds that has historically been chosen in order to achieve communicative success. This system is partly arbitrary and may be suboptimal.
4. Experiments in self-organization of speech sounds
To examine whether the structure of vowel systems is sufficiently determined by self-organization in a population under constraints of embodiment, we need to find a mechanism that incorporates these two factors, and then show that this indeed leads to emergent vowel systems that reflect the universal tendencies observed in human vowel systems. In our case, we use a population of artificial agents that can produce, perceive and remember speech sounds in a human-like way. Each agent is equipped with an articulatory synthesizer, a model of human perception for calculating the distances between different signals, and an associative memory for storing vowel prototypes. Also, each agent can interact with other agents (following a fixed pattern) by imitating them. These interactions are called imitation games, and are in the same line as the "language games" we have used in our work for the past decade (see Steels 2003). The agents can update their vowel repertoires depending on the outcome of the interactions in such a way that the expectation of future imitation success is maximized. The agents' goal in life is to imitate the other agents as well as possible with a repertoire of vowels that is as large as possible. However in doing this, the agents only use local information, and do not carry out any explicit optimization. The behavior of the agents is summarized in table 1 at the end of section 4.1.
4.1. Embodiment constraints on perception and production
The repertoire of speech signals that humans can produce is limited by the physics and the physiology of the vocal tract (figure 1). The repertoire of sounds that they can distinguish is limited by the properties of the auditory apparatus. In this example the focus is on vowels, and therefore only the articulatory and perceptual properties of vowels are integrated. These properties are also better understood than those of consonants. This has to do with the fact that vowels are static signals, while many consonants are dynamic, i.e. their articulation and their signal do not remain constant over time.
Figure 1. Structure of the human vocal tract. Speech sounds are produced by very fine-grained movements of the articulators. [Labeled parts in the figure include the alveolum (teeth ridge), teeth, lips, tongue tip and blade, and glottis.]
The acoustic signal of a vowel can be described relatively straightforwardly by the resonances of the vocal tract. These resonances cause prominent peaks in the frequency spectrum of a vowel signal. The frequencies at which these peaks occur are called the formant frequencies. This is illustrated for the vowel [a] (as in "father") by the black line in figure 2. For different vowels, these peaks occur in different places. Vowels can therefore be uniquely characterized by their formant frequencies. In practice, only the three or four resonances at the lowest frequencies are relevant.

In the simulations discussed here, no acoustic signals are generated, although it would be straightforward to add the extra complexity. Vowels are represented by their first four formant frequencies. If an agent wants to generate a given vowel, it synthesizes these formants using an articulatory synthesizer. This synthesizer takes as inputs the three major vowel parameters: tongue position, tongue height and lip rounding (Ladefoged and Maddieson 1996, Ch. 9). The outputs of the synthesizer are the first four formant frequencies of the corresponding vowel. The inputs are modeled as real values in the range 0 to 1. For tongue position, 0 means most to the front and 1 means most to the back. For tongue height, 0 means lowest and 1 means highest. For lip rounding, 0 means least rounded and 1 means most rounded. Thus (0, 0, 0) corresponds to the vowel [a] and to the formant frequencies (708, 1517, 2427, 3678) Hertz. The parameters (1, 1, 1) correspond to the vowel [u] and to the formant frequencies (276, 740, 2177, 3506) Hertz. Formant values were taken from Vallée (1994). This synthesizer is able to generate all possible basic vowels.
Figure 2. The first four formants of the vowel [a] (black line), and its perception in terms of the effective second formant (gray line).
Humans also perceive vowels based on their formant frequencies. These can be used to calculate a perceptual distance between vowels. Unfortunately, the distance calculation is not as simple as calculating a Euclidean distance between two formant vectors. As the bandwidth of the sound receptors at higher frequencies is greater than that of those at lower frequencies, blurring of spectral detail takes place at higher frequencies. This means that the formant peaks at higher frequencies can generally be compressed into one broader peak. The center frequency of this peak is called the effective second formant. The first peak in the spectrum is usually perceived as remaining in the same place, and is still referred to as the first formant. The perception in terms of the first and effective second formant of the signal [a] is shown as the gray line in figure 2. Note that this is an idealized view to illustrate the process.

While there are several ways to calculate the effective second formant, the one adopted in the research described here was developed by Schwartz et al. (1997b). It is a non-linear weighted average of the 2nd, 3rd and 4th formant frequencies. In order to calculate distances between vowel signals, the first and effective second formant are expressed in the Bark frequency scale. The Bark scale is a perceptually inspired frequency scale that can be considered to be logarithmic for the frequencies that are relevant to formants. Equal frequency differences in Bark are perceived as equal intervals, in contrast to the way frequencies in Hertz are perceived.
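The conversion from Hertz to Bark can be illustrated with a short sketch. The text does not say which conversion formula was used in the simulations, so the approximation below (the one published by Traunmüller in 1990) is an assumption made purely for illustration, and the function name `hertz_to_bark` is hypothetical. The two input values are the first formants of [a] and [u] quoted earlier in this section.

```python
def hertz_to_bark(frequency_hz: float) -> float:
    """Convert a frequency in Hertz to the Bark scale.

    Uses Traunmueller's approximation; this choice is an assumption,
    the chapter does not state which formula the model uses.
    """
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


# First formants of [a] and [u] as given in the text (in Hertz).
f1_a_hz = 708.0
f1_u_hz = 276.0

f1_a_bark = hertz_to_bark(f1_a_hz)
f1_u_bark = hertz_to_bark(f1_u_hz)
print(f"F1 of [a]: {f1_a_bark:.2f} Bark, F1 of [u]: {f1_u_bark:.2f} Bark")
```

The scale compresses high frequencies: a fixed step in Hertz corresponds to a smaller step in Bark the higher the starting frequency, which is the behavior the paragraph above describes.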
Figure 3. Emergence of a vowel system in a population of twenty agents. [Four panels show the vowel prototypes after 20, 500, 2000 and 10000 games, plotted against the formant frequencies in Bark.]
Using the first and effective second formant frequencies in Bark, the distance between two vowels can be calculated as an ordinary weighted Euclidean distance. As differences in the effective second formant are perceptually less important than differences in the first formant, the effective second formant is multiplied by 0.3 when calculating distances. These distances are then used to determine which vowel in an agent's repertoire is recognized. This is done by calculating the distance between a perceived signal and all the vowels in an agent's repertoire. The vowel that is closest to the perceived signal is considered to be the one that the agent heard.
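The distance calculation and the recognition step described above can be sketched as follows. This is a minimal illustration and not the original implementation: the names `perceptual_distance` and `recognize` are hypothetical, the example prototypes are made-up Bark values, and the weight of 0.3 is assumed here to apply to the squared difference of the effective second formant (the text could also be read as scaling the formant itself before the difference is taken).

```python
import math
from typing import Dict, Tuple

# A vowel signal as (first formant, effective second formant), both in Bark.
Signal = Tuple[float, float]

F2_WEIGHT = 0.3  # weight of the effective second formant, from the text


def perceptual_distance(a: Signal, b: Signal, f2_weight: float = F2_WEIGHT) -> float:
    """Weighted Euclidean distance between two vowel signals."""
    d1 = a[0] - b[0]
    d2 = a[1] - b[1]
    # Assumption: the weight multiplies the squared F2' difference.
    return math.sqrt(d1 * d1 + f2_weight * d2 * d2)


def recognize(signal: Signal, repertoire: Dict[str, Signal]) -> str:
    """Return the name of the prototype closest to the perceived signal."""
    if not repertoire:
        raise ValueError("cannot recognize a signal with an empty repertoire")
    return min(repertoire, key=lambda name: perceptual_distance(signal, repertoire[name]))


# Hypothetical prototypes in Bark, chosen only to make the example concrete.
repertoire = {"i": (2.5, 14.5), "a": (6.5, 11.0), "u": (2.8, 6.5)}
heard = recognize((6.0, 10.5), repertoire)
print(heard)
```

Because recognition always returns the nearest prototype, a noisy signal is still assigned to some vowel in the repertoire, even when it lies far from all of them.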
Table 1. Behavior of the agents in an imitation game.

Step 1 (initiator):
  If (V = ∅): add random vowel to V.
  Pick random vowel v from V.
  u_v ← u_v + 1
  Produce signal A1: A1 ← ac_v + noise

Step 2 (imitator):
  Receive signal A1.
  If (V = ∅): Find phoneme(v_new, A1); V ← V ∪ v_new
  Calculate v_rec: v_rec ∈ V ∧ ¬∃ v2: (v2 ∈ V ∧ D(A1, ac_v2) < D(A1, ac_vrec))
  Produce signal A2: A2 ← ac_vrec + noise

Step 3 (initiator):
  Receive signal A2.
  Calculate v_rec: v_rec ∈ V ∧ ¬∃ v2: (v2 ∈ V ∧ D(A2, ac_v2) < D(A2, ac_vrec))
  If (v_rec = v): send non-verbal feedback: success; s_v ← s_v + 1
  Else: send non-verbal feedback: failure.

Step 4 (imitator):
  Receive non-verbal feedback.
  Update V according to feedback signal.

Step 5 (initiator and imitator):
  Do other updates of V.
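The steps of the imitation game in table 1 can be sketched in code. This is a simplified illustration under several assumptions, not the model used in the experiments: vowels are stored directly as points in the two-dimensional acoustic space instead of being produced by an articulatory synthesizer, `random_signal` and `add_noise` are hypothetical stand-ins, u_v and s_v are read as use and success counters, the update in step 4 simply moves the prototype toward the perceived signal after a success, and the unspecified "other updates" of step 5 are left out.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple

Signal = Tuple[float, float]  # (F1, effective F2) in Bark; a simplification


def distance(a: Signal, b: Signal) -> float:
    """Weighted distance D, with weight 0.3 on the effective second formant."""
    return ((a[0] - b[0]) ** 2 + 0.3 * (a[1] - b[1]) ** 2) ** 0.5


@dataclass
class Vowel:
    prototype: Signal   # ac_v, assumed to be the acoustic prototype of v
    uses: int = 0       # u_v, assumed to count how often v was produced
    successes: int = 0  # s_v, assumed to count successful games with v


@dataclass
class Agent:
    vowels: List[Vowel] = field(default_factory=list)  # the repertoire V

    def recognize(self, signal: Signal) -> Vowel:
        """Return the vowel whose prototype is closest to the signal."""
        return min(self.vowels, key=lambda v: distance(signal, v.prototype))


def random_signal(rng: random.Random) -> Signal:
    # Hypothetical bounds; the real model samples articulatory parameters.
    return (rng.uniform(2.0, 8.0), rng.uniform(8.0, 16.0))


def add_noise(signal: Signal, rng: random.Random, level: float = 0.1) -> Signal:
    # Hypothetical noise model: each formant is scaled by up to +/- level / 2.
    return (signal[0] * (1.0 + rng.uniform(-level / 2, level / 2)),
            signal[1] * (1.0 + rng.uniform(-level / 2, level / 2)))


def play_game(initiator: Agent, imitator: Agent, rng: random.Random) -> bool:
    # Step 1 (initiator): pick a vowel and produce it with noise.
    if not initiator.vowels:
        initiator.vowels.append(Vowel(random_signal(rng)))
    v = rng.choice(initiator.vowels)
    v.uses += 1
    a1 = add_noise(v.prototype, rng)

    # Step 2 (imitator): recognize A1 and answer with the closest vowel.
    if not imitator.vowels:
        # Stand-in for "Find phoneme": store the heard signal as a prototype.
        imitator.vowels.append(Vowel(a1))
    v_rec = imitator.recognize(a1)
    a2 = add_noise(v_rec.prototype, rng)

    # Step 3 (initiator): the game succeeds if A2 is heard as the original v.
    success = initiator.recognize(a2) is v
    if success:
        v.successes += 1

    # Step 4 (imitator): hypothetical update, move the prototype toward A1.
    if success:
        p = v_rec.prototype
        v_rec.prototype = (p[0] + 0.5 * (a1[0] - p[0]), p[1] + 0.5 * (a1[1] - p[1]))
    # Step 5 ("other updates of V") is not specified here and is omitted.
    return success


rng = random.Random(0)
agents = [Agent() for _ in range(5)]
for _ in range(200):
    first, second = rng.sample(agents, 2)
    play_game(first, second, rng)
print([len(agent.vowels) for agent in agents])
```

Because this sketch omits the random addition of new vowels and the other repertoire updates, the repertoires do not grow beyond their first vowel; it shows the message flow of one game, not the emergence of a full vowel system.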
4.2. Results
The agents all start out with an empty vowel repertoire. By playing imitation games with each other, the agents have to develop a vowel system that is as large as possible, that allows for successful communication and that should be realistic if self-organization and embodiment are really factors in explaining the structure of human vowel systems.
4.2.1. Emergence ofa vowel system The emergence of a vowel system in a population of twenty agents under 10% acoustic noise is shown in figure 3. In each of the frames of the figure, the acoustic aspects of the prototypes of the agents' vowels in the population are plotted in acoustic space. In this particular acoustic space (based on the first and effective second formant frequencies of a vowel signal), equal distances between points correspond to equal perceptual distances. Each vowel of each agent in the population is represented by a dot. Note that due to articulatory constraints, only a roughly triangular area of the acoustic space is available to the agents. This is indicated in the fourth frame of figure 3. From the figure it is clear that after the first 20 games the agents still only have very few vowels. The vowels that exist are more or less randomly dispersed through the acoustic space, although some of them already show a tendency to cluster. This is caused by the fact that all agents start out with an empty vowel repertoire. In order to get the imitation games started, random vowels are inserted. However, the imitating agents in the games try to make imitations that are as close as possible and add these to their vowel repertoires. This accounts for the clustering. After some 500 imitation games, shown in the second frame, the clustering has become more pronounced. The most important process at this moment is the compacting of the clusters due to the fact that the agents move their vowel prototypes closer to the signals they perceive. However, there is still sufficient room in the auditory space for extra vowels, so the random addition of new vowels also plays a role. After 2000 games, the available vowel space becomes filled more evenly with vowels and the shape of the vowel system becomes more realistic. 
After 10,000 imitation games, the available acoustic space has become more or less filled up with vowels and the vowel system has become realistically symmetric and dispersed. After this has happened, the vowel system remains stable. However, it is not static. The vowel prototypes of agents (and therefore the clusters) tend to move, and it is even possible that they merge or that new clusters are formed (if they do not interfere with other clusters).
Figure 4. Emerged vowel systems with six vowels. [Six panels: Type A (30/54), Type B (10/54), Type C (7/54), Type D (7/54), Type E (3/54), Type F (1/54). Each panel plots the agents' vowels on formant axes in Bark, one of them labelled F2 (Bark).]
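The dynamics described in section 4.2.1 (random insertion of vowels, imitation that is as close as possible, and the shifting of prototypes towards perceived signals) can be illustrated with a deliberately simplified sketch. It is not the model used for the figures: the rectangular acoustic space, the multiplicative form of the noise, the parameters `step` and `p_new`, and the cap `MAX_VOWELS` are all assumptions introduced only to keep the example small and bounded.

```python
import math
import random

# All names and numeric values below are illustrative assumptions,
# not the parameters of the model described in the text.
F1_RANGE = (2.0, 8.0)     # assumed first-formant bounds (Bark)
F2_RANGE = (8.0, 16.0)    # assumed effective-second-formant bounds (Bark)
MAX_VOWELS = 9            # hypothetical cap that keeps the sketch bounded


def random_vowel(rng):
    """Draw a random point in the (assumed rectangular) acoustic space."""
    return (rng.uniform(*F1_RANGE), rng.uniform(*F2_RANGE))


def add_noise(vowel, noise, rng):
    """Perturb each formant by up to +/- noise (a fraction of its value)."""
    return tuple(x * (1.0 + rng.uniform(-noise, noise)) for x in vowel)


def nearest(repertoire, signal):
    """Index of the prototype closest to the perceived signal."""
    return min(range(len(repertoire)),
               key=lambda i: math.dist(repertoire[i], signal))


def add_vowel(repertoire, vowel):
    """Add a prototype unless the hypothetical size cap is reached."""
    if len(repertoire) < MAX_VOWELS:
        repertoire.append(vowel)


def play_game(initiator, imitator, rng, noise=0.1, step=0.2, p_new=0.01):
    """Play one imitation game; return True if the imitation succeeded."""
    if not initiator:                 # empty repertoire: insert a random vowel
        initiator.append(random_vowel(rng))
    target = rng.randrange(len(initiator))
    heard = add_noise(initiator[target], noise, rng)
    if not imitator:                  # first imitation: copy what was heard
        imitator.append(heard)
    guess = nearest(imitator, heard)
    echo = add_noise(imitator[guess], noise, rng)
    success = nearest(initiator, echo) == target
    if success:
        # Move the prototype a fraction `step` of the way towards the signal.
        imitator[guess] = tuple(p + step * (h - p)
                                for p, h in zip(imitator[guess], heard))
    else:
        add_vowel(imitator, heard)
    if rng.random() < p_new:          # occasional random new vowel
        add_vowel(initiator, random_vowel(rng))
    return success


def simulate(n_agents=20, n_games=500, seed=0):
    """Run repeated games between randomly chosen pairs of agents."""
    rng = random.Random(seed)
    agents = [[] for _ in range(n_agents)]
    successes = 0
    for _ in range(n_games):
        a, b = rng.sample(range(n_agents), 2)
        successes += play_game(agents[a], agents[b], rng)
    return agents, successes
```

Calling `simulate(n_agents=20, n_games=500)` returns the agents' repertoires and the number of successful games. Plotting the repertoires at increasing values of `n_games` would be one way to look for the kind of clustering described above, although this sketch omits the merging of prototypes and the triangular shape of the available space.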
4.2.2. Evaluation of the emerged vowel system
Are the vowel systems that emerge realistic? If in the emerged systems the same types of vowel systems are found in the same proportions as in human languages, they are realistic. The classifications of human vowel systems that were used as a reference were the ones by Crothers (1978) and by Schwartz et al. (1997a). An example of a classification of emerged vowel systems containing six vowels is shown in figure 4. The data in this figure were obtained by running the simulation a hundred times for a given parameter setting (acoustic noise set to 12%). In each run 25 000 imitation games were played. Note that although the frames in figure 4 look similar to the frames of vowel systems shown previously in figure 3, there is a crucial difference. In previous figures, the agents shown in one frame were all members of the same population (and had therefore interacted with each other). In the present figure, all agents shown are members of different populations. The fact that there still is a large amount of similarity between the vowel systems of agents from different populations is a strong demonstration of how self-organization can make populations converge towards similar vowel systems.
The emerging systems are realistic. Most of them conform to the universals that Crothers (1978) found for human vowel systems. When the percentages with which the different emerged systems occur are compared to the percentages with which human vowel systems occur, a good match is found as well. Schwartz et al. (1997a) have measured the occurrence of different vowel system types in the languages of the UCLA Phonological Segment Inventory Database (UPSID, a database based on speech sound data of 451 languages; Maddieson 1984; Maddieson and Precoda 1990). They find 60 vowel systems with 6 vowels. Although their classification is not exactly the same as the classification shown in figure 4, there is good agreement. Of the systems they found in UPSID, 43% are of type A, 20% are of type B, 5% are of type C, 7% are of type D and 20% are of type E. No systems of type F were found, and two of the systems from UPSID (3%) cannot easily be fitted into the classification used here, but are probably of type A. Equally good agreement was found for systems of 4 and 5 vowels and reasonable agreement was found for systems of 7 and 8 vowels (de Boer 1999, ch. 6).
It seems that the simulation is capable of not only predicting the most frequently found vowel systems in human language (as was already possible with systems that optimize acoustic distinctiveness), but also of predicting the less frequently occurring vowel systems and, approximately, their relative abundance.
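The comparison between simulated and observed proportions can be made explicit with a few lines of arithmetic. The sketch below takes the panel counts given with figure 4 and the UPSID percentages quoted above; the variable and function names are hypothetical. Note that the six panel counts as given sum to 58 rather than 54, so the simulated shares should be read as approximate.

```python
# Counts per type as given in the panel titles of figure 4 (out of 54 systems).
SIM_COUNTS = {"A": 30, "B": 10, "C": 7, "D": 7, "E": 3, "F": 1}
SIM_TOTAL = 54

# Percentages reported in the text for six-vowel systems in UPSID.
UPSID_PERCENT = {"A": 43, "B": 20, "C": 5, "D": 7, "E": 20, "F": 0}


def to_percent(counts, total):
    """Convert raw counts to percentages of the stated total."""
    return {k: 100.0 * n / total for k, n in counts.items()}


def differences(sim_percent, ref_percent):
    """Simulated minus observed percentage, per vowel system type."""
    return {k: sim_percent[k] - ref_percent[k] for k in sim_percent}


if __name__ == "__main__":
    sim = to_percent(SIM_COUNTS, SIM_TOTAL)
    for k, d in differences(sim, UPSID_PERCENT).items():
        print(f"Type {k}: simulated {sim[k]:5.1f}%  "
              f"UPSID {UPSID_PERCENT[k]:3d}%  difference {d:+6.1f}")
```

On these numbers type A is the most frequent type in both the simulation and UPSID, while type E is the type for which the two sets of figures differ most.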
5. Conclusions
The results of the simulation show that vowel systems can emerge through self-organization in a population of agents whose production and perception are constrained by their embodiment. Although the agents start with an empty vowel repertoire and do not have any constraints on the kinds of vowel systems they can learn, the vowel systems that emerge in the population tend to be symmetric and dispersed, and optimized for acoustic distinctiveness. The model also predicts the types of vowel systems that occur less frequently and their abundance with a reasonable degree of accuracy. This illustrates that dynamics and embodiment play an important role in determining the structure of phonological and phonetic systems. The importance of embodiment both for constraining the space of possibilities and for pushing towards optimality in the task was known already (see e.g. Lindblom, MacNeilage and Studdert-Kennedy 1984) and the computer model presented here provides additional empirical support for this thesis. At the same time, the model clearly shows that embodiment can be (and has to be) complemented with the collective dynamics flowing from social interaction. As the imitation success of agents is partly determined by how well the agents' vowel repertoires conform to the repertoires that are used by the other agents in the population, different (sub-optimal) configurations can emerge and be maintained. Some configurations can be considered stronger attractors than others in the dynamical system that is defined by the agents and their interactions. The same universal tendencies that are found in human vowel systems are found in the systems that emerge. This indicates that innate rules and constraints are not necessary in order to explain why these tendencies are present; neither is it necessary to assume that they have to be genetically encoded. More generally (and based on other work such as that reported in Steels and Belpaeme 2005) we argue that the universal tendencies found in human categorization, as well as the cognitive systems that use these categorizations for action and decision-making, are a function of multiple forces, and in particular embodiment and collective dynamics.
In domains where the statistical structure inherent in the real world is relevant (such as color categories), sensitivity to this structure will also play a role. In domains where there is genuine choice (like driving to the right or left) the regularity is generated by the agents and cannot be explained on the basis of preexisting statistical structure. All this does not exclude the possibility that certain categorizations could have become genetically assimilated. However, in the case of cultural systems, like language, which undergo relatively rapid change, this seems less likely than is often assumed.
Acknowledgement
The Sony Computer Science Laboratory in Paris funds the work by Luc Steels, with additional funding from the OHLL CNRS project, the OMLL ESF project, and the EU FET ECAgents project. The work of Bart de Boer reported here took place at the University of Brussels AI lab and was funded by a GOA project.
References
Berlin, B. and P. Kay 1969 Basic color terms: Their universality and evolution. University of California Press.
Bowerman, Melissa and Soonja Choi 2001 Shaping meanings for language: Universal and language specific in the acquisition of spatial semantic categories. In: Melissa Bowerman and Stephen C. Levinson (eds.), Language Acquisition and Conceptual Development, 475-511. Cambridge: Cambridge University Press.
Browman, Catherine P. and Louis Goldstein 1992 Articulatory phonology: An overview. Phonetica 49: 155-180.
Chomsky, Noam and Morris Halle 1968 The Sound Patterns of English. Cambridge, Mass: MIT Press.
Christophe, Anne, Jacques Mehler and Núria Sebastián-Gallés 2001 Perception of prosodic boundary correlates by newborn infants. Infancy 2: 285-394.
Crothers, John 1978 Typology and universals of vowel systems. In: Joseph H. Greenberg, Charles A. Ferguson and E. A. Moravcsik (eds.), Universals of Human Language, Volume 2 Phonology, 93-152. Stanford: Stanford University Press.
Davidoff, J. 2001 Language and perceptual categorisation. Trends in Cognitive Sciences 5 (9): 382-87.
de Boer, Bart 1999 Self Organisation in Vowel Systems, PhD Thesis, AI-Lab, Vrije Universiteit Brussel.
    2000 Self organization in vowel systems. Journal of Phonetics 28 (4): 441-465.
    2001 The Origins of Vowel Systems. Oxford: Oxford University Press.
Dresher, Elan 1992 A learning model for a parametric theory in phonology. In: Robert Levine (ed.), Formal Grammar: Theory and Implementation, 290-317. Oxford: Oxford University Press.
Edelman, Gerald 1990 Neural Darwinism. New York: Basic Books.
Fast, Peter W. 1953 Amuesha (Arawak) phonemes. International Journal of American Linguistics 19: 191-194.
Herskovits, Annette 1986 Language and Spatial Cognition. Cambridge: Cambridge University Press.
Iida, Fumiya, Raja Dravid and Chandana Paul 2002 Design and control of a pendulum driven hopping robot. Proceedings of International Conference on Intelligent Robots and Systems 2002 (IROS 02) Lausanne, Switzerland, 2141-2146.
Jakobson, Roman and Morris Halle 1956 Fundamentals of Language. The Hague: Mouton and Co.
Kay, P. and T. Regier 2003 Resolving the question of color naming universals. Proceedings of the National Academy of Sciences 100 (15): 9085-9089.
Ladefoged, Peter and Ian Maddieson 1996 The Sounds of the World's Languages. Oxford: Blackwell.
Lakoff, George and Mark Johnson 1980 Metaphors We Live by. Chicago: University of Chicago Press.
Lindblom, Björn, Peter MacNeilage and Michael Studdert-Kennedy 1984 Self-organizing processes and the explanation of phonological universals. In: Brian Butterworth, Bernhard Comrie and Östen Dahl (eds.), Explanations for Language Universals, 181-203. Berlin: Walter de Gruyter.
Lindblom, Björn and Ian Maddieson 1988 Phonetic universals in consonant systems. In: Larry Hyman and Charles N. Li (eds.), Language, Speech and Mind, 62-79. London: Routledge.
Lenneberg, Eric H. and John M. Roberts 1956 The language of experience: A study in methodology. International Journal of American Linguistics memoir 13.
MacNeilage, Peter 1998 The frame/content theory of evolution of speech production. Behavioral and Brain Sciences 21: 499-548.
Maddieson, Ian 1984 Patterns of Sounds. Cambridge: Cambridge University Press.
Maddieson, Ian and Kristin Precoda 1990 Updating UPSID. UCLA Working Papers in Phonetics 74: 104-111.
Núñez, Rafael 1999 Could the future taste purple? Reclaiming mind, body and cognition. Journal of Consciousness Studies 6 (11-12): 41-60.
Pfeifer, Rolf and Fumiya Iida 2004 Embodied artificial intelligence: Trends and challenges. In: Fumiya Iida, Rolf Pfeifer, Luc Steels and Yasuo Kuniyoshi (eds.), Embodied Artificial Intelligence, Lecture Notes in Computer Science 3139: 1-26. Berlin: Springer Verlag.
Plaut, D. C. and C. T. Kello 1999 The emergence of phonology from the interplay of speech comprehension and production: A distributed connectionist approach. In: Brian MacWhinney (ed.), The emergence of language: 381-415. Mahwah, NJ: Erlbaum.
Rand, Earl 1968 The structural phonology of Alabaman, a Muskogean language. International Journal of American Linguistics 34: 94-103.
Redford, Melissa A., Chun C. Chen and Risto Miikkulainen 2001 Constrained emergence of universals and variation in syllable systems. Language and Speech 44: 27-56.
Schwartz, Jean-Luc, Louis-Jean Boë, Nathalie Vallée and Christian Abry 1997a Major trends in vowel system inventories. Journal of Phonetics 25: 233-253.
    1997b The dispersion-focalization theory of vowel systems. Journal of Phonetics 25: 255-286.
Steels, Luc 2003 Evolving grounded communication for robots. Trends in Cognitive Sciences 7 (7): 308-312.
Steels, Luc and Tony Belpaeme 2005 Coordinating perceptually grounded categories through language. Behavioral and Brain Sciences 28 (4): 469-529.
Steels, Luc and Rodney Brooks 1995 The Artificial Life Route to Artificial Intelligence. Building Embodied Situated Agents. New Haven: Lawrence Erlbaum.
Steels, Luc and Pierre-Yves Oudeyer 2000 The cultural evolution of syntactic constraints in phonology. In: M. A. Bedau, J. S. McCaskill, N. H. Packard and S. Rasmussen (eds.), Proceedings of the VIIth artificial life conference (alife 7): 382-394. Cambridge (MA): MIT Press.
Vallée, Nathalie 1994 Systèmes vocaliques: de la typologie aux prédictions. Thèse. ICP, Grenoble.
Ziemke, Tom 2003 What's that thing called embodiment? In: R. Alterman and D. Hirsch (eds.), Proceedings of the 25th Annual Meeting of the Cognitive Science Society: 1305-1310. Hillsdale, NJ: Lawrence Erlbaum.
Communication as situated, embodied practice
Wolff-Michael Roth
Abstract
Researchers generally are preoccupied with written language as the paradigm of communication. This paradigm makes communication and what is communicated appear as something that exists apart from invariably embodied and situated action. A large part of human communication involves participants in face-to-face conversation. In such situations, participants attend to semiotic resources other than words produced and attended to, including gestures, body positions, and structures in the environment. To avoid the dualism that characterizes much of the scholarship on communication, I provide a dialectical account of human activity, one component of which is communication, itself characterized by the dialectical relation of different modes (speech, gesture, structures in environment). The analyses of an episode from a school science classroom exemplify the central role of the body in communication, which by far exceeds what participants articulate in so many words. Its relationship to environmental features, however, is dialectical, making communication a (materially, socially) situated and embodied practice.
Keywords: agency|structure dialectic, communication, embodiment, gestures, situated practice.
1. Introduction
Since the late 1980s, it has become fashionable to talk about knowing and learning in terms of embodied cognition or, alternatively, in terms of distributed or situated cognition. All of these terms imply that knowing exceeds what can be found in the brain - viewed as computational wetware - and what has been the preoccupation of psychologists.1 Like many others concerned with cognition and development, I had been involved in building models of human cognition based on Piaget's work, which came to be associated with information processing. However, once I conducted research in the attempt to understand knowing and learning in real time and particularly the evolution of language, the then valid measures of reasoning ability and short-term memory were unable to predict performances in the real-world settings that I attempted to explain. Theoretical approaches that included or were based on social processes seemed to be much more promising, as the turn to social constructivism in disciplines such as social studies of science (e.g., Latour and Woolgar 1979), anthropology, and education (e.g., Lave and Wenger 1991) seemed to imply.
1. Etymologically, to know and cognize have the same origins, generally held to have the same root as can.
In the course of the 1990s, I had taken hundreds of hours of videotape first in classrooms and then among scientists and technicians at work. It became increasingly clear to me that analyzing only the words they used was insufficient to understand what participants knew, what they communicated to one another, and what they knew to be going on. It was impossible to model communication unless one also accounted for the bodily origin of communication and nature of knowing, on the one hand, and for material cultural processes, on the other. There was a problem, however, which I understood only afterward: the individual (body) and the sociality of culture cannot be thought together in dualistic models, because each approach ends up reducing one to the other. It was then that dialectical approaches to communication became salient in my work.
In this chapter, I exemplify the kind of communicative action that I found among students and professionals of science. Drawing on an episode from a science classroom, I exemplify the (socially, materially) situated and embodied nature of communicative action. I begin by articulating a dialectical framework for activity and action that allows me to treat communicative action and situated cognition in a non-dualist way.
I move on to provide a brief description of the ethnographic and analytic background, before providing an extended analysis of one typical episode.
2. Integrating body, mind and culture
Body, mind, and culture can easily be integrated into the same unit within a dialectical approach (e.g., Leont'ev 1978). In the following, I first sketch a general theory resting on the dialectic of agency and structure, and then articulate communicative action as one aspect within this framework.
2.1. Agency|structure dialectic
The embodied, situated, and cultural nature of cognition in general and communication of a conscious agent more specifically can be understood within an agency|structure dialectic (Sewell 1992). Agency denotes the capacity to act. It is immediately clear that there is no agency without structure, material or cognitive. For living organisms, structure is everywhere. Bodies are structured in a material sense, exhibiting particular articulations, symmetries, and sizes, all of which mediate the kind of movements organisms are capable of. The world in which we live is structured, too, with its characteristic forces and articulations among things, but structured differently for different organisms (von Uexküll 1972/1928). Scholars attend less to the structured ways in which we see, hear, feel, move, and do things, including thinking about them. These structured actions, including perception, have arisen from our bodily engagement with the world (Merleau-Ponty 1945; see also Gallagher this volume; Lindblom and Ziemke this volume). The very cognitive structures that allow us to perceive the world in structured ways are themselves a consequence of interactions of a structured body with a structured world (Bourdieu 1980). But there would not be the need for speaking of structures in the world unless there were organisms to which these structures are behaviorally relevant. To distinguish the two forms of structure, the notions of objectively given (sociomaterial) resources and (mental) schema have been proposed (Sewell 1992). At any moment in activity, the given structures, i.e., sociomaterial resources and schema, enable and constrain actions. In any consideration of human activity, the two forms of structures have to be considered dialectically.
Any object of activity simultaneously exists as a vision or hypothesis of the forthcoming product (which implies consciousness), a set of material conditions including materials and tools, and knowledge of sociomaterial conditions of the productive activity (Saari and Miettinen 2001). Therefore, neither sociomaterial resources nor schema individually or together determine action; rather, actions emerge in response to the contradictions embodied in the unit of analysis, which comprises the identity of (the non-identical) sociomaterial resources and their appearance to conscious mind in action. Any analysis of activity simultaneously has to consider three levels, which integrate the bodily and cultural nature of engagement with the world: activity, action, and operation (Leont'ev 1978). Activities, such as schooling, farming, or researching are always collective endeavors but concretely intended and realized in varying form, and are directed toward some generalized (conscious) object or motive. Concrete realization occurs by means of sequencing particular conscious actions, each directed toward a specific conscious goal formulated by the acting subject. While researching, a professor might decide to measure a certain variable or to do an ANOVA. The relation of activity and action is dialectical: the activity determines the sense of an action - an ANOVA done by a pollster is likely to have a different sense than the professor's - but actions concretely realize the activity. Properly sequenced smaller units, operations, constitute actions; operations are unconsciously produced responses to conditions, such as when the professor keys the statistical model into the computer and is not consciously aware of hitting particular keys but focuses on the equations appearing on the screen. The relation of action and operation is, again, dialectical: the action is a generalized referent for the proper sequencing of operations, but operations concretely realize actions. Operations have been shaped in and are the outcome of previous interactions of the body with the sociomaterial world; they constitute a concrete realization of cultural possibilities; they are a form of unconscious collective consciousness, that is, an embodiment of culture. This approach corresponds to that taken by Gallagher (this volume), where actions are goal-directed and conscious but realized by unconscious operations brought about by body schema.
Communication is but one form of action that develops the current activity, without necessarily being about the activity.2 This approach therefore contrasts with the claims of those scholars (e.g., Zlatev, this volume) who fail to recognize that (a) language-in-use (parole) is most frequently produced without prior planning (i.e., conscious supervision) and frequently does not follow the conventions grammarians have adopted and (b) communicative action and thought are frequently produced in real time rather than reproduced through speech.
2. Space limitations do not allow me to elaborate the relation between resources and schema on the one hand, and activity, action, and operation on the other hand. Suffice it to say that resources exist externally to the agent, at the interface of activity and action; and schemas exist internally to the agent, at the interface of action and operation.
2.2. Communicative action
Communicative action is more than sending information (words) from a sender (speaker) to a receiver (listener), which would require equal tuning of the two participants to code and encode concrete material sound parcels. Because of the dialectical relation of material resources and schema, the identical tuning of speaker and listener is never given in human communication. A dialectical approach to human activity leads us to a different conception of communication. Each action produces a change in the current setting, thereby providing changes to the semiotic resources available for the conduct of activity in general and face-to-face communication in particular. Speech act theory allowed us to understand that utterances, too, have effects in the setting; they constitute resources that can be part of any future action by the current or a subsequent speaker. Artifacts (Streeck 1996), diagrams (Goodwin 2000), and other structured aspects in the setting, including various forms of gestures (e.g., Bavelas, Chovil, Coates and Roe 1995), pointing and body orientation (Hindmarsh and Heath 2000), and pitch (Goodwin, Goodwin and Yeager-Dror 2002), constitute resources produced and used by speakers and listeners alike. To understand communication in everyday settings, all of these structures need to be accounted for simultaneously (Roth 2004). These different structures that are apparent in communicative action (words, gestures, things pointed to) are not simply additive but dialectically related. Thus, for example, speech and iconic gestures constitute two ways in which thought is not merely made public but actually finds its expression as productive activity (McNeill and Duncan 2000). Speech and gestures are dialectically related because they constitute different expressive forms for the same intentional stance toward the world. This stance is not some idea somehow behind or underlying and therefore expressing (translating) speech.
Rather, it is my bodily me that gestures and speaks, and thereby signifies thought and intention for others and myself (Merleau-Ponty 1945). Making sounds, speaking individual words, gesturing, nodding the head, turning the body and so forth are operations that together constitute a communicative act (Mikhailov 1980). The production and recognition of these operations always requires a concrete body.3 But each communicative act does not have a sense in itself; rather, sense emerges from the relation between the speech act and the overall, collective activity. Communicative action is therefore at the same time a concretely embodied (intra-psychological) and sociocultural (inter-psychological) phenomenon; its meaning emerges from the simultaneous grounding of action in cultural activity and bodily operations. Conscious action therefore lies at the interface between the unconscious operations that realize it and the cultural processes that give it its sense.
3. Gallagher (this volume) and Lindblom and Ziemke (this volume) elaborate on the link between production and recognition of body-produced movements (signs) and the role of mirror neurons to bring about this integration between perception and action.
3. Ethnographic context
The data analyzed here derive from an in-situ naturalistic study of communication in a seventh-grade science classroom where students learned about the physics of simple machines largely by designing machines. To make the learning environment as authentic as possible, the specifications for the design tasks were articulated in a call for proposals. As part of their submission, the students included paper-and-pencil sketches (e.g., Figure 1) and prototypes. They subsequently presented their prototypes to the student collective, which instantiated a form of peer review. This curriculum therefore provided many opportunities for students to engage in conversations with others, both during the design of the prototype and the review sessions. Two cameras and two tape recorders were used to record small-group and whole-class conversations. To begin with, all recordings were transcribed roughly but word for word to capture all utterances as completely as possible under the prevailing conditions (e.g., with students hammering and using power tools). In a second pass, the VHS videotapes were digitized (Macintosh iMovie); offprint images capturing characteristic moments were imported into the transcripts. Each lesson subsequently was broken into episodes, which were exported to QuickTime versions. These digitized clips were then analyzed using Peak™ DV 3.21, which allows the graphical representation of waveform (for determining time, relative loudness, and duration) and its Fast Fourier Transform (for determining pitch changes). Individual utterances were timed with an accuracy of ±0.001 seconds but rounded to the nearest 10 milliseconds and coordinated with the frames, which allowed timing to 33 milliseconds (30 frames/s rate of the VHS video).
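The arithmetic behind the two timing resolutions can be made concrete in a short sketch. It is only an illustration of rounding to 10 milliseconds and of mapping a time onto 30 frames/s video; the function names are hypothetical, and the procedure actually used to coordinate the audio timing with the frames is not specified here.

```python
FPS = 30  # frame rate of the VHS video, as stated in the text


def round_to_10ms(t_ms):
    """Round a time in whole milliseconds to the nearest 10 ms (halves round up)."""
    return 10 * ((t_ms + 5) // 10)


def frame_index(t_ms, fps=FPS):
    """Zero-based index of the video frame that contains time t_ms."""
    return (t_ms * fps) // 1000


def frame_start_ms(index, fps=FPS):
    """Start time of a frame in milliseconds (each frame lasts 1000 / fps ms)."""
    return index * 1000 / fps
```

For instance, a pause measured at 474 ms would be reported as 0.47 s, and the instant 470 ms after the start of a clip falls in frame 14, which begins at about 467 ms. Since one frame lasts about 33 ms, an offprint image can only be aligned with speech to within that margin.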
4. Language and body in design activity
Linguistic analyses generally are concerned with language independent of communicative praxis. This leads directly to the issue of grounding language, that is, its connection to things and situations, which is continuously realized in practice but has been a problem for traditional cognitive science (Harnad 1990). Furthermore, most linguistic studies are hardly relevant to speech, for the language of writing is a fundamentally different cognitive phenomenon than language in practical face-to-face communication (Ong 1982). Here, I am concerned with language as it appears in face-to-face communication alongside other semiotic resources humans produce in interaction and for interactive purposes - body orientation, body movements, and hand gestures. Following the presentation and brief gloss of a 23-second episode, analyses are provided of the concrete role played by human bodies in communication.
4.1. A moment in design activity
The following episode was recorded at the moment when the three girls were deliberating how to go about building the tower that would hold an elevator, the first element in their design of a Rube Goldberg machine (Figure 1).4 Leanne (far left in offprints) had previously explained that they needed to build a two-sided tower; Amanda (far right) and Bella (center) went along so that it appeared that there was agreement on this design. Following Leanne's suggestion, the three had decided to cut the wood needed to build that part of their design. At one moment, Leanne talked about cutting two pieces of wood for the two sides of the structure. This was when the following episode began.5 (I first provide the raw transcript and then a verbal description of the events.)
4. Rube Goldberg was a cartoonist who drew complicated machines that accomplished the most mundane everyday actions. Constructing a Rube Goldberg machine is a popular design activity of middle school students around the world. More information is available at http://www.rube-goldberg.com/.
Figure 1. The students had decided to have a tower and a chute as the first two elements in their Rube Goldberg machine, a complicated device that would accomplish the simple task of feeding a cat. The episode concerns the construction of the tower part.
5. To get a better sense for the changes in body positions and hand gestures involved, readers should visually scan the images before reading the transcript. The following transcription conventions have been used in line with traditional conversation analysis: (0.41): time in seconds; [1two sides]: square brackets enclose speech overlapped by gesture or other body movement; [1((rH, lH moves up)): description of gesture (rH, lH = right and left hand, respectively) or body movement; *: the asterisk aligns speech and video offprints; =: the equal sign shows latching, that is, two utterances are not separated by the normal pause; .,?!: punctuation is used to indicate speech features, such as rising intonation heard as a question, or falling intonation to indicate the end of an idea unit (sentence); -: the n-dash indicates a stop in the utterance without voice inflection indicating the end of an idea unit.
Communication as situated, embodied practice
01 Bella: 02
Okay, which two sides? * (0.47)
03 Leanne:
Okay, you know [\how we are doing that thing where the el]\bevator thing comesh bup* [l(body rotates to the left, rH places saw on table)) [z((rH moves upward, IH slightly upward)) b((begins turning upper body,
04 05 Leanne: 06
07
08 09 10 Bella: 11 Bella
12 13 Leanne:
14 Bella: 15 Leanne:
(0.43)h b((head turns toward board to right, rH points to board)) [4these * are the sides] [4((rH single stroke along the board)) [sthat like keep it] * [6standing]. [s((body turns to left, both hands come up palms facing one another to pinnacle oftrajectory)) [6((two-handed gesture halfway down, slightly up, then completely down)) (0.59) hO.60] h((two-handed up down gesture, at half height)) [8 The two sides?] * [8((two-handed up gesture)) [9(0.88)] =[9((nods three times, at the end, turns to her right toward the piece ofboard)) [\oUh, okay.] [1O((turns to her right toward the piece ofboard))
-
439
440
WolfJ-Michael Roth
16 Leanne:
[111 think it'll work. But if it doesn't, I _"-:~M!I" know what we can[lI((PicAJ up saw, begins sawing))
17 (0.42) fix how we can fix it. 18 19 Amanda: Don't we need four?* 20 (0.56) 21 22 23 24 25
Leanne: No,justAmanda: Just two. Leanne: Two.
(0.44) Leanne:
26 Bella: 27 Leanne:
[12Right?]*
[12((turns upper body, head to gaze at Bella)) =[13((nods twice))] 11 [14((turns head to gaze at saw and wood))
At the beginning of this episode, Bella, apparently confused about just how the part was to look, asked, "Okay, which two sides?" (turn 01). A brief pause followed Bella's question, before Leanne began to answer, "you know how we are doing that thing where the elevator thing comes up" (turn 03). While uttering this description of the current state of affairs, she turned her body toward Bella, laid down the saw she had been operating, and moved her right arm and hand upward (turn 03). In the ensuing pause in the verbal articulation, Leanne turned to her right; her hand moved forward, initially pointing to the board on the table and then moving about 5 centimeters along one of the edges. It was precisely at the place where she had previously shown with the saw where and how they needed to cut the wood (Figure 2). Contemporaneous with the gesture, she uttered, "these are the sides" (turn 06). She oriented toward Bella again and, in a continuous motion, produced two two-handed up-down hand movements, palms facing, suggesting two pieces facing one another, while uttering, "that like keep it standing" (turn 07). After a longish pause, Bella mimetically reproduced Leanne's double-handed gesture, before repeating the gesture, all the while uttering, with rising pitch, "The two sides?" At the verbal level, there followed a longer, 0.88-second pause (turn 10). But Leanne began the first of three head nods just as Bella had completed her utterance (turn 13). That is, as a turn, her head nod latched onto the ending of Bella's turn. Bella responded with a sign of understanding and idea completion (falling pitch), "Uh okay,"
while Leanne turned to the piece of board from which she had earlier indicated the intention to cut two narrow pieces. She explained that she thought her idea would work, and if not, she knew an alternative (turns 16-18). She had begun to saw and would not stop for the remainder of the episode.
Figure 2.
The iconic gesture produced here with the saw moving along the board where a cut was to be made was reproduced later in an episode analyzed in this chapter.
At this point, Amanda, who had sat on the table next to her two peers, entered the conversation for the first time in this episode. Up to this point, she had faced away and now, turning her head only slightly to her right, asked, "Don't we need four?" (turn 19). Continuing to saw and interrupted by Amanda's own articulation "two," Leanne responded, "No, just two" (turns 21, 23). After a brief pause and without ceasing to saw, Leanne turned her upper body (slightly) and head to gaze squarely at Bella, in the process uttering with rising inflection, "Right?" (turn 25). She had barely completed her utterance when Bella began to nod twice, upon which Leanne turned her head back to look toward the board that she was working on. In terms of the overall project, the episode was important in that what had appeared to be a common design vision of the tower and the elevator it subtended did not exist but was brought into alignment. Although the three girls previously had talked about and seemed to agree on the design, and although Leanne had explicated the two-sided version of the tower once before, it was apparent that both of her two peers wondered in the course of this episode why she wanted to cut two pieces of board rather than some other number, including the four Amanda articulated. This alignment, however, was not produced merely by talk. Rather, pointing (deictic) and iconic gestures, head nods, and body turns were important aspects of the communicative exchange. Without participants' attention to the body of the respective other, they would find it difficult to produce the alignment achieved here. The latter part of the episode exemplifies this: Amanda, not
having had perceptual access to the gestures and body orientations of the other two girls, did not understand why there ought to be two sides to the tower despite the fact that Leanne had just explained it. Thus, Amanda began the second part of the episode with fundamentally the same question that Leanne had just answered in response to Bella's query. In this episode, therefore, alignment of at least two of the three participants with respect to a future state of their activity was achieved. The episode had begun with a query about the number of sides for the tower, although this issue seemed to have been settled during a previous interaction. Twice during the episode, Bella acknowledged the two-sided version of the tower, once by means of an iconic gesture and confirming utterance (turns 10, 14) and then by means of a head nod. Although Amanda acquiesced (turn 22), we have no indication about her personal vision of what their collective project was to be. In the following, I analyze different aspects of this episode, which bring to the foreground different dimensions of the relationship of body and language in action.
4.2.
Orienting the body and gesturing mediate turn taking
Communication is an interactive phenomenon requiring the analysis of pairs of turns. A body of studies in conversation analysis provided an understanding of mostly audiotaped conversations (ten Have 1999). Whoever speaks holds the floor; others normally do not speak at the same time. However, participants do not own pauses in speech. If the current speaker has the intention to continue, he or she has to provide other semiotic resources that indicate this intentionality. However, videotaped interactions reveal that in face-to-face situations, there are many other semiotic resources that participants bring into play that mediate turn taking. In this situation, those who want a turn at talk may provide bodily and gestural resources that indicate this intentionality (Goodwin 2000). Mutual awareness of the interaction participants' body orientation, perceptual access, and (hand, head) gestures mediates the rules of interaction in ways that traditional conversation analysis could not reveal. Videotaped materials, as used here, reveal the central role of the body in face-to-face interaction. The role of body orientation and perceptual access was quite apparent in this episode, which had started with Bella's question about the nature of the tower sides. Leanne responded, followed by a renewed query and affirmation that there would be two sides. During the interaction, both had
produced gestures in the context of changing body orientations. Amanda had been sitting on the table facing away from the two. Thus, she had no access to all those parts of the communicative interchange between her peers other than the verbal utterances. In particular, she had not been in the position to perceive either Leanne's or Bella's two-handed production of an image that pertained both to the role of the two pieces of board that they were in the process of cutting and the design drawing. Her question "Don't we need four?" essentially repeated Bella's question about the relationship between raw materials, design drawing, and their vision of the emerging artifact. The words alone had been insufficient to bring Amanda into alignment with what was to be the collective vision for the unfolding design artifact. Pauses are interactive resources that play a particular role in the situated evaluation of communicative action (Bourdieu 1980). As a pause in talk lengthens, it increasingly becomes a resource for any participant to begin speaking. Following a question, however, a pause becomes an increasing liability for the addressee, as not answering can be heard as an intentional exclusion of the querying person from interaction. This happened repeatedly in this group, where Leanne did not answer queries that Amanda directed to her, whereas she answered all questions that Bella posed. Thus, the lengthening pause following Amanda's question (turn 19) opened up the possibility that this situation, too, would be one in which Amanda did not receive a response. Perceptual access to the interlocutor's body changes the rules identified in conversation analytic studies. Body movements required by the interaction may take time, but the very fact that they are enacted articulates communicative intention. So actions other than speech also articulate the intent to contribute, and therefore that the current turn lies with the acting person.
For example, after Leanne first articulated the sides of the tower ("that thing where the elevator thing comes up"), there was a pause at the verbal level. However, at that time she turned her body and moved arm and hand nearly 90 degrees around to her right. The motion ended in the pointing gesture and led into the gestural and verbal articulation of the target as "the sides." Later (turn 13), she nodded three times in response to Bella's query for confirmation of the two-walled nature of the future artifact. The head nodding was therefore a different form of turn taking than what has been available to traditional conversation analysis. In the head nod, the body articulated an affirmation, which in a different spatial arrangement might
have come forth through an activation of the vocal cords. The relationship between pause and orientation was also evident in the following situation. In the final part of the episode (turns 25-27), Leanne had answered Amanda's question about the number of sides of the tower. There followed a pause. Leanne then uttered with rising pitch, "Right?" (turn 25), simultaneously turning her head to Bella (still continuing to saw). Leanne thereby held Bella accountable for not having confirmed the answer, not only in asking for a confirmation but also, by turning her gaze in the direction of Bella, designating her as the addressee of the question. Bella did not respond verbally, but, immediately following the end of Leanne's head movement, nodded twice. Without a word, Leanne returned her head to gaze again at the saw, board, and sawing action. Bella had answered with a double nod, which would not have made sense had she not been aware of the fact that Leanne could see her. Leanne's subsequent turn was an acknowledgment that the request for confirmation had been satisfied, and therefore that she had seen the nods. Bella, too, could know that Leanne was satisfied with her non-verbal response, for she could have expected a conversational repair if this had not been the case. A converse case existed earlier in the episode when Bella acknowledged the confirmation received to her query about the two sides (turn 14). Bella was moving her head downward as part of a grooming gesture and Leanne was turning around. Bella therefore was forced to utter "Uh okay" rather than enacting acknowledgment in another way (head nod, eye movement, literal "thumbs up"). First, because Bella turned her gaze away from Leanne, she was not able to see whether her peer was in the position of seeing her gesture. Second, because Leanne was moving, she could not have seen the gesture either. It is likely that the verbal utterance was forming at the same moment that the downward head movement was.
In yet another instance (turn 10), Bella had produced a two-handed gesture while looking down at her hands; she reproduced the gesture this time uttering questioningly, "The two sides?" while simultaneously raising her gaze to Leanne's face. This provided a resource for Leanne to respond by nodding but did not require a verbal equivalent, which, here, was not provided. These analyses exemplify that the mutual bodily orientation of the participants mediated how communication is articulated, that is, verbally or gesturally (e.g., hand, arm, head). Participants are not only aware of their own positions and orientations but also of the positions and orientations of their interlocutors. Speech is the preferred mode if there is evidence that the listener does not have access to bodily gestures and that both speaker
and addressee are aware of this. The speaker may produce gestures when the listener has access, even though the speaker herself does not face the listener. These rules also mediate the pauses between verbal utterances. If participants have full perceptual access to one another, a lot of the communicative work is generally done by producing non-verbal semiotic resources, such as indicating intent to take the turn, articulating ideas, orienting the communicative partner, and so forth.
4.3.
Pointing
Pointing is classified on the spontaneous-idiosyncratic-movement end of a gesture continuum that reaches to fully developed linguistic systems (e.g., American Sign Language) on the other end (Kendon 1988). Acts of pointing occur in situations that involve at least two participants, one of whom is, in the act of pointing, producing resources for the joint establishment of a particular space as a common focus (Haviland 2000). This common focus constitutes a resource for the organization of subsequent situated action and cognition. Pointing is jointly contextualized by semiotic resources that include, at a minimum, a body that visibly performs the act, a dialectically related speech act, the material properties of the entity targeted in pointing, the orientation of relevant participants toward one another and toward the target space, and the larger activity within which the act of pointing (itself only part of the communicative act) is embedded (Goodwin 2003). In this episode, there were repeated instances of verbal deixis ("that," "it," and "these"). One of these instances, "these," immediately followed a deictic gesture (turn 05) and overlapped with an iconic gesture whereby the hand moved along the piece of wood (turn 06). Such acts of pointing constitute prime examples where the double nature of the object in activity becomes apparent: from the speaker's perspective, it appears twice, once as material and once in perception. The body oriented itself toward an entity other than itself (piece of wood), which it was perceptually isolating from a more inchoate ground. The body, here, repeatedly pointed to this entity, verbally by means of the utterance "these" and gesturally by the stretched arm and edge of the hand coming to rest on the piece of wood, the current object of attention. The deictic term "these" locates the entity pointed to near the speaker (Hanks 1992), in the present case, within reach.
The deictic term "these" not only instructed Bella to attend to something beyond the speech act, the
targeted material structure, but also articulated what she pointed to as a multiplicity (i.e., "these" rather than "this"). Furthermore, by using "these" rather than an articulation of a point in space (here, there), Leanne formulated the target as material things rather than spaces or sounds. There existed, however, a contradiction: there was only one piece of wood, and Leanne's subsequent iconic gesture produced the image of one rather than two cuts. But her pointing gesture was not only to the piece of wood but also to the previous interaction where she had gesturally enacted twice the iconic gesture of two cuts. Because these previous gestures were part of the girls' interactional history, they constituted production resources "once-removed," which make connections over larger chunks of the ongoing activity (Erickson 1982). This pointing (turn 05) therefore indexed the earlier sawing gesture with the edge of the right hand at the same place where she had earlier moved the saw (Figure 2). In a continuous motion, the pointing configuration then moved along a line about five centimeters from the edge where she had earlier indicated that they needed to cut (Figure 2). Although most classification systems categorize deictic and iconic gestures differently, there are situations where the same movement enacts both types, for example, when the pointing body part traces out some shape. Thus, Leanne's initial deictic gesture toward the piece of wood (turn 05) also became an iconic gesture (turn 06), because its trace took the shape of the projected linear cut. Here, movement along a line that separated the object pointed to from what was to be taken as diffuse ground enhanced salience. The single iconic gesture along the board constituted a reproduction of an earlier gesture showing where she was to cut the board (Figure 2). In fact, at that earlier moment, Leanne produced the gesture twice, holding the saw where the cuts were to be done.
That is, the distance of the saw from the edge doubled for the second movement in the first pair of gestures; in the second pair, she produced for practical purposes identical gestures at either end of the board, again indicating that they needed two pieces of board for the tower part. Despite the iconic relation between the earlier and present gestures, the contradiction between the gestural (one) and verbal articulations (more than one) was only resolved in the subsequent iconic gestures (turn 07) that articulated a vision of two upright boards, as it was further materialized in a subsequent episode (Figure 3). From the listener's perspective, the associated task is to perceptually isolate the entity pointed out by the speaker's body (orientation, deictic gesture). This is not necessarily an easy task, for even in the sparsest settings, it may not be evident to the listener what the salient feature is for the speaker. Different speakers may use the same word for denoting different material entities or may use different words for denoting the same material entities (e.g., Roth 1995). Other semiotic resources are required as constraining factors to assist the listener in perceptually articulating the target structure. Such constraining factors may lie in a culturally marked object or field, or may be found in the history of the activity and associated talk. In the present episode, the perceptual field articulated in Leanne's pointing was not culturally marked, but marked by having been the focus of previous interactions. By turning to the piece of wood, pointing it out as a resource, and then returning to the initial body orientation facing Bella, Leanne actually brought the pieces of wood into the space between them, where she gesturally enacted a vision of the future artifact. In the present case, Leanne had already repeatedly used an iconic gesture to point out where and how many times they needed to cut to get the required pieces for building the tower part. The nature of the entity pointed to was further constrained by the talk that immediately followed, where Leanne's two-handed palm-facing gesture articulated a vision of boards facing one another (turn 07, Figure 3).
Figure 3.
The two-handed gesture, which had first appeared in a conversation about what the tower would look like, is reproduced here, but in a holding movement, first by Leanne (a), then by both students (b).
In the present situation, Leanne was not just pointing somewhere, but she rotated her entire body. The sequence of the first four offprints in the transcript (turns 01-07) shows that she had first rotated away from an orientation to the board (turn 01) in order to face Bella (turn 03), then rotated back to face the board while pointing to it (turn 06), and then rotated again to face Bella (turn 07). She was aware that Bella shifted her gaze from facing Leanne (turn 03) toward the board (turn 06), the endpoint of the gesture,
and then back toward the space between them, where she could see both Leanne's face and her two-handed gesture. This intentional pointing therefore was instantiated not merely by the hand but in fact by the entire body (person) under the auspices of her knowing that the listener Bella was following her changing orientations.
4.4.
Iconic gesturing
Iconic gestures belong to the same class of gestures as pointing. They include those hand-arm movements that bear a perceptual relation with concrete or projected entities and events that the speaker is oriented to (e.g., McNeill 1992). In contrast to a deictic gesture, however, an iconic gesture is itself the focal entity. From a speaker's perspective, the iconic gesture articulates an immanent or unfolding sense that stands in a dialectical relation both with concurrent speech and the material entity in the setting so articulated (Roth 2004). Iconic gestures are important semiotic resources in interaction (e.g., Ochs, Jacoby and Gonzales 1994). Research in school science classrooms showed that for students of all ages, iconic gestures are important aspects of communication, especially while the learners are becoming familiar with the phenomena at hand and do not yet have a stable (viable) language to talk about them (Roth 2003). Iconic gestures also were important interactional resources continuously produced and reproduced in this episode. The first iconic gesture occurred while Leanne reminded Bella that they were currently working on the first part ("thing") of their Rube Goldberg machine where the elevator ("thing") moves (turn 03). The rising right hand, palm facing down, evoked the image of something moving upward, like the elevator in the design (Figure 1). More so, while they had talked about how to realize the tower, both Leanne and Bella had gesturally followed the vertical outline of the tower, indicated by the orientation of the hands. These gestures therefore bore iconic relations to other material forms, apparent as tower and elevator in the design drawing. At the verbal level, the upward movement of the hand was iconic to the articulated "comes up." It is a concurrent gestural and verbal concretization of the up-down or verticality schema (Johnson 1987, Roth and Lawless 2002b).
In this case, the timing between the verbal and gestural modality was precise (simultaneous), consistent with the contention that gesture and speech are simultaneous formulations of an immanent or emerging idea by one and the
same body (McNeill 1992). However, the left hand only moved upward slightly, about one-quarter the way of the right hand, and given Bella's gaze direction, was probably invisible to her. In the same part of the episode, one can also note that the idea had begun to form earlier. While uttering that they were "doing that thing," Leanne still had the saw in her hand, nearing the end of the body rotation. She placed the saw and then, in a continuous motion, moved her right hand up just as the utterance evoked an upward moving elevator. That is, the placement of the saw was in preparation of the subsequent gesture. But even before uttering the locative "up," her body and hand already began to rotate sideways toward enacting the deictic gesture that pointed to the wood to be used in the tower construction, thereby preparing the subsequent gesture. After the intervening (already analyzed) pointing episode, Leanne's whole body turned back toward Bella and, in a continuous motion, both hands moved upward while she uttered, "that like keep it" (turn 07). The hands then descended halfway to the table, but, as if hesitating, moved three-quarters the width of her hand upward, before completely descending to the table level; in the process, she completed the word "standing." The two-handed gesture produced the image of two upright boards facing one another, which, according to the verbal part, "keep it [elevator] standing." The two-handed gesture not only produced an iconic expression of the design vision, but also produced (a) an image of the two boards that provided the scaffold for the elevator in their design drawing and (b) an image of the boards that she had just pointed to in their raw-material state. Through the two-handed gesture, the design drawing (Figure 1) was coming alive while simultaneously implicating the raw materials to be used: it was a bodily realization of the vision.
Later on, very similar hand movements included pieces of wood now standing for themselves as the "sides" (Figure 3a). At that point, the design was even further concretized but not yet an independent artifact: Leanne's body was still implicated by holding the pieces. Only subsequently, after adding more materials and using fastening resources would the emerging design become self-supporting and exist independent of the body. In this situation, we observe an opposite movement to my previous research on gesture in learning science (Roth and Lawless 2002a): here, ideas and fleeting images are initially concretized in body position and hand gestures whereas the inverse was observed in learning by experimenting.
At this point, Leanne had apparently ended her response to Bella's query. There was a long pause in the talk. During the second half of this pause, however, Bella reproduced a mimetic image of the two-handed gesture Leanne had just completed - and, similarly to her peer, used the same hand movements to hold up the two pieces of wood (Figure 3b). She then reproduced the same two-handed gesture, uttering with increasing pitch, "The two sides?" (turn 11). Here, although the hands mimetically reproduced Leanne's gesture, thereby confirming that she had perceived it, the verbal utterance indicated uncertainty. Why should there be uncertainty? What must have appeared in doubt was the relation to the diagram, that is, their ongoing activity and its motive to construct a Rube Goldberg machine, with an elevator as its first part. That is, from the receiver's part, Leanne's production was intelligible as such (Bella reproduced it), but she had not yet found the relation Leanne's gestures bore to the envisioned object of activity, partially concretized in the diagram. Facing Bella, who looked straight at her, Leanne nodded three times; she did not produce a verbal affirmation. As shown previously, Bella was in a position where her sign of affirmation had to be produced verbally. Leanne then turned around to continue where she had left off. Leanne had articulated her version of the design in response to Bella's query about the sides involved in the elevator tower. Thus, although they had talked about this before, that is, although Leanne had articulated this vision in a previous episode, there was apparently a lack of alignment or common ground between the two. Bella's query articulated this uncertainty about the nature of the tower in what was to be their common design.
The iconic and mimetic character of Leanne's two-handed gesture thus constituted a solution to the problem of establishing common ground with respect to an initially ephemeral vision of the object of activity, which was increasingly concretized in the unfolding of the design artifact. Bella enacted an independent concrete version of what is taken to be the common object at hand, always under revision in future action.
5.
Body, mind and culture
In this chapter, I articulate and exemplify the embodied and situated nature of human communication, drawing on a non-dualist, dialectic framework that integrates mind and culture, body and mind, speech and gesture, materiality and conscious nature of human experience, and so on. The analyses
show that the separate, interlocking displays of conversational participants form a unit that is greater than the sum of its parts, a particular kind of participation framework. Bella and Leanne treated each other's bodies as resources in the ongoing production of intentional action; Amanda, though she was close and overheard the two, did not have access to the same resources, as she faced away. For conversation participants, the body is a complex resource that enables different kinds of articulations that stand in a dialectical relation to, and therefore elaborate, one another. The body therefore is a different type of entity than the piece of wood Leanne pointed to, the saw she used, or the paper-and-pencil diagram they had collectively produced, although it is of the same material kind. 6 Participants in communicative interaction must not only attend to multiple perceptual arrays (body orientation, gesture, speech, entities in the setting), but also arrays that are significantly different in their structure. In the featured episode, there were repeated instances of thoughts in the process of forming concurrently with the bodily productions (speech, gesture, orientation). Thus, Bella had already produced a two-handed gesture that enacted the image of the two-walled tower before she began to utter the question, "The two sides?" while simultaneously reproducing the two-handed gesture. In the first instance, one sees the question forthcoming, not yet formed in its entirety; it only became the full interactional resource when both gesture and speech aligned in the articulation. In fact, the first hand movement had a shorter trajectory, as if Bella had stopped it to reproduce it in full simultaneously with the utterance. This process was not planned, as gesticulations are generally produced unconsciously. The continuous unfolding of a thought was also noticeable in Leanne's performance of her response to Bella's initial question.
It had started with a reference to the diagram, continued by articulating the pieces of board that they were currently working on, and then articulated the design vision that brought the two earlier pieces together into one unit. The analyses showed how her entire body performed this response. The analyses also showed that the body, through forms of both spontaneous and idiosyncratic gesture, is tied to its environment so that communication is not only an embodied (production of speech, gesture) but also a situated (distributed) phenomenon. In the deictic gesture, the identity of the
6. Gallagher (this volume) elaborates on the differences between the perceptions of material objects and human bodies (that of others and one's own) and the implications this has for intersubjectivity.
perceptual (intra-psychological) and material (extra-psychological) aspects of the object becomes apparent and articulated. It is in a reflexive movement that this identity is destroyed, that is, that subject becomes separated from object and object from subject; the reflexive movement, however, only returns the idea or thought but not experience of a body (Merleau-Ponty 1945). The body is therefore not only a different type of entity, but also it cannot be the object proper in the phenomenological or activity-theoretic formulation. The proper analytic starting point is the dialectical unity of activity - bodies, thoughts, experiences, and even emotions and motivations emerge as the effects of the subject | object dialectic that constitutes the activity (Roth 2006). The analyses also provide a hint at the fact that body, mind, and culture are inextricably related and cannot be reduced to one another. Thus, each utterance, gesture, body orientation, and so on produced by one participant not only provided a resource for future action to herself but also to the other; each communicative act was explicitly produced for the other. That is, the idea of a semiotic resource inherently requires the other participant to be aware of the semiotic potential of these resources. At the moment that the body articulates itself by producing semiotic resources, these already have to exist for the recipient other. An individual bodily me that articulates itself has to draw on resources that are already intelligible to the other and therefore part of the cultural possibilities. This, however, should not come as a surprise, if we recall that operations (and therefore schemas) are unconscious collective consciousness. All actions are therefore synthesized from operations, the concrete realizations of cultural (generalized) possibilities, and therefore are simultaneously the person's and the collective other's.
The present analyses of one brief event testify to the complexity of communication, which cannot be reduced to language. The analyses articulated communication as an embodied and situated phenomenon and showed how ideas, questions, and responses emerged in real time. The episode analyzed here was no different from those preceding or following it during the six-hour project, or, for that matter, from any other design activity by these students or their peers. If 23 seconds of situated activity are already so complex, how much more complex will be entire forms of activity that stretch over days, months, years, and generations? Because communication, ideas, questions, and responses are distributed across different bodily modes of expression and across the salient aspects of the setting, the structure of thought cannot be linguistic structure. Thought cannot be
reduced to language; nor can any other form of expression be reduced to the body. Rather, thought exists in the dialectical unity of all forms of expression.
The present analyses make salient the special role of the body, seat and source of agency, in design activity, constituted in the unfolding of ephemeral ideas into concrete artifacts, a process we had also observed among environmental activists (Lee and Roth 2001). The body was the first site where ideas took shape, where they were articulated; it was then augmented by the introduction of materials and subsequently emerging stages of the artifact. Still later, the body removed itself as additional actions allowed the artifact, the concrete rendering and development of the initial ideas, to take on a life of its own. Communication was but one form of action that moved the unfolding activity along its trajectory toward completion. This role of gestures complements the findings of our earlier research (e.g. Roth 2002), which showed the mediating role gesturing plays in the development of language following manipulations of objects. That is, hand gestures in particular again played an important role in the transition between two representational poles: here, ephemeral design ideas sketchily articulated in speech, on the one hand, and concrete artifacts, on the other.
This way of thinking about and analyzing human communication has direct practical consequences. For example, in lecture-type situations, students generally copy whatever the lecturer writes on the inscription device present; they may also note some of the utterances that do not get publicly recorded. However, students generally do not record all the other semiotic resources that lecturers provide in their performances. It is therefore not surprising that students have difficulty reproducing the sense of what has been communicated from their notes alone.
Nor should it come as a surprise that so many students have difficulty understanding the lecture content at some later point, although it all made sense during the lecture. The perspective developed here suggests that these students are in fact missing most of the resources that were available to them in the lecture but were neither recorded in their notebooks nor recoverable from the existing entries.
Acknowledgments
This study was made possible in part by a grant from the Social Sciences and Humanities Research Council of Canada. I thank Sylvie Boutonné,
Michelle K. McGinn, and Carolyn Woszczyna for the help they provided during data collection.
References
Bavelas, Jan B., Nicole Chovil, Linda Coates and Lori Roe
1995 Gestures specialized for dialogue. Personality and Social Psychology Bulletin 21: 394-405.
Bourdieu, Pierre
1980 Le sens pratique. Paris: Les Éditions de Minuit.
Erickson, Frederick
1982 Money tree, lasagna bush, salt and pepper: Social construction of topical cohesion in a conversation among Italian Americans. In: Deborah Tannen (ed.), Analyzing Discourse: Text and Talk, 43-70. Washington, DC: Georgetown University Press.
Goodwin, Charles
2000 Action and embodiment within situated human interaction. Journal of Pragmatics 32: 1489-1522.
2003 Pointing as situated practice. In: Sotaro Kita (ed.), Pointing: Where Language, Culture and Cognition Meet, 217-241. Mahwah, NJ: Lawrence Erlbaum Associates.
Goodwin, Charles, Marjorie H. Goodwin and Malcah Yaeger-Dror
2002 Multi-modality in girls' game disputes. Journal of Pragmatics 34: 1621-1649.
Hanks, William F.
1992 The indexical ground of deictic reference. In: Alessandro Duranti and Charles Goodwin (eds.), Rethinking Context: Language as an Interactive Phenomenon, 43-76. Cambridge: Cambridge University Press.
Harnad, Stevan
1990 The symbol grounding problem. Physica D 42: 335-346.
Haviland, John B.
2000 Pointing, gesture spaces, and mental maps. In: David McNeill (ed.), Language and Gesture, 13-46. Cambridge: Cambridge University Press.
Hindmarsh, Jon and Christian Heath
2000 Embodied reference: A study of deixis in workplace interaction. Journal of Pragmatics 32: 1855-1878.
Johnson, Mark
1987 The Body in the Mind: The Bodily Basis of Imagination, Reason, and Meaning. Chicago: University of Chicago Press.
Kendon, Adam
1988 Goffman's approach to face-to-face interaction. In: Paul Drew and Anthony Wootton (eds.), Erving Goffman: Exploring the Interaction Order, 14-40. Cambridge: Polity Press.
Latour, Bruno and Steve Woolgar
1979 Laboratory Life: The Social Construction of Scientific Facts. Beverly Hills, CA: Sage.
Lave, Jean
1988 Cognition in Practice: Mind, Mathematics and Culture in Everyday Life. Cambridge: Cambridge University Press.
Lave, Jean and Etienne Wenger
1991 Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.
Lee, Stuart and Wolff-Michael Roth
2001 How ditch and drain become a healthy creek: Representations, translations and agency during the re/design of a watershed. Social Studies of Science 31: 315-356.
Leont'ev, Alexei N.
1978 Activity, Consciousness and Personality. Englewood Cliffs, NJ: Prentice Hall.
McNeill, David
1992 Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press.
McNeill, David and Susan D. Duncan
2000 Growth points in thinking for speaking. In: David McNeill (ed.), Language and Gesture, 141-161. Cambridge: Cambridge University Press.
Merleau-Ponty, Maurice
1945 Phénoménologie de la perception. Paris: Gallimard.
Mikhailov, Feliks
1980 The Riddle of Self. Moscow: Progress.
Ochs, Elinor, Sally Jacoby and Patrick Gonzales
1994 Interpretive journeys: How physicists talk and travel through graphic space. Configurations 2: 151-171.
Ong, Walter J.
1982 Orality and Literacy: The Technologizing of the Word. New York: Routledge.
Roth, Wolff-Michael
1995 Affordances of computers in teacher-student interactions: The case of Interactive Physics™. Journal of Research in Science Teaching 32: 329-347.
2002 From action to discourse: the bridging function of gestures. Journal of Cognitive Systems Research 3: 535-554.
2003 Gesture-speech phenomena, learning and development. Educational Psychologist 38: 249-263.
2004 Perceptual gestalts in workplace communication. Journal of Pragmatics.
2006 Identity as dialectic: Making and Re/making self in urban schooling. In: Joe L. Kincheloe, kecia hayes, Karel Rose and Phil M. Anderson (eds.), The Praeger Handbook of Urban Education, 143-153. Westport, CT: Greenwood.
Roth, Wolff-Michael and Daniel Lawless
2002a Signs, deixis, and the emergence of scientific explanations. Semiotica 138: 95-130.
2002b When up is down and down is up: Body orientation, proximity and gestures as resources for listeners. Language in Society 31: 1-28.
Sewell, William H.
1992 A theory of structure: duality, agency and transformation. American Journal of Sociology 98: 1-29.
Streeck, Jürgen
1996 How to do things with things: Objets trouvés and symbolization. Human Studies 19: 365-384.
ten Have, Paul
1999 Doing Conversation Analysis: A Practical Guide. London: Sage.
Uexküll, Jakob von
1972 Theoretische Biologie. Frankfurt/M: Suhrkamp. (Originally published in 1928)
Index
action 1, 3, 8, 20, 21, 26, 34, 36, 41, 42, 45, 46, 58, 72, 109, 132, 136, 138, 139, 147, 152, 167, 169, 179-183, 185, 186, 188-190, 199, 200, 204, 205, 211, 214, 219, 241-251, 253-259, 271-275, 277, 287, 307, 313, 320, 323, 324, 326, 328, 351, 369, 381, 382, 384-388, 392, 394, 395, 397, 398, 402, 426, 431-436, 442-445, 450-453
active perception 197, 198, 204-206, 214
affordances 7, 8, 55, 68, 69, 72-74, 76, 98, 99, 186, 251, 257
agency|structure dialectic 431, 433
animal 43, 44, 48, 49, 65, 66, 69, 71, 72, 76, 94-96, 98, 101, 102, 134-138, 140, 141, 168, 177, 189, 214, 218, 245, 307, 321, 327, 365, 379-381, 383, 384, 386, 391-393, 395, 396, 398-400, 402, 403, 405
biosemiotics 7, 9, 85, 101, 102, 105, 109, 379, 380, 382, 383, 385-391, 394, 395, 400, 402
bodily mimesis 219, 228, 231, 232, 297, 301, 312, 318, 320, 322-324, 326, 329
body 2-4, 6-9, 17, 18, 22, 24, 25, 30, 35, 37, 38, 40, 41, 43, 47, 48, 55-59, 62-64, 67, 69, 70, 73, 75-77, 85-91, 93, 96, 101, 108-114, 116-120, 130-134, 137, 138, 142-145, 147-149, 151-153, 169, 179, 180, 183, 189, 205, 206, 217, 219, 225, 228, 231, 241-243, 252, 253, 257-259, 271-288, 297, 298, 301, 308, 312, 313, 318, 320, 325, 329, 339-348, 350, 351, 356, 364-370, 379-384, 388, 389, 392, 395-400, 403, 405, 412, 414, 416, 431-435, 437-447, 449-453
body image 8, 217, 228, 231, 271-283, 285, 288, 308, 313, 351
body schema 8, 205, 217, 271-284, 286, 288, 308, 313, 351, 434
categorization 167, 169, 175, 176, 178, 190, 206-208, 212, 214, 215, 231, 371, 376, 411, 413-415, 417, 426
causality 65, 314, 315, 380, 384, 386, 387, 393, 401
cognitive linguistics 1, 6-8, 17, 85, 113, 119, 120, 198, 220, 259, 298-300, 304, 306, 308, 315, 318, 325, 339, 346, 357, 358
cognitive neuroscience 49, 144, 183, 185, 252, 339, 341, 346, 351, 361, 363, 369
cognitive science 1, 3-5, 7, 9, 10, 17-21, 47, 48, 81, 86, 92, 98, 119, 129-131, 133, 141-151, 154, 168, 197, 200, 242, 250, 252, 271, 273, 297, 310, 311, 313, 315, 316, 339-341, 343, 344, 347-349, 356, 358, 359, 362-364, 369, 370, 379, 382, 383, 386, 387, 393, 395, 412-414, 437
communication 9, 43, 44, 46, 47, 115, 130, 131, 138, 152, 153, 208, 218, 223, 228, 252, 255, 284, 307, 319, 320, 324, 381, 382, 416, 423, 431-433, 435-437, 444, 448, 450-453
complex systems 197, 199, 200, 231, 386
conceptual spaces 8, 167, 168, 171, 173, 175, 179, 182, 185, 187, 190, 191, 197, 199
consciousness 2, 6, 7, 9, 22, 85-89, 91, 97, 98, 100, 102, 107-109, 113, 119, 135, 136, 201, 202, 217, 219, 222, 228, 232, 275-277, 279-281, 297-300, 302-304, 307, 309, 312, 313, 315-318, 322-325, 363, 389, 392, 395, 398, 400, 402, 433, 434, 452
conventions 297, 303-305, 310, 324, 326, 328, 434, 438
cyborg 379, 380, 399, 402-405
deafferentation 271, 273, 277-279
dynamic categorization 197, 204, 206-208, 213
ecology 7, 55, 56, 64, 75, 85, 100, 101
embodied cognition 1, 3-8, 10, 18, 48, 129, 131, 144, 145, 148-151, 154, 206, 241-243, 249, 256, 258, 297-300, 311, 312, 348, 356, 360, 364, 393, 431
embodiment 1-4, 6-9, 17, 18, 21, 48, 55, 56, 76, 85-89, 91, 109, 118-120, 129-133, 137, 143-146, 148-153, 185, 187, 197, 206, 207, 230, 241-243, 249, 252, 255, 258-260, 271, 272, 280, 284, 286, 288, 297-301, 308, 309, 311-314, 318, 319, 322, 324-329, 339, 340, 342, 344, 348-354, 356-360, 363, 365, 366, 368-370, 379-386, 389, 390, 392, 394-404, 411-414, 416-418, 423, 425, 426, 431, 434, 453
emergence 23, 86, 139, 219, 231, 258, 380, 383, 384, 387, 401, 423
evolution 3, 80, 85, 114, 116, 134, 209, 215, 231, 258, 284, 301, 318-322, 326-328, 383, 387, 390, 394, 403, 432
force dynamics 8, 167, 169, 172, 179-183, 186-191
frames of reference 339, 343, 345, 346, 352, 354, 362-365, 367-369
gesture/-s 6, 9, 129, 130, 134, 153, 202, 227, 228, 231, 241, 255, 258, 282-284, 301, 307, 319-321, 323, 324, 326-329, 362, 365, 366, 415, 431, 435, 437-453
Gibson, James 7, 55, 56, 61, 64-69, 71-78, 79, 97-100, 102, 108, 122, 156, 182, 183, 186, 189, 193, 198, 204, 205, 207, 234, 261
history of cognitive science 129-131, 133, 138
image schema/-s 17, 21, 32-37, 42, 44, 45, 85, 86, 90, 92, 109, 111, 114, 120, 167, 187, 189, 191, 198, 309, 314, 324, 325
information 7, 48, 55, 62, 68, 69, 72, 76, 77, 105, 115, 142, 146, 154, 168, 178, 181-183, 185, 189, 227, 246, 249, 255, 315, 316, 340, 382, 383, 386-388, 390, 391, 394, 397, 412, 432, 435
interactive technology 129, 133, 151, 153, 154
internal meaning space 8, 197, 198, 204, 214-217, 219, 222, 223, 225, 227, 229, 231, 232
intersubjectivity 8, 133, 241, 252, 271, 285-288, 301, 307, 351, 451
Japanese mimetics 197, 199, 223, 224, 226-228, 230
language 1, 2, 4, 7-10, 17, 19, 20, 22, 23, 31, 36, 37, 41, 42, 44, 45, 47, 48, 59, 61, 63, 66, 69, 87, 93, 96, 99, 101, 104, 106, 107, 110, 115-118, 120, 130, 131, 133, 134, 139, 140, 148, 152, 173, 174, 197-200, 202, 204, 216, 218-220, 223, 226-228, 230-232, 241, 243-245, 253, 255-260, 283, 297-301, 303-314, 318-323, 325-329, 339, 345-347, 349, 350, 353-355, 359, 361, 362, 364, 365, 367, 380-382, 393, 400, 402, 404, 411, 413-418, 425, 426, 431, 432, 437, 442, 448, 452, 453
Lifeworld 85, 87, 88, 99-101, 107, 109, 110, 120
memory 58, 59, 85, 94, 114-119, 142, 184, 203, 212-214, 229, 231, 283, 350, 397, 414, 418, 432
mental rotation 9, 250, 339, 341-343, 345-348, 356, 359, 362, 364, 365, 368
metaphor 4, 5, 7, 9, 17, 20, 37-42, 48, 55, 61-63, 98, 111, 129, 130, 133, 142, 144, 202, 218, 250, 301, 308, 310, 312, 339, 357, 358, 362, 363, 366, 368, 391
mimetic schemas 114, 180, 199, 203, 228, 231, 297, 301, 322, 324-329
mirror neurons 8, 133, 147, 148, 150, 189, 217, 241, 251-255, 271, 287, 328, 436
mutualism 55, 75-77
nature-culture dualism 55-57, 69, 76, 77
neonate imitation 271, 273, 279, 280, 284, 285
neurobiology 17, 21, 26, 29, 48, 88, 91, 308, 319, 341, 387
organicism 380, 390, 391
perception 8, 23, 27, 32, 33, 35, 36, 44, 60, 61, 73-75, 78, 88, 95, 96, 98, 99, 106, 108, 117, 132, 138, 147, 149, 167, 169, 170, 172, 182, 183, 186, 190, 193, 195, 200, 201, 203-205, 207, 208, 214, 217-219, 223, 226, 231, 241-245, 249, 250, 253, 257-259, 271-276, 278, 281-284, 286, 287, 311, 355, 381, 400, 402, 416, 418, 420, 421, 425, 433, 436, 445
phenomenology 3, 7, 65, 85-91, 97, 120, 150, 200, 206, 218, 271, 272, 285, 288, 297, 309, 351, 391
picture 5, 44, 67, 76, 85, 96, 98, 99, 116, 130, 200, 205, 231, 250, 306, 311, 329, 348, 367, 387
pragmatism 17-21, 32, 37, 39-42, 47, 48, 306
proprioception 55, 74, 225, 278, 279, 284, 286, 320
radical embodiment 3, 129, 147, 149-151, 386
representation 8, 9, 20, 25, 30, 43, 46, 69, 86, 92, 96, 104, 114, 115, 144, 147, 149, 152, 168, 170, 173, 177-180, 182-184, 186, 190, 197, 198, 200, 201, 203, 213, 219, 220, 223, 226, 228, 230, 231, 244, 246, 252, 259, 260, 297-300, 306, 309-312, 317, 322-325, 350, 364, 367, 369, 379, 380, 383, 388, 393, 400, 436
Representationalism 17, 19-21, 30-33, 35, 36, 42-48
self-organization 390, 411, 418, 423, 425
semantics 8, 17, 167, 176, 187, 188, 191, 192, 223, 304, 349, 357, 382, 393
semiotic function 85, 93-95
semiotics 7, 85, 89-92, 94, 99, 105, 132, 351, 381, 383-386, 388, 393, 394
sign 85, 86, 92-109, 114-116, 118, 120, 200, 223, 226, 304, 319, 320, 381-384, 386-389, 391-395, 402, 438, 440, 450
simulation theories 5, 12, 147, 241, 243, 245, 250, 258-260
situated practice 431-433, 450-453
social interaction 7, 9, 43, 47, 48, 129-133, 137, 140, 144, 145, 147, 148, 151-153, 252, 253, 307, 309, 313, 314, 411, 417, 426
society 141, 381, 382, 397, 399, 400, 403-405
space 8, 9, 23, 28, 29, 33, 38, 66, 67, 86, 87, 93, 110, 113, 119, 167-178, 183, 185, 186, 188, 190, 197-199, 214-217, 219, 224, 225, 231, 232, 281, 299, 317, 319, 339-341, 346, 360, 367, 369, 414, 417, 423, 426, 445-448
speech 9, 47, 59, 140, 174, 181, 223, 226, 231, 255, 305-307, 319, 326, 411-418, 425, 431, 434-438, 442, 443, 445, 448, 450, 451, 453
synaesthesia 8, 197, 199, 216-223, 225, 230, 231
unilateral neglect 271, 273, 277-279