Embodiment, Ego-Space, and Action
Carnegie Mellon Symposia on Cognition
David Klahr, Series Editor

Anderson • Cognitive Skills and Their Acquisition
Carroll/Payne • Cognition and Social Behavior
Carver/Klahr • Cognition and Instruction: Twenty-Five Years of Progress
Clark/Fiske • Affect and Cognition
Cohen/Schooler • Scientific Approaches to Consciousness
Cole • Perception and Production of Fluent Speech
Farah/Ratcliff • The Neuropsychology of High-Level Vision: Collected Tutorial Essays
Gershkoff-Stowe/Rakison • Building Object Categories in Developmental Time
Granrud • Visual Perception and Cognition in Infancy
Gregg • Knowledge and Cognition
Just/Carpenter • Cognitive Processes in Comprehension
Kimchi/Behrmann/Olson • Perceptual Organization in Vision: Behavioral and Neural Perspectives
Klahr • Cognition and Instruction
Klahr/Kotovsky • Complex Information Processing: The Impact of Herbert A. Simon
Lau/Sears • Political Cognition
Lovett/Shah • Thinking With Data
MacWhinney • The Emergence of Language
MacWhinney • Mechanisms of Language Acquisition
McClelland/Siegler • Mechanisms of Cognitive Development: Behavioral and Neural Perspectives
Reder • Implicit Memory and Metacognition
Siegler • Children's Thinking: What Develops?
Sophian • Origins of Cognitive Skills
Steier/Mitchell • Mind Matters: A Tribute to Allen Newell
VanLehn • Architectures for Intelligence
Embodiment, Ego-Space, and Action Edited by
Roberta L. Klatzky Brian MacWhinney Marlene Behrmann
Psychology Press Taylor & Francis Group 270 Madison Avenue New York, NY 10016
Psychology Press Taylor & Francis Group 27 Church Road Hove, East Sussex BN3 2FA
© 2008 by Taylor & Francis Group, LLC
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-13: 978-0-8058-6288-1 (0)

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the Psychology Press Web site at http://www.psypress.com
Contents

About the Editors  vii
Contributors  ix
Editors' Preface (Roberta L. Klatzky, Marlene Behrmann, and Brian MacWhinney)  xi

1. Measuring Spatial Perception with Spatial Updating and Action (Jack M. Loomis and John W. Philbeck)  1
2. Bodily and Motor Contributions to Action Perception (Günther Knoblich)  45
3. The Social Dance: On-Line Body Perception in the Context of Others (Catherine L. Reed and Daniel N. McIntosh)  79
4. Embodied Motion Perception: Psychophysical Studies of the Factors Defining Visual Sensitivity to Self- and Other-Generated Actions (Maggie Shiffrar)  113
5. The Embodied Actor in Multiple Frames of Reference (Roberta L. Klatzky and Bing Wu)  145
6. An Action-Specific Approach to Spatial Perception (Dennis R. Proffitt)  179
7. The Affordance Competition Hypothesis: A Framework for Embodied Behavior (Paul Cisek)  203
8. fMRI Investigations of Reaching and Ego Space in Human Superior Parieto-Occipital Cortex (Jody C. Culham, Jason Gallivan, Cristiana Cavina-Pratesi, and Derek J. Quinlan)  247
9. The Growing Body in Action: What Infant Locomotion Tells Us About Perceptually Guided Action (Karen E. Adolph)  275
10. Motor Knowledge and Action Understanding: A Developmental Perspective (Bennett I. Bertenthal and Matthew R. Longo)  323
11. How Mental Models Encode Embodied Linguistic Perspectives (Brian MacWhinney)  369

Author Index  411
Subject Index  419
About the Editors
Marlene Behrmann, PhD, is a professor in the Department of Psychology, Carnegie Mellon University, and has appointments in the Center for the Neural Basis of Cognition (Carnegie Mellon University and University of Pittsburgh) and in the Departments of Neuroscience and Communication Disorders at the University of Pittsburgh. Her research focuses on the psychological and neural mechanisms that underlie the ability to recognize visual scenes and objects, represent them internally in visual imagery, and interact with them through eye movements, reaching and grasping, and navigation. One major research approach involves the study of individuals who have sustained brain damage that selectively affects their visual processes, including individuals with lesions to the parietal cortex and to the temporal cortex. This neuropsychological approach is combined with several other methodologies, including behavioral studies with normal subjects, simulating neural breakdown using neural network models, and examining the biological substrate using functional and structural neuroimaging to elucidate the neural mechanisms supporting visual cognition.

Roberta L. Klatzky, PhD, is a professor of psychology at Carnegie Mellon University, where she is also on the faculty of the Center for the Neural Basis of Cognition and the Human–Computer Interaction Institute. She received a BS in mathematics from the University of Michigan and a PhD in experimental psychology from Stanford University. Before coming to Carnegie Mellon, she was a member of the faculty at the University of California, Santa Barbara. Klatzky's research interests are in human perception and cognition, with special emphasis on spatial cognition and haptic perception. She has done extensive research on human haptic and visual object recognition, navigation under visual and nonvisual guidance, and perceptually guided action. Her work has application to navigation aids for the blind, haptic interfaces, exploratory robotics, teleoperation, and virtual environments. Professor Klatzky is the author of over 200 articles and chapters, and she has authored or edited six books.

Brian MacWhinney, PhD, is a professor of psychology at Carnegie Mellon University. He is also on the faculty of Modern Languages and the Language Technologies Institute. His work has examined a variety of issues in first and second language learning and processing. Recently, he has been exploring the role of embodiment in mental imagery as a support for language processing. He proposes that this embodied mental imagery is organized through a system of perspective taking that operates on the levels of direct perception, space/time/aspect, action plans, and social schemas. Grammatical structures, such as pronominalization and relativization, provide methods for signaling perspective switches on each of these levels. He is interested in relating this higher level psycholinguistic account to basic neural and perceptual mechanisms for the construction and projection of the body image.
Contributors

Karen E. Adolph, PhD
Department of Psychology, New York University, New York, New York (USA)

Bennett I. Bertenthal, PhD
Department of Psychology, University of Indiana, Bloomington, Indiana (USA)

Cristiana Cavina-Pratesi, PhD
Department of Psychology, University of Durham, Durham, Great Britain (UK)

Paul Cisek, PhD
Department of Physiology, University of Montreal, Montreal, Quebec (Canada)

Jody C. Culham, PhD
Department of Psychology and Neuroscience Program, University of Western Ontario, London, Ontario (Canada)

Jason Gallivan, PhD
Neuroscience Program, University of Western Ontario, London, Ontario (Canada)

Günther Knoblich, PhD
Rutgers University, Newark, New Jersey (USA), and Center for Interdisciplinary Research, University of Bielefeld, Bielefeld, Eastern Westphalia (Germany)

Matthew R. Longo, PhD
Institute of Cognitive Neuroscience, University College London, London, Great Britain (UK)

Jack M. Loomis, PhD
Department of Psychology, University of California, Santa Barbara, Santa Barbara, California (USA)

Daniel N. McIntosh, PhD
Department of Psychology, University of Denver, Denver, Colorado (USA)

John W. Philbeck, PhD
Department of Psychology, George Washington University, Washington, D.C. (USA)

Dennis R. Proffitt, PhD
Department of Psychology, University of Virginia, Charlottesville, Virginia (USA)

Derek J. Quinlan, PhD
Neuroscience Program, University of Western Ontario, London, Ontario (Canada)

Catherine L. Reed, PhD
Department of Psychology, University of Denver, Denver, Colorado (USA)

Maggie Shiffrar, PhD
Department of Psychology, Rutgers University, Newark, New Jersey (USA)

Bing Wu, PhD
Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania (USA)
Editors’ Preface
This volume is a collection of papers presented at the 34th Carnegie Symposium on Cognition, held at Carnegie Mellon University in Pittsburgh in June 2006. The symposium was motivated by the increasing visibility of a research approach that has come to be called embodiment.

But what, exactly, is embodiment? For an insight into this question, consider a seemingly elementary action: breathing. When we breathe, we engage in a series of inhalations that are largely involuntary. These inhalations are physiologically equivalent to the inhalations we produce voluntarily when we want to sniff a flower. Even so, are the effects of the involuntary inhalations the same as the effects of the voluntary sniffs? Brain imaging shows that there are clear differences in activation, depending on the intention of the sniffer (Zelano et al., 2005). If it is true that even something as basic as breathing is contextualized, we should not be surprised to find that perception and cognition are permeated by the context of the self in the world, where that context is sensory, spatial, temporal, social, and goal directed. That is embodiment.

Our definition of embodiment takes contextual influence as its cornerstone. It is based on the assumption that the way people perceive and act in the world around them is influenced by their ongoing representations of themselves in that world. The overarching goal of the 34th symposium was to further our understanding of embodiment from multiple perspectives—mechanistic (including computational), neurophysiological, and developmental. We intended to identify important phenomena that reflect embodiment, advance theoretical understanding across a broad range of related fields, and provide a foundation for future efforts in this emerging area of research.
The formulation of embodiment underlying the shaping of the symposium was purposefully general, in order to accommodate the variety of approaches we saw as relevant. One particularly clear position has been articulated by Clark (1999). Clark seized on J. J. Gibson's concept of affordance as the central construct of embodiment. Gibson proposed that the environment offers or "affords" to the perceiver/actor direct cues about what actions are possible. As the organism acts on the environment, its own state changes, as do the affordances offered by the world, creating an ongoing dynamic chain. When an organism is able to create an internal simulation of these dynamic events, presumably by invoking its intrinsic perceptual–motor mechanisms, something new happens: A mode of processing emerges that facilitates perception and cognition. It is the embodied mode.

The idea that an embodied processing mode exists has won increasing acceptance, but its scope remains a point of contention. As formulated by Clark (1999), an extreme view of embodied cognitive science is that all thinking is embedded in body-based dynamic simulations; there is no fixed modular structure for the mind, nor do theorists need to postulate abstracted representations. Wilson (2002, p. 626) described as radical the view that, "The information flow between mind and world is so dense and continuous that, for scientists studying the nature of cognitive activity, the mind alone is not a meaningful unit of analysis." We are not such extremists, but we find the range of phenomena that potentially reflect the embodied context to be rich indeed.

Consider some everyday observations that might invoke simulation of the body: A person turns to a friend while walking but maintains her course unerringly. An infant who is just mastering the act of walking accommodates immediately to wearing a heavy coat. A tennis player finds her game improved after watching a professional match. A newborn imitates a face he sees. Grandparents and grandchildren watching a puppet show respond to the dolls' gestures and expressions as if they were people.

Few formalisms have been put forward to describe how the context of the perceiver/actor functions at a mechanistic level and what neural structures support those functions. At this point, we have more phenomena than we have mechanisms. Behavioral research has revealed a number of tantalizing outcomes that point to a role for the representation of the body in basic human function. Embodiment has been theorized to play a role in eye movements, reaching and grasping, locomotion and navigation, infant imitation, spatial and social perspective taking, problem solving, and dysfunctions as diverse as phantom limb pain and autism. Neuroscientists have identified multiple sensorimotor maps of the body within the cortex, specific brain areas devoted to the representation of space and place, and cells that acknowledge the relation between one's own and another's movements. Developmental researchers have studied neonatal behaviors indicating a representation of self and have traced the course of spatially oriented action across the early years. Computational modelers have pointed to sensory-based feed-forward mechanisms in motor control.

In organizing the symposium, we felt that what was needed was a shared effort to merge these perspectives to further our understanding of the forms and functional roles of the embodied representation. As a potentially useful starting point, we suggest three embodied perspectives that might form a context for perceiving, acting, and thinking:

1. The body image is the ongoing internal record of the relative disposition of body parts across time.
2. The body schema is the set of potential body images, where the potential pertains either to the capability of all members of a species or specifically to one's own self. (We acknowledge that these terms have a variety of definitions in the literature—see Paillard, 1999, following Head & Holmes, 1911.)
3. The spatial image is a representation of the current disposition of the body within surrounding space at a given point in time.
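The distinction among the three constructs can be made concrete as a toy data sketch. The types and field names below are purely illustrative inventions for exposition, not structures proposed by the editors:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class BodyImage:
    """Ongoing internal record of the relative disposition of body parts."""
    time: float
    joint_positions: Dict[str, Vec3]  # e.g. {"left_wrist": (x, y, z)} in a body-centered frame

@dataclass
class BodySchema:
    """The set of potential body images: configurations the body could adopt."""
    potential_images: List[BodyImage] = field(default_factory=list)

@dataclass
class SpatialImage:
    """Current disposition of the whole body within surrounding space."""
    time: float
    position: Vec3      # body location in an environment-centered frame
    heading_deg: float  # facing direction in that frame

# One body image at a single instant, the schema containing it as one of many
# possible configurations, and a spatial image locating the body in the room:
now = BodyImage(time=0.0, joint_positions={"left_wrist": (-0.3, 0.9, 0.2)})
schema = BodySchema(potential_images=[now])
here = SpatialImage(time=0.0, position=(2.0, 0.0, 5.0), heading_deg=90.0)
```

Note how the sketch separates the two frames of reference: the body image is defined relative to the body itself, while the spatial image embeds the body in the surrounding environment.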
How might the body image, body schema, and spatial image play a role in perception and action? Directly opposing the extreme view of embodiment, what we will call the nonembodied approach would postulate that body-based representations are merely elements of information processing that provide necessary data for computations. For example, corollary-discharge theory describes an algorithm that enables the organism to keep a stable spatial image during an eye movement, whereby the updating of eye-position coordinates cancels the flow of visible elements in retinal coordinates. Both coordinate systems exist and provide necessary data, but this mechanism imputes no special status to them. As another example, updating of the spatial image while walking without vision might be performed
by sensing proprioceptive signals, deriving estimates of translational and rotational velocity, and feeding those to an internal process that integrates the signals over time. In this example, as in the first, the afferent signals and the derived estimate are merely data; the fact that they represent the body has no special mechanistic implications. This model requires neural mechanisms related to the self solely to support a coherent body experience.

A second model, which we call the mapping model, proposes that the body image and schema function as integrated representations of relatively complex sensorimotor patterns. However, this model further stipulates that body representations play no fundamental role beyond serving as complex elements for purposes of input matching and output generation. On the input side, sensorimotor patterns would resonate to perceptual inputs, hence enabling recognition (conscious or unconscious) of one's own or another's action. On the output side, the body image serves as a token for motor synergies that execute complex actions. Activation of the body image would initiate execution of the corresponding motor program, but beyond this "button-pressing" function, embodiment would play no direct role in motor control. The mapping model requires neural mechanisms to represent complex patterns in perception and action, but does not propose that they are simulated to provide context.

A third approach, which we call the embodied model, assumes that body representations are a unique form of data that are mechanistically involved in a broad set of information-processing capabilities, by virtue of perceptual–motor simulation and the context it provides. The visual input from another's actions might be interpreted by activating the body image, analogously to the analysis-by-synthesis theory of speech perception, which postulates that listeners create or synthesize an ongoing predictive model of the speech that they are hearing. The body schema might allow performers to compare the observed behavior of another to their own habitual actions, enabling them to improve by watching an expert. The spatial image would be used to plan pathways through the immediate environment.

This framing of the embodied model gives rise to fundamental issues, including: How are the body image, body schema, and spatial image implemented, functionally and neurophysiologically? (Some implementation is required, whether one assumes the nonembodied model, where these representations merely support subjective impressions, or the mapping or embodied model, where they
function directly in information processing.) How do these entities function in thinking, as well as perceiving and acting? How and for what purposes are diverse body parts integrated into a representation of the self, and how is this representation updated as the person/environment linkage changes through external forces or the person's own actions? What kinds of neural structures support simulated movements of the body that might be used for learning and premovement planning? What are the developmental origins and time course of the body image, spatial image, and body schema?

Authors of this volume bring to bear on these and related questions a broad range of theory and empirical findings. Biological foundations and models are dealt with by Culham and Cisek; the spatial image is the focus of chapters by Klatzky, Loomis, and Proffitt; Reed and Shiffrar consider the body image and body schema; Knoblich, Adolph, and Bertenthal provide the developmental viewpoint; and MacWhinney brings a linguistic perspective. The symposium leading to this book was charged with excitement about the specific research presented and the overall perspective of embodiment, and it is our hope that its publication will enable readers to share in that excitement.

We gratefully acknowledge the symposium support provided by the National Science Foundation under Grant No. 0544568.

Roberta Klatzky
Marlene Behrmann
Brian MacWhinney

References

Clark, A. (1999). An embodied cognitive science? Trends in Cognitive Sciences, 3, 345–351.
Head, H., & Holmes, G. (1911). Sensory disturbances from cerebral lesions. Brain, 34, 102–254.
Paillard, J. (1999). Body schema and body image—A double dissociation in deafferented patients. In G. N. Gantchev, S. Mori, & J. Massion (Eds.), Motor control, today and tomorrow (pp. 197–214). Sofia, Bulgaria: Academic.
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin & Review, 9, 625–636.
Zelano, C., Bensafi, M., Porter, J., Mainland, J., Johnson, B., Bremner, E., et al. (2005). Attentional modulation in human primary olfactory cortex. Nature Neuroscience, 8, 114–120.
1 Measuring Spatial Perception with Spatial Updating and Action
Jack M. Loomis and John W. Philbeck
Measurement of perceived egocentric distance, whether of visual or auditory targets, is a topic of fundamental importance that is still being actively pursued and debated. Beyond its intrinsic interest to psychologists and philosophers alike, it is important to the understanding of many other topics which involve distance perception. For example, many complex behaviors like driving, piloting of aircraft, sport activities, and dance often involve distance perception. Consequently, understanding when and why errors in distance perception occur will illuminate the reasons for error and disfluency in these behaviors. Also, the understanding of distance perception is important in the current debate about the "two visual systems," one ostensibly concerned with the conscious perception of 3-D space and the other with on-line control of action. Similarly, determining whether nonsensory factors, such as intention to act and energetic state of the observer, influence perceived distance, as has been claimed (e.g., Proffitt, Stefanucci, Banton, & Epstein, 2003; Witt, Proffitt, & Epstein, 2004, 2005), depends critically on the meaning of distance perception and how it is to be measured. Still another topic where measurement of distance perception is critical is spatial updating (the imaginal updating of a target perceived only prior to observer movement) involving observer translation. Being able to measure the accuracy of spatial updating depends upon being able to partial out errors due to misperception of the initial target distance (Böök & Gärling, 1981; Loomis, Klatzky, Philbeck, & Golledge, 1998; Loomis, Lippa, Klatzky, & Golledge, 2002; Philbeck, Loomis, & Beall, 1997). Finally, measurement of distance perception is important for the development of effective visual and auditory displays of 3-D space. Indeed, developing virtual reality systems that exhibit naturally appearing scale has proven an enormous challenge, both for visual virtual reality (Loomis & Knapp, 2003) and for auditory virtual reality (Loomis, Klatzky, & Golledge, 1999), and there has been a spate of recent research articles concerned with understanding the causes for uniform scale compression in many visual virtual environments (e.g., Creem-Regehr, Willemsen, Gooch, & Thompson, 2005; Knapp, 1999; Knapp & Loomis, 2004; Sahm, Creem-Regehr, Thompson, & Willemsen, 2005; Thompson, Willemsen, Gooch, Creem-Regehr, Loomis, et al., 2004). Virtual reality systems that successfully create a realistic sense of scale will enjoy even greater aesthetic impact and user acceptance and will prove even more useful in the training of skills, such as safe road crossing behavior by blind and sighted children.

Indirectness of Perception

Naïve realism is the commonsense view that the world we encounter in everyday life is identical with the physical world that we come to know about through our schooling. Following decades of intellectual inquiry, philosophers of mind and scientists have come to an alternate view referred to as "representative realism"—that contact with the physical world is indirect and that what we experience in everyday life is a representation created by our senses and central nervous system (e.g., Brain, 1951; Koch, 2003; Lehar, 2003; Loomis, 1992; Russell, 1948; Smythies, 1994).
Indeed, this representation, generally referred to as the phenomenal world, is so highly consistent and veridical that we routinely make life-depending decisions without ever suspecting that the perceptual information upon which we are relying is once removed from the physical world. The high degree of functionality of the perceptual process accounts for its being self-concealing and for the reason that most laypeople and indeed many scientists think of perception as little more than attention to aspects of the environment.
The representational nature of perceptual experience is easy to appreciate with color vision because the mapping from physical stimulation to perceptual space entails a huge loss of information, from the many dimensions of spectral lights to the three perceptual dimensions of photopic color vision. In order to appreciate the representational nature of perception more generally, it is helpful to keep in mind such perceptual phenomena as diplopia, binocular stereopsis elicited by stereograms, geometric visual illusions, and motion illusions; such phenomena point to a physical world beyond the world of appearance. Although experiencing such phenomena momentarily reminds us of the representational nature of perception, we too easily lapse back into naïve realism when driving our cars, engaging in sports activity, and interacting with other people. It is quite an intellectual challenge to appreciate that the very three-dimensional world we experience in day-to-day life is an elaborate perceptual representation. Indeed, many people seem to be naïve realists when it comes to visual space perception, for they think of visual space perception largely as one of judging distance. But visual space perception is so much more than this—it gives rise to our experience of the surrounding visual world, consisting of surfaces and objects lying in depth (e.g., Gogel, 1990; Howard & Rogers, 2002; Loomis, Da Silva, Fujita, & Fukusima, 1992; Marr, 1982; Ooi, Wu, & He, 2006; Wu, Ooi, & He, 2004). Virtual reality makes the representational nature of visual space perception obvious (Loomis, 1992), for the user experiences being immersed within environments which have no physical existence (other than being bits in computer memory). Teleoperator systems are useful for drawing the same conclusion.

Consider a visual teleoperator system consisting of a head-mounted binocular display and externally mounted video cameras for driving the display. The user of such a teleoperator system experiences full presence in the physical environment while being intellectually aware that the visual stimulation comes only indirectly by way of the display. Because the added degree of mediation associated with the display pales in comparison with the degree of mediation associated with visual processing, the representational nature of perception when using a teleoperator points to the representational nature of ordinary perception.

How one conceives of perception determines how one goes about measuring perceived distance. For the researcher who accepts naïve realism, perceiving distance is simply a matter of judging distance in "physical space." Under this conception, one can simply ask the
observer how far away objects are and then correct for any judgmental biases, such as reporting 1 m as 2 m. In contrast, for researchers who adhere to the representational conception, the measurement of distance perception is a major challenge, inasmuch as one is attempting to measure aspects of an internal representation. Because one starts with behavior of some kind (e.g., verbal report, action) and because there can be distortions associated with the readout from internal representation to behavior, measurement of perception depends on a theory connecting internal representation to behavior, a theory that is best developed using multiple response measures (e.g., Foley, 1977; Philbeck & Loomis, 1997).
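One simple readout theory of this kind treats reported distance as proportional to physical distance, so that a single calibration slope summarizes the bias; the chapter describes verbal-report data as well fit by such zero-intercept linear functions. A minimal sketch of that fit (the data values below are invented for illustration, not taken from any of the cited studies):

```python
def fit_zero_intercept_slope(physical, reported):
    """Least-squares slope k for the model reported = k * physical,
    i.e., a regression line forced through the origin: k = sum(x*y) / sum(x*x)."""
    num = sum(p * r for p, r in zip(physical, reported))
    den = sum(p * p for p in physical)
    return num / den

# Hypothetical verbal reports that systematically compress distance:
physical = [2.0, 5.0, 10.0, 20.0, 28.0]   # target distances (m)
reported = [1.7, 4.1, 8.2, 15.8, 22.1]    # verbal reports (m)

k = fit_zero_intercept_slope(physical, reported)
print(f"calibration slope: {k:.2f}")  # a slope below 1 indicates compression
```

With exactly proportional data the fit recovers the slope exactly; with noisy data it gives the least-squares compromise, which is how a summary value such as a mean slope across studies can be obtained.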
Some Methods for Measuring Perceived Distance Verbal report and magnitude estimation are t wo traditional methods f or m easuring per ceived d istance ( Da S ilva, 1 985). F igure 1 .1 gives the results of a number of studies using verbal report for target distances out to 28 m (Andre & R ogers, 2006; Foley, Ribeiro, & Da Silva, 2 004; Kelly, L oomis, & B eall, 2 004; K napp & L oomis, 2 004; Loomis et al., 1998; Philbeck & Loomis, 1997). The data sets are generally well fit by linear functions with 0 intercepts, but the slopes are generally less than 1.0. The mean slope is 0.80. Concerns abo ut t he pos sible i ntrusion o f k nowledge a nd bel ief into such judgments (Carlson, 1977; Gogel, 1974) have prompted the search for alternative methods. So-called indirect methods make use of other perceptual judgments thought to be less subject to intrusion by cognitive factors and then derive estimates of perceived distance by way of theory. Several of these methods rely on so-called perceptpercept co uplings. S pace per ception r esearchers ha ve l ong k nown that per ceptual va riables o ften co vary w ith o ne a nother ( Epstein, 1982; Gogel, 1984; Sedgwick, 1986). In some cases these covariations may be t he result of joint determination by common stimulus variables, but in other cases variation in one perceptual variable causes variation in another (Epstein, 1982; Gogel, 1990; Oyama, 1977); such causal covariations are referred to as percept-percept couplings. The best k nown coupling i s t hat be tween perceived s ize a nd perceived egocentric distance and is referred to as size-distance invariance (Gilinsky, 1 951; M cCready, 1 985; S edgwick, 1 986). S ize-distance invariance i s t he r elationship be tween perceived s ize (S’) a nd perceived egocentric distance (D’) for a v isual stimulus of angular size
Measuring Spatial Perception with Spatial Updating and Action
5
α: S ’ = 2D’ t an (α/2). A spec ial c ase o f s ize-distance i nvariance i s Emmert’s Law—varying the perceived distance of a stimulus causes perceived s ize to v ary p roportionally, w ith a ngular s ize h eld co nstant. Another coupling of perceptual variables is that between the perceived distance of a target and its perceived motion (Gogel, 1982, 1993). G ogel dem onstrated t hat t he perceived m otion o f a n object can be altered by mere changes in its perceived distance while keeping all other variables constant. He developed a quantitative theory for this coupling between perceived distance and perceived motion and applied it in explaining the apparent motion of a variety of stationary ob jects, such a s dep th-reversing figures a nd t he i nverted facial mask (Gogel, 1990). The existence of percept-percept couplings is m ethodologically i mportant, for t hese couplings c an be u sed t o measure perceived distance in situations where the researcher wishes observers not to be aware that perceived distance is being measured. Judgments of perceived size and perceived motion have been used to measure perceived distance (e.g., Gogel, Loomis, Newman, & Sha rkey, 1985; Loomis & K napp, 2003) and to demonstrate the effect of an e xperimental manipulation on perceived d istance (e.g., Hutchison & Loomis, 2006a). Another indirect method of measuring perceived distance involves judgments of collinearity and relies on the perception of exocentric direction. A visible pointer is adjusted by the observer to be aligned with t he t arget st imulus (W u, K latzky, Sh elton, & S tetten, 2 005); these authors used the method to measure the perceived distance of targets within arm’s reach under the assumption that the pointer is perceived correctly. 
Application of the method to the measurement of large perceived distances seems promising, but the method will have to compensate for systematic biases in exocentric direction perception (Cuijpers, Kappers, & Koenderink, 2000; Kelly et al., 2004).

Still other indirect methods rely on judgments of perceived exocentric extent and attempt, by way of theory, to construct scales of perceived distance. The best known example is the work by Gilinsky (1951) and, more recently, Ooi and He (2007). In their experiments, observers constructed a set of equal-appearing intervals on the ground extending directly away from the observer. The more distant intervals had to be made progressively larger in order to appear of constant size. Assuming that perceived egocentric distance over the ground plane to a given point is the concatenation of the equal-appearing intervals up to that point, the derived perceived distance can be associated with the corresponding cumulative physical
Figure 1.1 Summary of verbal reports of distance for visual targets. The data from the different studies have been displaced vertically for purposes of clarity. The dashed line in each case represents correct responding. Sources from top to bottom: Experiment 3 of Philbeck and Loomis (1997), experimental condition of Experiment 2A of Loomis et al. (1998), calibration condition of Experiment 2A of Loomis et al. (1998), results for gymnasts in Experiment 1 of Loomis et al. (1998), results of the full field of view condition of Knapp and Loomis (2004), mean data from the control conditions in the 3 experiments of Andre and Rogers (2006), results of Kelly, Loomis, and Beall (2004), and egocentric distance judgments of Foley et al. (2004).
distance. The derived function of perceived egocentric distance is compressively nonlinear even within 10 m and under full-cue conditions. Because the derived function is noticeably discrepant with other functions to be discussed here and because there are process interpretations for doubting that the derived function is indeed a measure of perceived distance, we do not discuss it further.
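The derivation just described can be sketched numerically: each constructed interval looks equally large, so perceived distance advances one unit per boundary, while physical distance advances by the actual widths laid on the ground. The widths and function name below are our own toy assumptions, not data from Gilinsky or Ooi and He:

```python
from itertools import accumulate

def derived_perceived_distance(interval_widths_m):
    """Pair cumulative physical distance with derived perceived distance.

    Because the intervals are constructed to appear equal, perceived
    distance at the k-th boundary is taken to be k equal units; the
    physical distance is the running sum of the actual widths.
    """
    physical = list(accumulate(interval_widths_m))
    perceived = list(range(1, len(physical) + 1))
    return list(zip(physical, perceived))

# Far intervals must be made physically larger to appear equal, so the
# perceived/physical ratio falls with distance: a compressive function.
widths = [1.0 * 1.2 ** k for k in range(6)]
ratios = [p / d for d, p in derived_perceived_distance(widths)]
assert all(later < earlier for earlier, later in zip(ratios, ratios[1:]))
```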
Methods Based on Action and Spatial Updating

Given the importance of distance perception and the lack of consensus about how to measure it, researchers have occasionally proposed new measurement procedures. Here, we focus on relatively new methods for measuring perceived distance that rely on action, sometimes with the involvement of spatial updating. The typical procedure begins with the stationary observer viewing or listening to a target stimulus. After this period of "preview," further perceptual information about the target is removed by occluding vision and hearing, and the observer attempts to demonstrate knowledge of the target's location by some form of action (e.g., pointing, walking, or throwing a ball). Visually directed pointing was a term coined by Foley and Held (1972) to refer to blind pointing with the finger to the 3-D location of a visual target that had been previously viewed. This type of response has been used in other studies to measure the perceived locations of visual targets within arm's reach (e.g., Bingham, Bradley, Bailey, & Vinner, 2001; Foley, 1977; Loomis, Philbeck, & Zahorik, 2002). For more distant targets, ball or bean bag throwing has been used (Eby & Loomis, 1987; He, Wu, Ooi, Yarbrough, & Wu, 2004; Sahm et al., 2005; Smith & Smith, 1961). Another form of visually directed action, blind walking (sometimes called "open-loop walking"), has been used to study the perception of distances of "action space" (distances beyond reaching but within the range of most action planning; Cutting & Vishton, 1995); here the observer typically views a target on the ground and attempts to walk to its location without vision.
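Results in this literature are commonly summarized by linear fits with the intercept fixed at zero; a minimal sketch of such a fit (the data points are invented for illustration):

```python
def zero_intercept_slope(physical_m, indicated_m):
    """Least-squares slope of indicated = slope * physical (intercept = 0)."""
    numerator = sum(x * y for x, y in zip(physical_m, indicated_m))
    denominator = sum(x * x for x in physical_m)
    return numerator / denominator

# Made-up responses resembling accurate blind walking (slope near 1.0).
physical = [2.0, 4.0, 6.0, 8.0, 10.0]
walked = [2.1, 3.9, 6.2, 7.8, 10.1]
assert 0.9 < zero_intercept_slope(physical, walked) < 1.1
```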
These various forms of open-loop behavior, along with others to be discussed, are referred to collectively as "perceptually directed action." Many studies have used blind walking to assess the accuracy of perceiving the distances of targets viewed on the ground under full-cue conditions, for distances up to 28 m (Andre & Rogers, 2006; Corlett, Byblow, & Taylor, 1990; Corlett & Patla, 1987; Creem-Regehr et al., 2005; Elliott, 1987; Elliott, Jones, & Gray, 1990; Knapp & Loomis, 2004; Loomis et al., 1992; Loomis et al., 1998; Messing & Durgin, 2005; Rieser, Ashmead, Talor, & Youngquist, 1990; Steenhuis & Goodale, 1988; Thomson, 1983; Wu et al., 2004). Figure 1.2 shows many of the results, with the data sets shifted vertically for purposes of clarity. Except for two data sets, perceived distance is proportional to physical distance with no evidence of systematic error (slopes of the best fitting linear functions are generally close to 1, and intercepts are near zero). In contrast, when the same task, modified for audition, is used to study distance perception of sound-emitting sources heard out-of-doors, systematic errors are observed over the
Figure 1.2 Summary of blind walking results for vision. The data from the different studies have been displaced vertically for purposes of clarity. The dashed line in each case represents correct responding. Sources from top to bottom: Experiment 3 of Philbeck and Loomis (1997), Experiment 1 of Wu, Ooi, and He (2004), mean data from the control conditions in the 3 experiments of Andre and Rogers (2006), results of Elliott (1987), average of two groups of observers from Experiment 1 of Loomis et al. (1998), Experiment 2b of Loomis et al. (1998), Experiment 1 of Loomis et al. (1992), Thomson (1983), Rieser et al. (1990), Steenhuis and Goodale (1988), and Experiment 2a of Loomis et al. (1998).
same range of distances (Ashmead, DeFord, & Northington, 1995; Loomis et al., 1998; Speigle & Loomis, 1993). Figure 1.3 shows representative results; this time, the data sets have not been shifted vertically. The best linear functions have slopes close to 0.5, indicating response compression relative to the stimulus range, and there is considerable variability in the intercepts. We should mention, however, that in a recent review of these and other results obtained using other response measures including verbal report, Zahorik, Brungart, and Bronkhorst (2005) fit power functions to the data and generally found exponents less than 1.0, the interpretation being that perceived auditory distance is a compressively nonlinear function of source distance. Still, for the range of distances in Figure 1.3, the conclusion that they are linear functions with roughly constant slope but
Figure 1.3 Summary of blind walking results for audition. These are the actual data and have not been displaced vertically for purposes of clarity. The dashed line represents correct responding. Sources from top to bottom: Ashmead et al. (1995), Speigle and Loomis (1993), Experiment 1 of Loomis et al. (1998), and Experiment 2a of Loomis et al. (1998).
varying intercept is justified. The source of the variation in intercept is a mystery.

With vision, systematic errors do arise. Sinai, Ooi, and He (1998) and He et al. (2004) have found that when the ground surface is interrupted by a gap, visual targets resting on the ground are mislocalized even with full-cue viewing. Larger systematic errors occur when visual cues to distance are minimal. Figure 1.4 gives the results of a study by Philbeck and Loomis (1997) in which blind walking responses and verbal reports were obtained under two conditions: reduced cues (luminous targets of constant angular size at eye level in the dark) and full cues (the same targets placed on the floor with normal room lighting). When cues were minimal, both types of judgment showed large systematic errors, and when cues were abundant, both types of judgment showed small systematic errors. This study also showed that when the verbal responses were plotted against the walking responses in these and two other conditions, the data were
Figure 1.4 Results of an experiment using both verbal report and visually directed blind walking in reduced-cue and full-cue conditions. Adaptation of Figure 5 from Philbeck, J. W., & Loomis, J. M. (1997). Comparison of two indicators of visually perceived egocentric distance under full-cue and reduced-cue conditions. Journal of Experimental Psychology: Human Perception and Performance, 23, 72–85.
well fit by a single linear function, suggesting that variations in the two response measures are controlled by the same internal variable, visually perceived distance.

A related experiment was concerned with measuring the perceptual errors in visual virtual reality (Sahm et al., 2005). Observers performed blind walking and bean bag throwing to targets in both a real environment and a virtual environment modeled on the real environment. Prior to testing, observers were given feedback only about their throwing performance in the real environment. The results are given in Figure 1.5. The fact that the transition from the real to the virtual environment produces the same errors in walking and throwing supports the claim that the two actions, one that involves locomotion and the other that does not, are controlled by the same internal variable, visually perceived distance. In addition, the results provide further support for the growing consensus that current virtual reality systems produce underperception of distance (Knapp, 1999; Thompson et al., 2004).

Triangulation Methods

The similarity of the walking measures and verbal reports above might be taken as evidence of a simple strategy for performing blind
Figure 1.5 Results of an experiment using both visually directed blind walking and visually directed throwing in real and virtual environments. Reprinting of Figure 3 of Sahm, C. S., Creem-Regehr, S. H., Thompson, W. B., & Willemsen, P. (2005). Throwing versus walking as indicators of distance perception in similar real and virtual environments. ACM Transactions on Applied Perception, 2, 35–45.
walking: while perceiving the target, estimate its distance in feet or meters, and then, with vision and hearing occluded, walk a distance equal to the estimate. Whereas the blind walking task might well be performed using this simple strategy, there are other closely related tasks that cannot. Foremost are triangulation tasks that require the observer to constantly update the estimated location of the target while moving about in the absence of further perceptual input specifying its location. Figure 1.6 depicts three triangulation tasks that have been used. In "triangulation by pointing," the observer views (or listens
Figure 1.6 Three triangulation methods (see text for explanation).
to) a target and then, without further perceptual information about its location, walks along a straight path to a new location (specified by an auditory or haptic signal) and then points toward the target. The pointing direction is used to triangulate the initially perceived and spatially updated target location. In one variant, the arm orientation was monitored continuously as the observer walked along a straight path (Loomis et al., 1992) after viewing a target on the ground up to 5.7 m away; the average pointing responses were highly accurate. "Triangulation by walking" (also, "triangulated walking") is similar to triangulation by pointing except that after the initial straight path, the observer turns and walks a short distance toward the target. The walking direction after the turn is used to triangulate the perceived and updated target location. Finally, in the "indirect walking" version of triangulation, the observer walks to a turn point (specified by an auditory or haptic signal) and then attempts to walk the rest of the way to the updated target location. Figure 1.7 gives the results of a number of experiments using the different triangulation methods to measure the perceived distance of targets viewed under full-cue conditions (Fukusima, Loomis, & Da Silva, 1997; Knapp, 1999; Loomis et al., 1998; Philbeck et al., 1997; Thompson et al., 2004); as in Figure 1.2, the data sets have been vertically shifted for clarity. Although the data are more variable than with blind walking (Figure 1.2), they indicate overall that perceived distance is proportional to target distance with little systematic error.

A Model of Perceptually Directed Action

Because blind walking and the triangulation methods just mentioned rely on actions that occur after the percept has disappeared, it might be argued that these methods cannot be used to measure perception because the action depends upon postperceptual processes (e.g., Proffitt et al., 2006).
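The triangulation-by-pointing geometry described above reduces to a ray intersection: the pointing ray from the observer's new position is intersected with the original line of sight. A minimal 2-D sketch (the coordinate conventions and function name are our own assumptions):

```python
import math

def triangulate_by_pointing(offset_m, pointing_angle_deg):
    """Indicated target distance recovered from a pointing response.

    The observer first views the target straight ahead along +y, then
    walks offset_m to the right along +x and points at the updated
    target. The pointing ray from (offset_m, 0) is intersected with
    the original line of sight (the y-axis).
    """
    theta = math.radians(pointing_angle_deg)  # measured from the +x axis
    ux, uy = math.cos(theta), math.sin(theta)
    t = -offset_m / ux          # ray parameter at which x reaches 0
    return t * uy               # y value there: the indicated distance

# A target whose perceived, updated location is 4 m ahead of the origin
# dictates a pointing angle of atan2(4, -3) after a 3 m sideways walk.
d = triangulate_by_pointing(3.0, math.degrees(math.atan2(4.0, -3.0)))
assert abs(d - 4.0) < 1e-9
```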
However, we maintain that a valid measurement method is one for which the variations in the indicated values (those resulting from the measurement process) are coupled to variations in the variable being measured and for which a calibration between the two has been established (Hutchison & Loomis, 2006b). As with any measurement device (e.g., a thermometer with an electronic display), the indirectness of the mechanism between the variable being measured and the indicator has no bearing on whether the indicated
Figure 1.7 Summary of triangulation results for vision using triangulation by pointing, triangulation by walking, and indirect walking, all obtained under full-cue conditions. The data from the different studies have been displaced vertically for purposes of clarity. The dashed line in each case represents correct responding. Sources from top to bottom: results of indirect walking by Philbeck et al. (1997), results of triangulation by walking in a real environment (Thompson et al., 2004), outdoor results of Knapp (1999), average of two conditions from Experiment 3 (triangulation by walking) of Fukusima et al. (1997), Experiment 3 (direct and indirect walking) of Loomis et al. (1998), average of two conditions from Experiment 4 (triangulation by walking) of Fukusima et al. (1997), and average of two conditions from Experiment 2 (triangulation by pointing) of Fukusima et al. (1997).
values are proper measures of the variable of interest, here perceived distance. In the case of perceptually directed action, what is required is a theory linking the indicated value to perceived distance. Action can be used to measure perception provided that the postperceptual processes introduce no systematic biases or, if they do, that the biases can be corrected for by way of calibration. Of course, as with any measurement device or method, the precision of measurement will ultimately be limited by random noise associated with each of the subsequent processes, even if systematic biases can be eliminated by calibration.

Here, we present a model of perceptually directed action that links the perceptual representation to the observed behavior. The model
Figure 1.8 A block diagram of perceptually directed action (see text for explanation).
involves a number of processing stages (Figure 1.8). For similar models, see Böök and Gärling (1981), Loomis et al. (1992), Medendorp, Van Asselt, and Gielen (1999), and Rieser (1989). First, the visual, auditory, or haptic stimulus gives rise to the percept, which may or may not be coincident with the target location (Figure 1.9). Accompanying the percept is a more abstract and probably more diffuse "spatial image," which continues to exist in representational space even after the percept ceases. There is evidence that the spatial images from different modalities are functionally equivalent, perhaps even amodal in nature (Avraamides, Loomis, Klatzky, & Golledge, 2004; Klatzky, Lippa, Loomis, & Golledge, 2003; Loomis, Lippa et al., 2002). We assume that the spatial image is coincident with the percept, but future research may challenge this assumption; for now, it appears that error in the percept is carried over to the spatial image. When the actor begins moving, sensed changes in position and orientation (path integration) result in spatial updating of the spatial image (Böök & Gärling, 1981; Loarer & Savoyant, 1991; Loomis et al., 1992; Loomis, Klatzky, Golledge, & Philbeck, 1999; Rieser, 1989; Thomson, 1983). At any point in the traverse, as depicted in Figures 1.8 and 1.9, the observer may be asked to make some nonlocomotion response, such as pointing at or throwing to the target or verbally reporting the remaining distance. The response processes clearly are different for different types of response. An important assumption, to be discussed later, is that the response is computed in precisely the same fashion whether based on the concurrent percept of the target or on the spatial image of the target (whenever the percept is absent). This assumption is depicted in Figure 1.8 by the convergence
Figure 1.9 From left to right, depiction of three successive moments during perceptually directed action. A. The observer perceives a target closer than its physical distance. Accompanying the perceived target is a more abstract and spatially diffuse spatial image. B. With the stimulus and its percept no longer present, the observer moves through space, updating the egocentric distance and direction of the target. If path integration is accurate (as depicted here), the spatial image remains stationary with respect to the physical environment. C. After moving, the observer can make another response to the updated spatial image, by continuing to move toward it, by pointing at it, by throwing at it, or by making a verbal report of the distance remaining.
of percepts, initial spatial images, and updated spatial images onto the output transforms for different types of responses.

Not depicted in Figure 1.8 is a nonperceptual input to the creation of a spatial image. Loomis and his colleagues (Avraamides et al., 2004; Klatzky et al., 2004; Loomis, Lippa et al., 2002) have shown that once a person forms a spatial image, whether based on a spatial percept or based on spatial language, subsequent behaviors (like spatial updating and exocentric direction judgments) appear to be indifferent to the source of the input, suggesting that spatial images based on different inputs might be amodal. The implication is that the spatial image produced by vision, hearing, or touch can, in principle, be modified by higher-level cognition so as not to be spatially coincident with the percept. Whether this dissociation between percept and spatial image ever occurs remains to be determined, but the evidence to be reviewed is consistent with the assumption of coincidence.
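The path-integration and updating stage of the model can be sketched as follows: the spatial image stays fixed in world coordinates, and its egocentric coordinates are recomputed from the observer's integrated position and heading. The 2-D coordinate conventions below are our own (heading measured clockwise from the initial facing direction):

```python
import math

def egocentric_target(world_target, position, heading_deg):
    """Egocentric (right, forward) coordinates of a world-fixed spatial image."""
    dx = world_target[0] - position[0]
    dy = world_target[1] - position[1]
    h = math.radians(heading_deg)  # clockwise from +y (initial facing)
    right = dx * math.cos(h) - dy * math.sin(h)
    forward = dx * math.sin(h) + dy * math.cos(h)
    return right, forward

# Target initially perceived 5 m straight ahead; the image is pinned at
# world coordinates (0, 5). Walking 2 m forward leaves it 3 m ahead...
r, f = egocentric_target((0.0, 5.0), (0.0, 2.0), 0.0)
assert abs(r) < 1e-9 and abs(f - 3.0) < 1e-9
# ...and a subsequent 90-degree right turn places it 3 m to the left.
r, f = egocentric_target((0.0, 5.0), (0.0, 2.0), 90.0)
assert abs(r + 3.0) < 1e-9 and abs(f) < 1e-9
```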
Accurate path integration and consequent accurate updating mean that the spatial image remains fixed with reference to the physical environment. If path integration is in error, multiple updated targets move with respect to the physical environment, but they move together rigidly. If only sensed speed, but not heading, during path integration is off by a constant factor, blind walking to a target over very different paths will cause the terminal points to coincide even though the convergence point will not coincide with the initially perceived location. Comparisons of indirect walking responses with direct walking responses to the same visual targets indicate that walking results in accurate path integration and consequent accurate spatial updating (Loomis et al., 1998; Philbeck et al., 1997). Figure 1.10 gives the average terminal locations for direct and indirect walking responses to visual and auditory targets
Figure 1.10 Stimulus layout and results of an experiment on spatial updating of visual and auditory targets by Loomis et al. (1998). The observer stood at the origin and saw or heard a target (X) located either 3 or 10 m distant at an azimuth of -80°, -30°, 30°, or 80°. Without further perceptual information about the target, the observer attempted to walk to its location either directly or indirectly. In the latter case, the observer was guided forward 5 m to the turn point and then attempted to walk the rest of the way to the target. The small open circles are the centroids of the direct path stopping points for the 7 observers, and the small closed circles are the centroids for the indirect path stopping points. Reproduction of Figure 5.6 from Loomis, J. M., Klatzky, R. L., Golledge, R. G., & Philbeck, J. W. (1999). Human navigation by path integration. In R. G. Golledge (Ed.), Wayfinding behavior: Cognitive mapping and other spatial processes (pp. 125–151). Baltimore: Johns Hopkins.
in one of these studies (Loomis et al., 1998). The congruence of the direct and indirect terminal points and their near coincidence with the visual targets demonstrate the accuracy of updating. The fact that the terminal points for audition are farther from the auditory targets (along the initial target directions) than those for vision signifies the poorer accuracy of auditory distance perception compared to visual distance perception. Whereas actively controlled walking to targets at normal walking speeds produces accurate path integration and updating for short paths (e.g., Philbeck, Klatzky, Behrmann, Loomis, & Goodridge, 2001), walking at unusual speeds or passive transport by wheelchair or other conveyance generally results in degraded updating performance (Israël, Grasso, Georges-François, Tsuzuku, & Berthoz, 1997; Juurmaa & Suonio, 1975; Marlinsky, 1999a, 1999b; Mittelstaedt & Glasauer, 1991; Mittelstaedt & Mittelstaedt, 2001; Sholl, 1989), especially for older adults (Allen, Kirasic, Rashotte, & Haun, 2004).

A more recent variant of perceptually directed action provides a means of measuring the 3-D perceived location of a visual stimulus perceived to lie above the ground plane (Ooi, Wu, & He, 2001, 2006). Figure 1.11 depicts the procedure and typical pattern of results for luminous visual targets placed on the ground in an otherwise dark room. At the left, the observer views the target. Because of insufficient distance cues, the target is perceived to be closer than it is, resulting in the percept being elevated. The observer, wearing a blindfold, then walks out to the perceived and updated target and gestures by placing the hand at its location. Despite errors in distance (which are like those
Figure 1.11 Procedure and typical results for the experiments of Ooi, Wu, and He (2001, 2005). A glowing target was placed on the ground in an otherwise dark room. At the left, the observer views the target. Because of insufficient distance cues, the target is perceived to be closer than it is, resulting in the percept being elevated. The observer, wearing a blindfold, then walks out to the perceived and updated target and crouches to place the hand at its location. The angle α is the "angular declination" (or "height in the field"), which is an important cue to egocentric distance.
reported by Philbeck and Loomis [1997]; see Figure 1.4), the indicated locations lay in very nearly the same direction as the targets as viewed from the origin. Given the complexity of this "blind-walking–gesturing" response, this is a remarkable result. It is difficult to imagine any interpretation of this result other than one of systematic error in perceiving the target's distance followed by accurate path integration and spatial updating. Similar evidence supporting this interpretation comes from the aforementioned studies involving both direct and indirect walking (Loomis, Klatzky et al., 1998; Loomis, Lippa et al., 2002; Philbeck et al., 1997). When people walked along indirect paths while updating, they traveled to very nearly the same locations as when traveling along direct paths. Importantly, when the terminal points clearly deviated from the targets in terms of distance, indicating distance errors, the directions were nonetheless quite accurate (Loomis, Klatzky et al., 1998; Loomis, Lippa et al., 2002; Philbeck et al., 1997). The conclusion is strong that observers were traveling to the perceived and updated target locations.

Still further evidence that perceptually directed action can be used to measure perceived location and, thus, perceived distance comes from recent work by Ooi et al. (2006). They performed two experiments, one involving the blind-walking–gesturing response to luminous point targets in the dark and the other involving judgments of the shapes of luminous figures, also viewed in the dark. The indicated locations and judged slants were consistent with the targets in the two tasks being located on an implicit surface, extending from near the observer's feet and moving outward while curving upward; the authors hypothesize that the implicit surface reflects intrinsic biases of the visual system, like the specific distance tendency (Gogel & Tietz, 1979).
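The role of angular declination (α in Figure 1.11) and the elevation of the percept can be sketched geometrically: a ground target seen α degrees below the horizon lies at d = h/tan(α) for eye height h, and if distance is underperceived, the percept slides toward the eye along the unchanged line of sight, ending up above the ground. The 2-D simplification, eye height, and function names below are our own assumptions:

```python
import math

def distance_from_declination(eye_height_m, declination_deg):
    """Ground distance specified by the angular declination cue."""
    return eye_height_m / math.tan(math.radians(declination_deg))

def percept_on_line_of_sight(eye_height_m, true_distance_m, perceived_distance_m):
    """(x, height) of a percept moved toward the eye along the line of sight.

    Eye at (0, h), ground target at (d, 0); the percept sits at the
    corresponding fraction of the way along the sight line.
    """
    fraction = perceived_distance_m / true_distance_m
    return fraction * true_distance_m, eye_height_m * (1.0 - fraction)

# Underperceiving a 5 m target as 3 m away elevates the percept.
x, height = percept_on_line_of_sight(1.6, 5.0, 3.0)
assert height > 0.0  # above the ground plane, as in Figure 1.11
```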
It is significant that two such very different responses, one involving action and the other involving judgments of shape, can be understood in terms of a unitary perceptual process.

The flexibility of perceptually directed action is indicated by studies demonstrating on-line modification of the response. In a highly influential paper, Thomson (1983) reported an experiment that prevented the observer from executing a preplanned response. On a given trial, the observer viewed a target on the ground some distance ahead. After the observer began blind walking toward the target, the experimenter gave a signal to stop and throw a beanbag the remaining distance. Accuracy was high even though observers did not know at which point they would be cued to throw, demonstrating a flexible response combining two forms of action. Another example of on-line modification comes from one of the experiments on direct and indirect walking to targets (Philbeck et al., 1997). Here observers viewed a target, after which their vision and hearing were occluded. On cue from the experimenter, the observers walked to the target along one of three paths. Because they were not cued as to which path to take until after vision was occluded, the excellent updating performance indicates that observers could not have been preprogramming the response. Given these results, it appears that action directed toward a goal is extremely flexible. Presumably, once a goal has been established, any combination of actions, including walking, sidestepping, crawling, and throwing, can be assembled "on the fly" to indicate the location of an initially perceived target. For other evidence of on-line adjustment of perceptually directed action, see the paper by Farrell and Thomson (1999).

Despite the involvement of cognitive and locomotor processes in perceptually directed action, the results of a number of experiments demonstrate that this method provides a pure, albeit indirect, measure of perceived distance (and direction). They do so by demonstrating that cognitive and motor processes contribute little to the systematic error of task performance. Especially compelling are the results demonstrating the absence of systematic error in path integration, spatial updating, and response execution through the congruence of direct and indirect walking paths (Loomis, Klatzky et al., 1998; Loomis, Lippa et al., 2002; Philbeck et al., 1997; see Figure 1.10) and the result of Ooi et al. (2001, 2006) showing that the terminal location of a complex spatial response is consistent with the initial target direction.
In addition, the close coupling of action and verbal responses in both reduced-cue and full-cue conditions (Philbeck & Loomis, 1997), the close coupling of blind walking and throwing responses in both real and virtual environments (Sahm et al., 2005), and the close coupling of blind walking/gesturing and shape judgments (Ooi et al., 2006; Wu et al., 2004) provide further support that the action-based method provides a measure of distance perception. A later section showing that spatial updating can be used to correct for biases in verbal report provides still further evidence.
Role of Calibration in Perceptually Directed Action

Although we believe that cognitive and motor processes do not contribute appreciably to systematic errors in perceptually directed action, such processes are clearly required to control and execute the response. In light of this, and given the very high accuracy of visually directed action under full-cue conditions, there is good reason to believe that perceptually directed action depends on some adaptation process that keeps the action system in calibration. Indeed, several forms of adaptation of perceptually directed action have been demonstrated (Durgin, Fox, & Kim, 2003; Durgin, Gigone, & Scott, 2005; Durgin, Pelah, Fox, Lewis, Kane et al., 2005; Ellard & Shaughnessy, 2003; Mohler, Creem-Regehr, & Thompson, 2006; Ooi et al., 2001; Philbeck, O'Leary, & Lew, 2004; Richardson & Waller, 2005; Rieser, Pick, Ashmead, & Garing, 1995; Witt et al., 2004). Does adaptation call into question the claim that perceptually directed action can be used to measure perception?

Because perception, path integration/spatial updating, and response execution all contribute to perceptually directed action, adaptation of any of these processes or of the couplings between them can be expected to alter task performance. Adaptation that alters perceived distance will influence all measures of distance perception (verbal report, size-based measures, and all action-based measures, including walking and throwing) for whatever sensory modality has been adapted. In the aforementioned study, Ooi et al. (2001) used prism adaptation during a prior period of walking with vision to alter the effective "angular declination," one of the cues in sensing target distance (represented by α in Figure 1.11). The results showed that perceived distance was indeed altered, because walking and throwing responses were similarly affected, even though only walking was used during adaptation.

Adaptation of the path integration process alters the gain of sensed self-motion relative to actual physical motion and is expected to have a uniform influence on walked distance.
That is, if the gain is halved, observers ought to walk twice as far in order to arrive at the updated targets, and this effect should apply to all targets regardless of initial distance and the path taken. If the adaptation of walking speed is specific to walking direction, then the effect of adaptation on path integration will depend upon the walking direction. Rieser et al. (1995) had observers walk on a treadmill while being pulled by a tractor and altered the normal relationship between vision and the proprioceptive cues of walking. Adaptation to this altered relationship produced reliable changes in the walked distance to previewed targets but did not affect throwing to targets, a result that rules out
Measuring Spatial Perception with Spatial Updating and Action
perceptual adaptation. The magnitude of recalibration depended upon walking direction. A likely interpretation is that sensed self-motion was altered by the adaptation process. Durgin and his colleagues have additional evidence of recalibration of sensed self-motion (Durgin, Fox et al., 2003; Durgin, Gigone et al., 2005; Durgin, Pelah et al., 2005), although their results and interpretation of the cause of recalibration differ somewhat from those of Rieser et al. (1995).

It might be thought that recalibration involves a comparison of sensed self-motion signaled by idiothetic (proprioceptive and inertial) cues and the overall pattern of optic flow, but recent results by Thompson, Mohler, and Creem-Regehr (2005) show that the recalibration depends upon the perceived scale of the environment, with optic flow held constant. This means that recalibration is determined by the comparison between sensed self-motion signaled by idiothetic cues and the sensed self-motion based on visual perception of the environment, which depends upon distance perception. The accuracy of visually directed action under full cues (Figures 1.2 and 1.7) clearly relies on calibration of the gain of walking relative to visual perception. This calibration is likely to be induced by sensing of just the near visible environment (e.g., Thompson et al., 2005). An interesting implication is that adaptation of visually sensed self-motion ought to affect any form of action based on spatial updating regardless of the modality with which the target is perceived; thus, the indicated distance to a target based on perceptually directed action will be altered the same amount by visual recalibration whether the targets are visual, auditory, or haptic, provided that the initially perceived distances are the same for the different modalities.
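Both points above, the uniform effect of a gain change and the modality independence of the resulting action, follow from a simple relation: the observer walks until the sensed displacement (gain times physical displacement) matches the perceived target distance. A minimal sketch of this relation, with the model form and numeric values assumed for illustration only:

```python
def walked_distance(perceived_distance_m, gain):
    """Physical distance walked when the observer stops once sensed
    displacement (gain * physical displacement) matches the perceived
    target distance. The modality of the target never enters."""
    return perceived_distance_m / gain

# Halving the gain doubles the walk, uniformly across target distances:
for target in (4.0, 10.0, 16.0):
    print(target, walked_distance(target, 1.0), walked_distance(target, 0.5))
```

Because the function depends only on perceived distance and gain, visual, auditory, and haptic targets with equal perceived distances yield equal walks, as the text argues.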
An experiment comparing the effects of feedback on blind walking and verbal reports (Mohler et al., 2006) showed evidence of a form of recalibration not confined to the action system. To induce recalibration, the authors took advantage of the systematic underperception of distance in virtual reality. Observers gave verbal reports and performed blind walking to targets seen in the virtual environment before and after getting feedback about their errors. During the feedback phase, observers blind walked to the estimated locations of the targets and were given feedback about their errors. Both walking responses and verbal reports showed considerable improvement with the feedback. Because verbal reports were affected, the recalibration cannot be confined to a change in sensed self-motion. Although the common recalibration is consistent with a modification of perceived
Embodiment, Ego-Space, and Action
distance, the authors conclude that it is more likely a result of a cognitive rule that influences both types of responses. If true, this might be the result of a cognitive alteration in the spatial image so that it does not coincide with the perceived target. If so, triangulation responses ought to be similarly affected.

Still another form of adaptation has been demonstrated by Ellard and Shaughnessy (2003). In their experiment, observers viewed targets at varying distances on different trials and blind walked to them. For two of the targets, observers were given false feedback about the accuracy of their responses. Telling observers that they had undershot the target resulted in overshooting on subsequent trials. This form of adaptation was specific to the targets for which false feedback was given.

The result by Ellard and Shaughnessy (2003) raises the possibility of a form of adaptation that might undermine the claim that perceptually directed action measures perception. In particular, this type of adaptation could potentially explain the linearity (proportionality) between responded distance and visual target distance under full cues even if perceived distance should in fact be a compressively nonlinear function of target distance, as claimed, for example, by Gilinsky (1951). It would have to be a type of adaptation that modifies neither perception (which affects all responses, whether action-based or not) nor path integration (which affects all walked distances by the same scale factor). In addition, because of the aforementioned triangulation results, it would have to affect the coupling between perceived distance and sensed displacements from the origin, regardless of path taken, and do so in a way that varies nonlinearly with distance from the origin (so as to compensate for the putative nonlinearity between target and perceived distances).
There are at least three lines of evidence against the hypothesis that this type of distance-specific adaptation undermines the measurement of perceived distance. The first is that people rarely view distant targets and then walk to them without vision. Error feedback following such blind walking would be needed to "calibrate" perceptually directed action based on the putative nonlinearity between target distance and perceived distance.

The second line of evidence is concerned with whether adaptation following visual feedback about open-loop walking errors generalizes to other forms of perceptually directed action. Richardson and Waller (2005; Experiment 2) had observers perform blind walking
to targets in virtual reality. At the outset, their observers walked to locations only about half of the simulated target distances, along both direct and indirect paths, indicating the underperception of distance in virtual reality found by others using a variety of methods (Knapp, 1999; Sahm et al., 2005; Thompson et al., 2004). Observers were then given a period of training involving open-loop walking to the targets along direct paths; after arriving at the estimated position of the target on each trial, the observer was given explicit feedback about the undershoot error. After training, observers were once again tested using direct and indirect walking. The training eliminated 79% of the undershoot error for direct walking but only 27% of the undershoot error for indirect walking. The large difference in amounts of recalibration argues that if people get explicit feedback about their blind walking in the real world so as to allow for correct walking to compensate for putative nonlinearity in perceived distances, this type of recalibration still would not explain the high accuracy with which people perform triangulation tasks.

In a follow-up study, Richardson and Waller (2007) found that observers who were allowed to walk around in immersive virtual environments while continuously interacting with visual targets exhibited a more general form of recalibration. Contrasting with the results obtained with explicit feedback in their earlier study, the results of this study showed that the implicit feedback accompanying interaction with the environment during the training phase did allow for an equal amount of recalibration when walking open-loop along direct and indirect paths during the testing phase. Prima facie, this result appears to support the hypothesis that recalibration accounts for accurate visually directed action despite nonlinear functions for egocentric distance.
However, their result is also consistent with two other hypotheses: recalibration of visual perception and recalibration of sensed self-motion. Further experiments not relying on updating (e.g., verbal report and ball throwing) are needed to distinguish between the three alternative hypotheses.

The third line of evidence against the hypothesis is made possible by comparing the accurate responses to visual targets with the systematically compressed responses to auditory targets. Figure 1.12 gives the results of two experiments (Loomis et al., 1998) from 12 observers who made both verbal and blind walking responses to visual targets and 12 observers who made both types of responses to auditory targets; some observers were given targets at 4, 10, and
Figure 1.12 Results of two experiments on visual and auditory distance perception (Experiments 1 and 2a from Loomis et al. [1998]). The same observers made verbal and blind walking responses to both visual and auditory targets in a large open field. Seven responded to targets at 4, 10, and 16 m, and 5 observers responded to targets at 4, 8, 12, and 16 m. The best fitting linear functions are plotted as well. In the right panel, arrows indicate how, for a given visual target distance, the corresponding "equivalent" auditory distance was determined, this being the distance of the auditory target which produced the same walked distance.
16 m, and others were given targets at 4, 8, 12, and 16 m. The best fitting linear functions (with 0 intercepts) for the combined data sets are plotted as well.

If the accurate responses to visual targets reflect some sort of calibration process acting on the visually-based action process, presumably the same calibration does not apply to auditorially based action, given the very large systematic errors. Thus, the two action processes must involve different calibration functions. This means that if a visual target distance and an auditory target distance produce the same value of blind walking, the corresponding values of visually perceived and auditorially perceived distance must be different. Thus, we would expect that since the process of making a verbal report is common to both modalities, the verbal reports ought to be different for visual and auditory target distances that produce the same action response, assuming that the visually-based action responses have been calibrated through experience.

To test this idea, we have used the best fitting linear functions to the blind walking (motor) responses (Figure 1.12) to find, for each visual target distance, the corresponding auditory target distance that produced the same walking responses (see arrows in the right panel of Figure 1.12). The visual target distance and "equivalent" auditory target distances were then used to compute the corresponding verbal reports from the best-fitting linear functions. The function relating the verbal reports for vision and the verbal reports for audition is very nearly the identity function (linear function with slope of 1.04 and intercept of -0.22 m). This means that the visually-based and auditorially-based action processes are essentially identical. At least for these data, there is no evidence of a calibration of blind walking to compensate for a putative compressively nonlinear perceptual function.

On the basis of the three lines of evidence, we conclude that perceptually directed action does provide a comparatively pure measure of perception when action is properly calibrated to near surrounding space. Also based on both the perceptually directed action results of Figures 1.2 and 1.7 and the verbal report results of Figure 1.1, we conclude that visual distance perception is a linear function of target distance out to at least 25 m on the ground plane when distance cues are abundant.

Using Spatial Updating to Correct for Bias in Verbal Reports of Egocentric Distance

Figures 1.1, 1.2, and 1.7 show that visually perceived egocentric distance is a linear function of physical distance out to 25 m and that verbal reports are generally about 80% of the distances indicated by action. This systematic difference in response values does not, by itself, indicate that different internal representations of distance control the two types of responses, for Foley (1977) and later Philbeck and Loomis (1997) presented a model in which the same internal representation of distance, ostensibly perceived distance, acts through different output transforms to determine the indicated responses (see right part of Figure 1.8). Philbeck and Loomis (1997) showed that verbal reports and blind walking to targets, while systematically different, were related to each other by a fixed mapping when switching from reduced-cue to full-cue viewing (also see the above analysis in connection with Figure 1.12).
This is consistent with there being just a single internal representation of distance acting through different output transforms. However, it is possible, even likely, that the output transforms for action and verbal report are sometimes affected differently by experimental manipulations, such that there is no fixed mapping between the two types of responses. For example, an observer can view a photograph and make reliable judgments of distance of depicted objects, but asking observers to
perform open-loop walking to the same depicted objects is likely to be met with reluctance followed by very noisy performance. Andre and Rogers (2006) have found that experimental manipulations, including viewing targets through base-up and base-down prisms, can differentially affect blind walking and verbal report. They interpret their findings in terms of different internal representations of distance, but it is possible that the differential effects are produced at the level of response production and execution.

In connection with the hypothesis that action and verbal report involve the same internal representation but different output transforms, a special case can be identified in which verbal reports are subject to a systematic underreporting bias (b) that is a constant proportion of perceived egocentric distance (i.e., b < 1.0). Figure 1.13 depicts the consequences of such a bias in an updating task, based on four assumptions. First, the output transform for verbal report of distance is assumed to be the same whether the report is based on a concurrent percept, a spatial image prior to updating, or an updated spatial image (right part of Figure 1.8). Second, the same assumption is made for verbal report of direction. Third, any systematic biases in verbally reporting perceived direction are assumed to be small for both vision and hearing, an assumption that is well supported (Loomis et al., 1998; Experiment 2A). Fourth, path integration is assumed to be properly calibrated (per the previous section), with accurate spatial updating a consequence. Figure 1.13 depicts the situation where the verbal report scaling factor (b) is 0.75, so that the reported distance is 0.75 times the perceived distance from the origin or the distance of the spatially updated spatial image following a walk. The result is that the reported locations shift forward and parallel with the walk, with the shift magnitude equal to the length of the walk multiplied by 1 - b (here 0.25).
Figure 1.14A shows the predicted shifts for the same factor (b = 0.75) for all of the target locations used in Experiment 2A of Loomis et al. (1998). In this experiment, observers made verbal estimates of distance and direction while viewing or listening to targets. They then moved forward 5 m while updating and then made verbal reports of the target locations. The centroids of the reported locations of the perceived and updated visual targets are shown in Figure 1.14B (Note 1). The general pattern of forward shifts in reported location, predicted in the left panel, can be seen in the data in the right panel, albeit with superimposed noise. At the time these data were published, the authors found this pattern of shifts enigmatic (Loomis et al., 1998, pp. 974–975).
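The geometry of this shift prediction can be sketched numerically (a hypothetical reconstruction with invented coordinates, not the Loomis et al. data): a verbal report places the target at b times its perceived egocentric distance from the current vantage point, so after a walk of length w the reported location moves forward by w(1 - b).

```python
def reported_location(observer, target, b):
    """Report = observer position + b * (egocentric vector to the target)."""
    ox, oy = observer
    tx, ty = target
    return (ox + b * (tx - ox), oy + b * (ty - oy))

b, walk = 0.75, 5.0
target = (2.0, 10.0)  # hypothetical perceived (and correctly updated) location

initial = reported_location((0.0, 0.0), target, b)    # before the walk
terminal = reported_location((0.0, walk), target, b)  # after walking 5 m (+y)

# The reported location shifts parallel to the walk by walk * (1 - b):
print(terminal[0] - initial[0], terminal[1] - initial[1])  # 0.0 1.25
```

The shift is the same for every target, which is the parallel forward displacement depicted in Figures 1.13 and 1.14A.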
Figure 1.13 The consequences of verbally underreporting the perceived distance of a target. The physical targets are represented by the Xs. The perceived and updated targets are closer to the initial location than the physical targets. The initial estimates represent the verbally underreported locations of the perceived targets as judged from the initial location prior to the walk. After the observer walks, the terminal estimates are the verbally underreported locations of the updated targets as judged from the terminal location after the walk. When the verbally reported distance is less than the distance of the perceived and updated targets, as depicted here, the estimated location of the target moves with the observer along a parallel path.
Figure 1.14 (A) The predicted pattern of shifts in the verbally reported locations for the target locations in Experiment 2 of Loomis et al. (1998), assuming that the verbally reported distance is only 0.75 times as great as the perceived distance. The targets are represented by the Xs. The terminal location from which judgments were made was forward of the initial location following a 5 m walk. The terminally reported locations are all displaced equally from the initially reported locations in the direction of the walk, as depicted by the thick lines. (B) The observed pattern of verbally reported locations to visual targets before and after walking 5 m forward in Experiment 2 of Loomis et al. (1998). The displacements are depicted by the thick lines.
If, as hypothesized, the general pattern of shifts is the result of systematically underreporting perceived distance, the reported perceived locations prior to walking and the reported updated locations after walking should be brought into better congruence by scaling the reported distances, before and after walking, by the inverse of the reporting bias b. Two measures of congruence are (1) the mean distance between the initial and terminal locations, averaged over all targets, and (2) the degree of directional misalignment between the initial and terminal locations from the origin, averaged over all targets. Figure 1.15A gives the results of the scaling analysis for both measures of congruence, as applied to the visual data. A scaling factor of 1.3 produces the maximum congruence. (The auditory data were several times noisier; the resulting scaling factor was 1.7.) Figure 1.15B shows the revised initial and terminal reported locations after the rescaling of reported distances, for both the perceived and updated targets. Because of other unknown sources of error, the congruence is still far from perfect, but the two measures of congruence provide a similar estimate of 1.3 for the scaling factor. By hypothesis,
Figure 1.15 (A) Two measures of congruence of the initial and terminal estimated locations (from verbal report) in response to visual targets in Experiment 2 of Loomis et al. (1998), as a function of the scaling factor used to correct verbal reports of distance. The two measures of congruence are the mean distance between the initial and terminal locations, averaged over all targets, and the degree of directional misalignment between the initial and terminal locations from the origin, averaged over all targets. Low values indicate high congruence for both measures. The results for both measures indicate that multiplying the verbal reports by a factor of 1.3 produces maximum congruence. (B) The data of Figure 1.14B after rescaling the verbal reports of distance using the scale factor of 1.3. In addition to the initial and terminal estimates being maximally congruent, the estimated locations are now, on average, quite close to the physical targets in terms of distance from the origin. The displacements are depicted by the thick lines.
this means that verbal estimates are biased toward underreporting of perceived distance by 1/1.3 = 0.77. The adjusted reported locations are now on average quite close to the target locations (Figure 1.15B) in terms of distance from the origin. Interestingly, this bias is close to the average factor by which the verbal reports summarized in Figure 1.1 differ from the action-based measures of Figures 1.2 and 1.7. Although more research needs to be done, the analysis here suggests that spatial updating can be used to correct for the bias in verbal reports (Note 2), provided that path integration is properly calibrated. An implication of this analysis is that corrected verbal reports and action measures of perceived egocentric distance agree in showing linear and accurate perception of distance over the ground surface under full-cue conditions.
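The scaling analysis described above can be sketched as follows (a hypothetical reconstruction with invented reports, not the actual Loomis et al. estimates): reported distances before and after the walk are rescaled by a candidate factor, and the factor minimizing the first congruence measure, mean initial–terminal separation, estimates the inverse of the reporting bias.

```python
import math

def rescale(point, origin, factor):
    """Scale the reported distance from `origin` by `factor`, keeping direction."""
    return tuple(o + factor * (p - o) for p, o in zip(point, origin))

def mean_separation(initial_reports, terminal_reports, walk, factor):
    """Mean distance between rescaled initial and terminal reported locations.
    Initial reports are rescaled about the start point (0, 0); terminal
    reports about the end point of a `walk`-meter walk along +y."""
    seps = []
    for ini, ter in zip(initial_reports, terminal_reports):
        s_ini = rescale(ini, (0.0, 0.0), factor)
        s_ter = rescale(ter, (0.0, walk), factor)
        seps.append(math.dist(s_ini, s_ter))
    return sum(seps) / len(seps)

# Invented reports consistent with a true underreporting bias of b = 0.77:
b, walk = 0.77, 5.0
targets = [(2.0, 10.0), (-3.0, 12.0), (0.0, 8.0)]  # hypothetical locations
initial = [rescale(t, (0.0, 0.0), b) for t in targets]
terminal = [rescale(t, (0.0, walk), b) for t in targets]

# Grid search over candidate correction factors 1.00..2.00:
best = min((f / 100 for f in range(100, 201)),
           key=lambda f: mean_separation(initial, terminal, walk, f))
print(best)  # 1.3, i.e., approximately 1 / 0.77
```

With noisy real data the minimum is of course less sharp, which is why the chapter checks a second, directional measure of congruence as well.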
Distortions of Perceived Exocentric Distance and Perceived Shape

The earlier conclusion that visual distance perception is proportional to target distance out to 25 m on the ground plane under full-cue conditions seems to contradict other evidence that exocentric distances and visual shapes are systematically misperceived, even within 2 m (Beusmans, 1998; Foley et al., 2004; Kudoh, 2005; Levin & Haber, 1993; Loomis et al., 1992; Loomis & Philbeck, 1999; Loomis, Philbeck, & Zahorik, 2002; Ooi et al., 2006; Toye, 1986; Wagner, 1985; Wu et al., 2004; see Note 3). Generally speaking, exocentric depth extents are perceived as smaller than physically equal exocentric extents in the frontoparallel plane, and distant exocentric depth extents are perceived as smaller than physically equal exocentric depth extents that are close. A full discussion of this topic is beyond the scope of this chapter. Here, we mention five recent papers that go a long way in reconciling the linearity of visual distance perception with the large distortion of exocentric distance and shape within the same range.

Foley et al. (2004) have put forth a mathematical model of visual space in which large distortions in perceived exocentric extents are consistent with a linear function (or slightly nonlinear function, as they maintain) relating perceived egocentric distance to physical egocentric distance. Although the processes underlying the model have yet to be elucidated, the model has great promise for explaining a variety of distortions in visual space. Wu et al. (2004) deal with the
same issue in a different way. They found that restricting the visual field of view of observers alters the perceived shapes of objects on the ground plane and reduces the perceived egocentric distance of targets on the ground plane. These changes in shape and distance are tightly coupled. They also presented evidence that the perceived ground plane is slanted slightly upward toward a frontoparallel plane, with the restricted field of view causing a somewhat greater slant (see also He et al., 2004). Based on their evidence, they have proposed a surface integration model that accounts for slant variation in the perceptual ground plane induced by various manipulations (He et al., 2004; Wu et al., 2004; see also Ooi et al., 2006). Finally, Loomis et al. (2002) have shown that switching from monocular viewing to binocular viewing of a configuration of nearby targets can alter the perceived configuration of the targets without changing their perceived locations, indicating a functional dissociation between the perception of shape and the perception of the vertices defining the shape. This dissociation adds to the discoveries of Foley et al. (2004) and of Wu et al. (2004) in making more comprehensible the linearity or near linearity of egocentric distance perception despite the systematic distortions of exocentric distance and shape.

Two Visual Systems and Action-Specific Representations of Distance

No treatment of using action to measure perception would be complete without mention of the "two visual streams" framework developed by Milner and Goodale (1995), Bridgeman (1999), and others.
In this framework, one neural pathway of cortical visual processing, running dorsally from primary visual cortex to the posterior parietal cortex, is specialized for visuomotor control; the other, running ventrally to inferotemporal cortex, is specialized for visual object recognition and other forms of conscious visual perception, presumably including visual space perception. One form of evidence in support of this framework comes from brain-injured patients who, for example, can use vision to control spatially directed behaviors such as reaching and grasping despite profound deficits in visual perception of object shape and location (Milner & Goodale, 1995; Weiskrantz, 1986). There is also related evidence of a functional dissociation between visual judgments of spatial variables and spatially
guided actions in neurologically intact observers, although the interpretation of these functional dissociations has been debated (Carey, 2001; Dassonville, Bridgeman, Bala, Thiem, & Sampanes, 2004; Franz, Fahle, Bulthoff, & Gegenfurtner, 2001; Haffenden & Goodale, 1998; Post & Welch, 1996).

If indeed visually guided actions are controlled by processes distinct from those involved in visual space perception, this would be good reason for doubting that action can be used to measure visual distance perception. Blind walking and other forms of perceptually directed action rely on spatial updating, and at least some forms of spatial updating are thought to be subserved by regions of the posterior parietal cortex (e.g., updating of eye and arm position; DeSouza et al., 2000; Heide, Blankenburg, Zimmermann, & Kompf, 1995; Medendorp, Goltz, Vilis, & Crawford, 2003; Sereno, Pitzalis, & Martinez, 2001; but see also Philbeck, Behrmann, Black, & Ebert, 2000). Another aspect of the two streams framework could argue for dorsal stream control of visually directed walking, and this involves the spatial and temporal scale over which actions are controlled. Eye movements begin to show evidence of ventral stream control after relatively brief delays (e.g., 500 ms; Gnadt, Bracewell, & Andersen, 1991; Wong & Mack, 1981), while arm movements begin to show similar evidence after somewhat longer delays (2 s; Goodale, Jakobson, & Keillor, 1994; Rossetti, 1998). Walking transpires over a much larger spatial scale than eye and arm movements; if the duration of dorsal stream representations is linked with the spatial scale of the action being controlled, blind walking might be controlled by the dorsal stream even when it is based on visual information obtained many seconds in the past. The temporal delay that marks the transition between dorsal and ventral stream control of walking is not known, however.
Countering these arguments that perceptually directed action may be subserved by the dorsal stream is the notion that the dorsal and ventral streams tend to encode spatial representations within differing frames of reference (Creem & Proffitt, 2001; Milner & Goodale, 1995). An egocentric frame of reference is thought to underlie dorsal stream representations, while ventral stream representations are thought to make use of both egocentric and allocentric (environment-centered) frames of reference. If perceptually directed actions are updated relative to environmental features, this suggests that the ventral stream may play a dominant role in controlling blind
walking and other forms of perceptually directed actions. Regions in the medial temporal lobe, which receive inputs in part from the inferotemporal cortex in the ventral stream, are a particularly likely neural substrate for updating the location of one's body within an environment (Alyan & McNaughton, 1999; Redish, 1999; Whishaw, Hines, & Wallace, 2001). Consistent with this view is that some forms of path integration in humans are impaired after injury to the medial temporal lobe (e.g., Philbeck, Behrmann et al., 2004; Worsley et al., 2001). Functional neuroimaging also shows activation in medial temporal lobe structures during navigation in virtual environments (e.g., Aguirre, Detre, Alsop, & D'Esposito, 1996; Grön et al., 2000; Maguire et al., 1998).

Regardless of the brain areas that might be involved, there is abundant evidence that measures of perceived distance based on perceptually directed action tightly covary with nonaction measures of perceived distance. As mentioned earlier, Philbeck and Loomis (1997) found that verbal and walked indications of egocentric distance were tightly linked across a variety of viewing conditions. Sahm et al. (2005) found the same for blind walking and throwing. Also, Wu et al. (2004) found that variations in shape judgments were tightly coupled to variations in blind walking under good lighting conditions as field of view was manipulated, and Ooi et al. (2006) found a similar tight coupling between shape judgments and the blind-walking–gesturing response when performed in darkness. Finally, Hutchison and Loomis (2006a) found that when field of view was manipulated under good lighting conditions, verbal reports, blind walking action responses, and size judgments were affected in much the same way.
Importance of Using Converging Measures of Distance Perception

In recent years, Proffitt and his colleagues (e.g., Proffitt et al., 2003; Proffitt, this volume; Witt et al., 2004, 2005) have claimed that visual distance perception is affected by observer-related variables, such as energetic state and intent to perform an action, in addition to the optical stimuli specifying layout of the environment. We believe some circumspection is warranted before accepting these claims. In view of the level of theory which has been reached after decades
of research on visual space perception, theory which maintains that some perceptual variables, like size, shape, and motion, are causally linked to perceived distance (e.g., size–distance invariance), we feel that converging evidence using judgments of these other variables and action-based measures (e.g., Hutchison & Loomis, 2006a) is needed to establish that perceived distance is truly being affected by manipulation of nonoptical variables such as energetic state and intent. Perhaps perceived distance is being affected, but until the evidence is more compelling, the alternative hypothesis warrants serious consideration: that these nonoptical variables act not on perception per se but on judgmental processes subsequent to the perceptual representation.

More generally, we believe it prudent to consider any one measure of distance perception as imperfect and to consider the use of multiple converging measures. We have shown that action-based measures, including those relying on spatial updating, are tightly coupled to other measures. Of all these, the blind walking/gesturing response (Ooi et al., 2001, 2006) seems the most useful measure of perceived location in three dimensions. Yet, because people can respond to spatial images that are formed from language (e.g., Loomis, Lippa et al., 2002), even such action measures might be subject to some biasing by higher-level cognition. We believe that indirect measures of perceived distance based on size, motion, and shape judgments may be the purest measures, but even they are surely subject to cognitive influence in some circumstances.
For example, obtaining pure estimates of perceived target size is probably not feasible in environments with an abundance of familiar objects present, for subjects will simply use knowledge of the sizes of the familiar objects, along with relative distance cues, to judge the sizes of the target stimuli. Given the imperfection of any one measure, the space perception researcher is probably best advised to use multiple converging measures whenever there is a concern about the fallibility of any one measure, especially when a critical claim about distance perception is at stake.
Acknowledgments

The work conducted by the authors and their colleagues and reported here was supported by grants from the National Science Foundation,
National Eye Institute, Office of Naval Research, and Air Force Office of Scientific Research, and by Brazilian grants from the Fundação de Amparo à Pesquisa do Estado de São Paulo and the Conselho Nacional de Pesquisas. The authors thank John Foley and Zijiang He for very helpful discussion and Roberta Klatzky for her suggestions for improving the chapter.
Notes

1. The experiment involved viewing one or two targets or listening to one or two targets. On trials when two targets were presented, the observer was prompted which of the targets to report after the observer had moved forward while updating. There was little difference in updating performance whether one or two targets were updated. Rieser and Rider (1991) also found little effect of number of targets (from 1 to 5). Consequently, the reported perceived and updated target locations for the one- and two-target trials were averaged (by computing centroids) and used in the analysis here.
2. It might be thought that this procedure of calibrating verbal reports could be done using perception of the same target before and after the walk, instead of relying on spatial updating of the spatial image of the target. The challenge to doing so is that one has to assume that the perceived target has not moved during the walk. Knowing whether the perceived target has moved brings us back full circle to the problem of how to measure perception in the first place.
3. The range over which egocentric distance is linear very likely depends upon the eye height of the observer. Loomis and Philbeck (1999) observed that the ground surface from a high altitude under conditions of high visibility looks as flat as a table viewed from normal eye height. In addition, they found evidence that distortions of perceived shape of objects lying on the ground plane are scale invariant for monocular vision. A good working hypothesis is that, beyond 10 m or so, the monocular relative distance cues (e.g., texture, relative size, and linear perspective) determine the shape of the visual environment. If the relative distance cues are the same at different scales, the shape of the perceived visual environment ought to be the same. An implication of this hypothesis is that the psychophysical function relating perceived distance to physical distance, if nonlinear at farther distances, should scale with the eye height of the observer. Evidence for this comes from Harway (1963) and Ooi and He (2007).
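The scale-invariance hypothesis in Note 3 can be made concrete with a small numerical sketch. The compressive function used below is purely illustrative and is not a model proposed in the chapter; the hypothesis only requires that perceived distance equal eye height times some fixed function of the eye-height-scaled physical distance.

```python
def perceived_distance(d, eye_height):
    """Toy psychophysical function of the hypothesized scale-invariant form:
    perceived distance = eye_height * g(d / eye_height).

    The compressive shape g(x) = x / (1 + 0.05 * x) is an arbitrary
    illustration, chosen only so the function is nonlinear at far distances.
    """
    x = d / eye_height
    g = x / (1.0 + 0.05 * x)  # nonlinear (compressive) for large x
    return eye_height * g

# If both physical distance and eye height are multiplied by k, perceived
# distance is multiplied by k, so the *shape* of perceived space is the
# same across scale -- as in viewing the ground "from high altitude."
k = 10.0
near = perceived_distance(20.0, 1.6)          # normal eye height
far = perceived_distance(20.0 * k, 1.6 * k)   # scaled-up scene
print(abs(far - k * near) < 1e-9)  # True: shape is scale invariant
```

Any function of the form h·g(d/h) has this property, which is why the hypothesis predicts that the psychophysical function should scale with the observer's eye height.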
Measuring Spatial Perception with Spatial Updating and Action
References

Aguirre, G. K., Detre, J. A., Alsop, D. C., & D'Esposito, M. (1996). The parahippocampus subserves topographical learning in man. Cerebral Cortex, 6, 823–829.
Allen, G. L., Kirasic, K. C., Rashotte, M. A., & Haun, D. B. M. (2004). Aging and path integration skill: Kinesthetic and vestibular contributions to wayfinding. Perception & Psychophysics, 66, 170–170.
Alyan, S., & McNaughton, B. L. (1999). Hippocampectomized rats are capable of homing by path integration. Behavioral Neuroscience, 113, 19–31.
Andre, J., & Rogers, S. (2006). Using verbal and blind-walking distance estimates to investigate the two visual systems hypothesis. Perception & Psychophysics, 68, 353–361.
Ashmead, D. H., DeFord, L. D., & Northington, A. (1995). Contribution of listeners' approaching motion to auditory distance perception. Journal of Experimental Psychology: Human Perception and Performance, 21, 239–256.
Avraamides, M. N., Loomis, J. M., Klatzky, R. L., & Golledge, R. G. (2004). Functional equivalence of spatial representations derived from vision and language: Evidence from allocentric judgments. Journal of Experimental Psychology: Learning, Memory, & Cognition, 30, 801–814.
Beusmans, J. M. H. (1998). Perceived object shape affects the perceived direction of self-movement. Perception, 27, 1079–1085.
Bingham, G. P., Bradley, A., Bailey, M., & Vinner, R. (2001). Accommodation, occlusion, and disparity matching are used to guide reaching: A comparison of actual versus virtual environments. Journal of Experimental Psychology: Human Perception and Performance, 24, 145–168.
Böök, A., & Gärling, T. (1981). Maintenance of orientation during locomotion in unfamiliar environments. Journal of Experimental Psychology: Human Perception and Performance, 7, 995–1006.
Brain, W. R. (1951). The cerebral basis of consciousness. Proceedings of the Royal Society of Medicine, 44, 37–42.
Bridgeman, B. (1999). Separate representations of visual space for perception and visually guided behavior. In G. Aschersleben & T. Bachmann (Eds.), Cognitive contributions to the perception of spatial and temporal events (Vol. 129, pp. 3–13). Amsterdam: Elsevier.
Carey, D. P. (2001). Do action systems resist visual illusions? Trends in Cognitive Sciences, 5, 109–113.
Carlson, V. R. (1977). Instructions and perceptual constancy judgments. In W. Epstein (Ed.), Stability and constancy in visual perception: Mechanisms and processes (pp. 217–254). New York: Wiley.
Corlett, J. T., Byblow, W., & Taylor, B. (1990). The effect of perceived locomotor constraints on distance estimation. Journal of Motor Behavior, 22, 347–360.
Corlett, J. T., & Patla, A. E. (1987). Some effects of upward, downward, and level visual scanning and locomotion on distance estimation accuracy. Journal of Human Movement Studies, 13, 85–95.
Creem, S. H., & Proffitt, D. R. (2001). Defining the cortical visual systems: "What", "Where" and "How." Acta Psychologica, 107, 43–68.
Creem-Regehr, S. H., Willemsen, P., Gooch, A. A., & Thompson, W. B. (2005). The influence of restricted viewing conditions on egocentric distance perception: Implications for real and virtual indoor environments. Perception, 34, 191–204.
Cuijpers, R. H., Kappers, A. M. L., & Koenderink, J. J. (2000). Investigation of visual space using an exocentric pointing task. Perception & Psychophysics, 62, 1556–1571.
Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), Perception of space and motion (pp. 69–117). New York: Academic Press.
Da Silva, J. A. (1985). Scales for perceived egocentric distance in a large open field: Comparison of three psychophysical methods. American Journal of Psychology, 98, 119–144.
Dassonville, P., Bridgeman, B., Bala, J. K., Thiem, P., & Sampanes, A. (2004). The induced Roelofs effect: Two visual systems or the shift of a single reference frame? Vision Research, 44, 603–611.
DeSouza, J. F. X., Dukelow, S. P., Gati, J. S., Manon, R. S., Andersen, R. A., & Vilis, T. (2000). Eye position signal modulates a human parietal pointing region during memory-guided movements. Journal of Neuroscience, 20, 5835–5840.
Durgin, F. H., Fox, L. F., & Kim, D. H. (2003). Not letting the left leg know what the right leg is doing: Limb-specific locomotion adaptation to sensory-cue conflict. Psychological Science, 14, 567–572.
Durgin, F. H., Gigone, K., & Scott, R. (2005). The perception of visual speed while moving. Journal of Experimental Psychology: Human Perception and Performance, 31, 339–353.
Durgin, F. H., Pelah, A., Fox, L. F., Lewis, J., Kane, R., & Walley, K. A. (2005). Self-motion perception during locomotor recalibration: More than meets the eye. Journal of Experimental Psychology: Human Perception and Performance, 31, 398–419.
Eby, D. W., & Loomis, J. M. (1987). A study of visually directed throwing in the presence of multiple distance cues. Perception & Psychophysics, 41, 308–312.
Ellard, C. G., & Shaughnessy, S. C. (2003). A comparison of visual and nonvisual sensory inputs to walked distance in a blind-walking task. Perception, 32, 567–578.
Elliott, D. (1987). The influence of walking speed and prior practice on locomotor distance estimation. Journal of Motor Behavior, 19, 476–485.
Elliott, D., Jones, R., & Gray, S. (1990). Short-term memory for spatial location in goal-directed locomotion. Bulletin of the Psychonomic Society, 8, 158–160.
Epstein, W. (1982). Percept-percept coupling. Perception, 11, 75–83.
Farrell, M. J., & Thomson, J. A. (1999). On-line updating of spatial information during locomotion without vision. Journal of Motor Behavior, 3, 39–53.
Foley, J. M. (1977). Effect of distance information and range on two indices of visually perceived distance. Perception, 6, 449–460.
Foley, J. M., & Held, R. (1972). Visually directed pointing as a function of target distance, direction, and available cues. Perception & Psychophysics, 12, 263–268.
Foley, J. M., Ribeiro, N. P., & Da Silva, J. A. (2004). Visual perception of extent and the geometry of visual space. Vision Research, 44, 147–156.
Franz, V. H., Fahle, M., Bülthoff, H. H., & Gegenfurtner, K. R. (2001). Effects of visual illusions on grasping. Journal of Experimental Psychology: Human Perception & Performance, 27, 112–1144.
Fukusima, S. S., Loomis, J. M., & Da Silva, J. A. (1997). Visual perception of egocentric distance as assessed by triangulation. Journal of Experimental Psychology: Human Perception and Performance, 23, 86–100.
Gilinsky, A. S. (1951). Perceived size and distance in visual space. Psychological Review, 58, 460–482.
Gnadt, J. W., Bracewell, R. M., & Andersen, R. A. (1991). Sensorimotor transformation during eye movements to remembered visual targets. Vision Research, 31, 693–715.
Gogel, W. C. (1974). Cognitive factors in spatial responses. Psychologia, 17, 213–225.
Gogel, W. C. (1982). Analysis of the perception of motion concomitant with head motion. Perception & Psychophysics, 32, 241–250.
Gogel, W. C. (1984). The role of perceptual interrelations in figural synthesis. In P. C. Dodwell & T. Caelli (Eds.), Figural synthesis (pp. 31–82). Hillsdale, NJ: Erlbaum.
Gogel, W. C. (1990). A theory of phenomenal geometry and its applications. Perception & Psychophysics, 48, 105–123.
Gogel, W. C. (1993). The analysis of perceived space. In S. C. Masin (Ed.), Foundations of perceptual theory. Advances in psychology (Vol. 99, pp. 113–182). Amsterdam, Netherlands: North-Holland/Elsevier Science.
Gogel, W. C., Loomis, J. M., Newman, N. J., & Sharkey, T. J. (1985). Agreement between indirect measures of perceived distance. Perception & Psychophysics, 37, 17–27.
Gogel, W. C., & Tietz, J. D. (1979). A comparison of oculomotor and motion parallax cues of egocentric distance. Vision Research, 19, 1161–1170.
Goodale, M. A., Jakobson, L. S., & Keillor, J. M. (1994). Differences in the visual control of pantomimed and natural grasping movements. Neuropsychologia, 32, 1159–1178.
Grön, G., Wunderlich, A. P., Spitzer, M., Tomczak, R., & Riepe, M. W. (2000). Brain activation during human navigation: Gender-different neural networks as substrate of performance. Nature Neuroscience, 3, 404–408.
Haffenden, A. M., & Goodale, M. A. (1998). The effect of pictorial illusion on prehension and perception. Journal of Cognitive Neuroscience, 10, 122–136.
Harway, N. I. (1963). Judgment of distance in children and adults. Journal of Experimental Psychology, 65, 385–390.
He, Z. J., Wu, B., Ooi, T. L., Yarbrough, G., & Wu, J. (2004). Judging egocentric distance on the ground: Occlusion and surface integration. Perception, 33, 789–806.
Heide, W., Blankenburg, M., Zimmermann, E., & Kompf, D. (1995). Cortical control of double-step saccades: Implications for spatial orientation. Annals of Neurology, 38, 739–748.
Howard, I. P., & Rogers, B. J. (2002). Seeing in depth: Vol. 2. Depth perception. Thornhill, Ontario: I. Porteous.
Hutchison, J. J., & Loomis, J. M. (2006a). Does energy expenditure affect the perception of egocentric distance? A failure to replicate Experiment 1 of Proffitt, Stefanucci, Banton, and Epstein (2003). The Spanish Journal of Psychology, 9, 332–339.
Hutchison, J. J., & Loomis, J. M. (2006b). Reply to Proffitt, Stefanucci, Banton, and Epstein. The Spanish Journal of Psychology, 9, 343–345.
Israël, I., Grasso, R., Georges-François, P., Tsuzuku, T., & Berthoz, A. (1997). Spatial memory and path integration studied by self-driven passive linear displacement. I. Basic properties. Journal of Neurophysiology, 77, 3180–3192.
Juurmaa, J., & Suonio, K. (1975). The role of audition and motion in the spatial orientation of the blind and sighted. Scandinavian Journal of Psychology, 16, 209–216.
Kelly, J. W., Loomis, J. M., & Beall, A. C. (2004). Judgments of exocentric distance in large-scale space. Perception, 33, 443–454.
Klatzky, R. L., Lippa, Y., Loomis, J. M., & Golledge, R. G. (2003). Encoding, learning, and spatial updating of multiple object locations specified by 3-D sound, spatial language, and vision. Experimental Brain Research, 149, 48–61.
Knapp, J. M. (1999). The visual perception of egocentric distance in virtual environments. Unpublished doctoral dissertation, Department of Psychology, University of California, Santa Barbara.
Knapp, J. M., & Loomis, J. M. (2004). Limited field of view of head-mounted displays is not the cause of distance underestimation in virtual environments. Presence, 13, 572–577.
Koch, C. (2003). The quest for consciousness. Englewood, CO: Roberts.
Kudoh, N. (2005). Dissociation between visual perception of allocentric distance and visually directed walking of its extent. Perception, 34, 1399–1416.
Lehar, S. (2003). The world in your head: A Gestalt view of the mechanism of conscious experience. Mahwah, NJ: Erlbaum.
Levin, C. A., & Haber, R. N. (1993). Visual angle as a determinant of perceived interobject distance. Perception & Psychophysics, 54, 250–259.
Loarer, E., & Savoyant, A. (1991). Visual imagery in locomotor movement without vision. In R. H. Logie & M. Denis (Eds.), Mental images in human cognition (pp. 35–46). The Hague: Elsevier.
Loomis, J. M. (1992). Distal attribution and presence. Presence, 1, 113–119.
Loomis, J. M., Da Silva, J. A., Fujita, N., & Fukusima, S. S. (1992). Visual space perception and visually directed action. Journal of Experimental Psychology: Human Perception and Performance, 18, 906–921.
Loomis, J. M., Klatzky, R. L., & Golledge, R. G. (1999). Auditory distance perception in real, virtual, and mixed environments. In Y. Ohta & H. Tamura (Eds.), Mixed reality: Merging real and virtual worlds (pp. 201–214). Tokyo: Ohmsha.
Loomis, J. M., Klatzky, R. L., Golledge, R. G., & Philbeck, J. W. (1999). Human navigation by path integration. In R. G. Golledge (Ed.), Wayfinding: Cognitive mapping and other spatial processes (pp. 125–151). Baltimore: Johns Hopkins.
Loomis, J. M., Klatzky, R. L., Philbeck, J. W., & Golledge, R. G. (1998). Assessing auditory distance perception using perceptually directed action. Perception & Psychophysics, 60, 966–980.
Loomis, J. M., & Knapp, J. M. (2003). Visual perception of egocentric distance in real and virtual environments. In L. J. Hettinger & M. W. Haas (Eds.), Virtual and adaptive environments (pp. 21–46). Mahwah, NJ: Erlbaum.
Loomis, J. M., Lippa, Y., Klatzky, R. L., & Golledge, R. G. (2002). Spatial updating of locations specified by 3-D sound and spatial language. Journal of Experimental Psychology: Learning, Memory, & Cognition, 28, 335–345.
Loomis, J. M., & Philbeck, J. W. (1999). Is the anisotropy of perceived 3-D shape invariant across scale? Perception & Psychophysics, 61, 397–402.
Loomis, J. M., Philbeck, J. W., & Zahorik, P. (2002). Dissociation of location and shape in visual space. Journal of Experimental Psychology: Human Perception and Performance, 28, 1202–1212.
Maguire, E. A., Burgess, N., Donnett, J. G., Frackowiak, R. S. J., Frith, C. D., & O'Keefe, J. (1998). Knowing where and getting there: A human navigation network. Science, 280, 921–924.
Marlinsky, V. V. (1999a). Vestibular and vestibulo-proprioceptive perception of motion in the horizontal plane in blindfolded man. I. Estimations of linear displacement. Neuroscience, 90, 389–394.
Marlinsky, V. V. (1999b). Vestibular and vestibulo-proprioceptive perception of motion in the horizontal plane in blindfolded man. III. Route inference. Neuroscience, 90, 403–411.
Marr, D. (1982). Vision. San Francisco: W. H. Freeman.
McCready, D. (1985). On size, distance, and visual angle perception. Perception & Psychophysics, 37, 323–334.
Medendorp, W. P., Goltz, H. C., Vilis, T., & Crawford, J. D. (2003). Gaze-centered updating of visual space in human parietal cortex. Journal of Neuroscience, 23, 6209–6214.
Medendorp, W. P., Van Asselt, S., & Gielen, C. C. A. M. (1999). Pointing to remembered visual targets after active one-step self-displacements within reaching space. Experimental Brain Research, 125, 50–60.
Messing, R., & Durgin, F. H. (2005). Distance perception and the visual horizon in head-mounted displays. ACM Transactions on Applied Perception, 2, 234–250.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. New York: Oxford University Press.
Mittelstaedt, M. L., & Glasauer, S. (1991). Idiothetic navigation in gerbils and humans. Zoologische Jahrbücher, Abteilung für allgemeine Zoologie und Physiologie der Tiere, 95, 427–435.
Mittelstaedt, M. L., & Mittelstaedt, H. (2001). Idiothetic navigation in humans: Estimation of path length. Experimental Brain Research, 139, 318–332.
Mohler, B. J., Creem-Regehr, S. H., & Thompson, W. B. (2006, July 28–30). The influence of feedback on egocentric judgments in real and virtual environments. Proceedings of Symposium on Applied Perception in Graphics and Visualization (pp. 9–14), Boston, MA.
Ooi, T. L., & He, Z. (2007). A distance judgment function based on space perception mechanisms: Revisiting Gilinsky's (1951) equation. Psychological Review, 114, 441–454.
Ooi, T. L., Wu, B., & He, Z. J. (2001). Distance determined by the angular declination below the horizon. Nature, 414, 197–200.
Ooi, T. L., Wu, B., & He, Z. J. (2006). Perceptual space in the dark affected by the intrinsic bias of the visual system. Perception, 35, 605–624.
Oyama, T. (1977). Analysis of causal relations in the perceptual constancies. In W. Epstein (Ed.), Stability and constancy in visual perception: Mechanisms and processes (pp. 183–216). New York: Wiley.
Philbeck, J. W., Behrmann, M., Black, S. E., & Ebert, P. (2000). Intact spatial updating during locomotion after right posterior parietal lesions. Neuropsychologia, 38, 950–963.
Philbeck, J. W., Behrmann, M., Levy, L., Potolicchio, S. J., Jr., & Caputy, A. J. (2004). Path integration deficits during linear locomotion after human medial temporal lobectomy. Journal of Cognitive Neuroscience, 16, 510–520.
Philbeck, J. W., Klatzky, R. L., Behrmann, M., Loomis, J. M., & Goodridge, J. (2001). Active control of locomotion facilitates nonvisual navigation. Journal of Experimental Psychology: Human Perception and Performance, 27, 141–153.
Philbeck, J. W., & Loomis, J. M. (1997). Comparison of two indicators of visually perceived egocentric distance under full-cue and reduced-cue conditions. Journal of Experimental Psychology: Human Perception and Performance, 23, 72–85.
Philbeck, J. W., Loomis, J. M., & Beall, A. C. (1997). Visually perceived location is an invariant in the control of action. Perception & Psychophysics, 59, 601–612.
Philbeck, J. W., O'Leary, S., & Lew, A. L. B. (2004). Large errors, but no depth compression, in walked indications of exocentric extent. Perception & Psychophysics, 66, 377–391.
Post, R. B., & Welch, R. B. (1996). Is there dissociation of perceptual and motor responses to figural illusions? Perception, 25, 1179–1188.
Proffitt, D. R., Stefanucci, J., Banton, T., & Epstein, W. (2003). The role of effort in distance perception. Psychological Science, 14, 106–113.
Proffitt, D. R., Stefanucci, J., Banton, T., & Epstein, W. (2006). Reply to Hutchison and Loomis. The Spanish Journal of Psychology, 9, 340–342.
Redish, A. D. (1999). Beyond the cognitive map: From place cells to episodic memory. Cambridge, MA: MIT Press.
Richardson, A. R., & Waller, D. (2005). The effect of feedback training on distance estimation in virtual environments. Applied Cognitive Psychology, 19, 1089–1108.
Richardson, A. R., & Waller, D. (2007). Interaction with an immersive virtual environment corrects users' distance estimates. Human Factors, 49, 507–517.
Rieser, J. J. (1989). Access to knowledge of spatial structure at novel points of observation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1157–1165.
Rieser, J. J., Ashmead, D. H., Talor, C. R., & Youngquist, G. A. (1990). Visual perception and the guidance of locomotion without vision to previously seen targets. Perception, 19, 675–689.
Rieser, J. J., Pick, H. L., Ashmead, D. H., & Garing, A. E. (1995). Calibration of human locomotion and models of perceptual-motor organization. Journal of Experimental Psychology: Human Perception and Performance, 21, 480–497.
Rieser, J. J., & Rider, E. (1991). Young children's spatial orientation with respect to multiple targets when walking without vision. Developmental Psychology, 27, 97–107.
Rossetti, Y. (1998). Implicit short-lived motor representations of space in brain damaged and healthy subjects. Consciousness and Cognition, 7, 520–558.
Russell, B. (1948). Human knowledge: Its scope and limits. New York: Simon & Schuster.
Sahm, C. S., Creem-Regehr, S. H., Thompson, W. B., & Willemsen, P. (2005). Throwing versus walking as indicators of distance perception in similar real and virtual environments. ACM Transactions on Applied Perception, 2, 35–45.
Sedgwick, H. A. (1986). Space perception. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance: Vol. 1. Sensory processes and perception (pp. 21.1–21.57). New York: Wiley.
Sereno, M. I., Pitzalis, S., & Martinez, A. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science, 294, 1350–1354.
Sholl, M. J. (1989). The relation between horizontality and rod-and-frame and vestibular navigational performance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 110–125.
Sinai, M. J., Ooi, T. L., & He, Z. J. (1998). Terrain influences the accurate judgement of distance. Nature, 395, 497–500.
Smith, P. C., & Smith, O. W. (1961). Ball throwing responses to photographically portrayed targets. Journal of Experimental Psychology, 62, 223–233.
Smythies, J. R. (1994). The walls of Plato's cave: The science and philosophy of brain, consciousness, and perception. Aldershot, UK: Avebury.
Speigle, J. M., & Loomis, J. M. (1993). Auditory distance perception by translating observers. Proceedings of IEEE Symposium on Research Frontiers in Virtual Reality, San Jose, CA, October 25–26, 1993.
Steenhuis, R. E., & Goodale, M. A. (1988). The effects of time and distance on accuracy of target-directed locomotion: Does an accurate short-term memory for spatial location exist? Journal of Motor Behavior, 20, 399–415.
Thompson, W. B., Mohler, B. J., & Creem-Regehr, S. H. (2005, May). Does perceptual-motor recalibration of locomotion depend on perceived self motion or the magnitude of optical flow? Paper presented at the annual meeting of the Vision Sciences Society, Sarasota, FL.
Thompson, W. B., Willemsen, P., Gooch, A. A., Creem-Regehr, S. H., Loomis, J. M., & Beall, A. C. (2004). Does the quality of the computer graphics matter when judging distances in visually immersive environments? Presence, 13, 560–571.
Thomson, J. A. (1983). Is continuous visual monitoring necessary in visually guided locomotion? Journal of Experimental Psychology: Human Perception and Performance, 9, 427–443.
Toye, R. C. (1986). The effect of viewing position on the perceived layout of space. Perception & Psychophysics, 40, 85–92.
Wagner, M. (1985). The metric of visual space. Perception & Psychophysics, 38, 483–495.
Weiskrantz, L. (1986). Blindsight: A case study and implications. Oxford: Oxford University Press.
Whishaw, I. Q., Hines, D. J., & Wallace, D. G. (2001). Dead reckoning (path integration) requires the hippocampal formation: Evidence from spontaneous exploration and spatial learning tasks in light (allothetic) and dark (idiothetic) tests. Behavioural Brain Research, 127, 46–69.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2004). Perceiving distance: A role of effort and intent. Perception, 33, 577–590.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2005). Tool use affects perceived distance but only when you intend to use it. Journal of Experimental Psychology: Human Perception and Performance, 30, 880–888.
Wong, E., & Mack, A. (1981). Saccadic programming and perceived location. Acta Psychologica, 48, 123–131.
Worsley, C. L., Recce, M., Spiers, H. J., Marley, J., Polkey, C. E., & Morris, R. G. (2001). Path integration following temporal lobectomy in humans. Neuropsychologia, 39, 452–464.
Wu, B., Klatzky, R. L., Shelton, D., & Stetten, G. (2005). Psychophysical evaluation of in-situ ultrasound visualization. IEEE Transactions on Visualization and Computer Graphics (TVCG), 11, 1–10.
Wu, B., Ooi, T. L., & He, Z. J. (2004). Perceiving distances accurately by a directional process of integrating ground information. Nature, 428, 73–77.
Zahorik, P., Brungart, D. S., & Bronkhorst, A. W. (2005). Auditory distance perception in humans: A summary of past and present research. Acta Acustica United with Acustica, 91, 409–420.
2 Bodily and Motor Contributions to Action Perception
Günther Knoblich
Over the last 50 years, cognitive scientists have been on a hunt for a general architecture of cognition, initially with great enthusiasm (Newell, 1990; Newell, Shaw, & Simon, 1956; Newell & Simon, 1972), but facing an increasing number of problems later on. Although there are still attempts to define such general frameworks (Anderson et al., 2004; Kieras & Meyer, 1997; Wray & Jones, 2005), their impact on defining the research agenda for cognitive science seems to have dropped. What is the reason for this development? One of the main problems seems to be that most of these frameworks implicitly assume that cognition is detached from the world and from the body: Perception consists of translating physical stimulation into symbolic representations. Action consists in the manipulation of mental content or symbolic commands to the motor system, and it is frequently fully controlled by the cognitive system. Mechanisms for action execution are often underspecified.

At present, a countermovement has set in, and embodiment is the new keyword. However, different researchers use this term in very different ways that can be traced back to James Gibson (1979), Maurice Merleau-Ponty (1945), Jean Piaget (1969), and William James (1890). Wilson (2002) distinguished six different, but related, theoretical assumptions for which the term embodiment has been used. These assumptions span from radical interactionism, the claim that
environment and cognitive system cannot be separated because of the dense information flow, to situated cognition, the claim that cognition is situated in particular perception-action contexts.

In this article I will focus on a particular brand of embodiment theory that stresses the close links between perception and action and assigns them an important role for cognition in general. The functional version of this theory is known as the common coding theory of perception and action (Hommel, Muesseler, Aschersleben, & Prinz, 2001; Prinz, 1997), and the neuronal version is known as mirror system theory (cf. Rizzolatti & Craighero, 2004). Intellectual precursors include William James's ideomotor principle (1890) and the motor theory of speech perception (Liberman & Whalen, 2000).

Basically, the common coding theory generalizes James's ideomotor principle (James, 1890) and applies it to action perception (Greenwald, 1970; Hommel et al., 2001; Prinz, 2002). Originally, the ideomotor principle was postulated to explain voluntary action. It states that imagining an action will create a tendency to carry it out. This tendency will automatically lead to the execution of the action when no antagonistic mental images are simultaneously present (James, 1890, Vol. 2, p. 526). The common coding theory adds to this claim that the mental images (or representations, in more modern terms) do not code actions per se (Prinz, 1997), but the distal perceptual events they produce. This creates a common medium for perception and action that leads to a functional equivalence of perceptual representations and action representations. As a consequence, action representations should become activated whenever one perceives an action that is similar to an action one is able to perform.

A growing body of neurophysiological evidence suggests that common coding of perception and action is implemented on a neuronal level (e.g., Blakemore & Decety, 2001; Decety & Grezes, 1999; Rizzolatti & Craighero, 2004). Rizzolatti and his colleagues provided evidence for "mirror neurons" in the premotor and parietal cortices of macaque monkeys (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Kohler et al., 2002; Umiltà et al., 2001). These neurons fire when the monkey carries out object-directed actions. The surprising finding is that these "motor" neurons also fire when the monkey observes the experimenter carrying out object-directed actions. Newer findings suggest that mirror neurons in the parietal cortex code action goals (Fogassi et al., 2005). Positron emission tomography (PET) and
functional magnetic resonance imaging (fMRI) studies suggest that humans possess a similar mirror system that involves premotor and parietal cortical areas (e.g., Iacoboni et al., 1999; Koski et al., 2002). Rizzolatti a nd Cr aighero (2004) provide a m ore elaborate de scription of findings about t he m irror s ystem i n monkeys a nd humans and of t he d ifferences between t he monkey mirror system a nd t he human mirror system. In wh ich se nse is t his co mmon cod ing/mirroring t heory a n embodied approach? It shares at least two of the six basic assumptions that Wilson (2002) identified as underlying different approaches to embodied c ognition: Fi rst, it s tresses t hat t he u ltimate f unction of the mind is to guide action, and that therefore, a better understanding o f per ception-links i s n ecessary i n o rder t o be tter u nderstand the m ind. S econd, it st resses t hat off-line cog nition i s body-based, and i n pa rticular, t hat ac tion obs ervation l eads t o t he ac tivation of s tructures t hat o ne u ses t o p erform a nd e xecute t he o bserved actions. This d oes n ot n ecessarily i mply a f ocus o n t he pa rticular anatomical structure of the body. Rather, any perceptual event that can potentially result f rom one’s own ac tions leads to a r esonance with the action system. For instance, hearing the sound of a hammer on wood will activate action representations involved in hammering (at least if one has performed hammering actions producing similar sounds earlier in one’s life). Furthermore, the common coding theory is sympathetic but not necessarily c ommitted t o th ree fu rther e mbodiment c laims th at Wilson ( 2002) ha s i dentified: ( 1) cog nition i s s ituated i n t he r eal world a nd i nherently i nvolves perception a nd ac tion; (2) ac ting i n the r eal w orld i nvolves t ime co nstraints; a nd (3) t he en vironment is used to offload cog nitive workload. 
Finally, the common coding theory is not easily reconciled with radical interactionist approaches to embodied cognition (e.g., newer versions of Gibson's ecological psychology), which claim that the information flow between organism and environment is too dense to allow any meaningful characterization of cognition that does not include the environment at the same time. Common coding focuses on the goal-directed and intentional nature of human action (see Barresi & Moore, 1997, for related ideas), and postulates that internal states cause overt action. Thus it is a representational theory. However, there is no compelling reason to assume that the representations involved are propositional or symbolic, as suggested in
some versions of the theory (e.g., Hommel et al., 2001). A weaker notion of representation, where representations are regarded as blueprints (A. Clark, 1997), as relational schemas (Barresi & Moore, 1997; Knoblich & Flach, 2001), or as regions in an attractor space (e.g., Spivey, Grosjean, & Knoblich, 2005), seems to be more appropriate. The notion of an attractor space fits nicely with the common coding theory's postulate of a similarity-based matching between perception and action. The assumption that perception and action use continuous, graded representations makes it easy to explain how perception and action can be matched in a multidimensional space, and provides a straightforward link between the functional principle of common coding and the rapidly growing empirical evidence obtained in the cognitive neurosciences. If one assumes propositional representations, the similarity principle loses much of its power. For instance, if one categorizes continuous dimensions (left…right) into distinct propositions (e.g., far left, near left, near right, far right), a straightforward match occurs only if two events fall into exactly the same category (of course, this could be remedied by postulating additional processes). Closely related are theories postulating that we internally simulate or emulate the actions we observe in others (Blakemore & Decety, 2001; Grush, 2004; Jeannerod, 2001; Wilson & Knoblich, 2005; Wolpert, Doya, & Kawato, 2003). It is important not to confuse the meaning of "simulation" in this context of "action simulation" with the meaning of "simulation" in the context of the theory of mind debate (e.g., Dokic & Proust, 2002; Goldman, 2006; Harris, 1995). In the latter debate, simulation refers to putting oneself into another person's shoes.
In the context of research on action planning and action perception, simulation refers to predictive mechanisms or internal models that are used to plan and execute one's own actions. These models predict the sensory or perceptual consequences of actions. One of their main functions is to bridge timing delays between the issuing of motor commands and the arrival of reafferent information from the sensory organs in the central nervous system (Wolpert & Kawato, 1998). The basic idea behind action simulation theories is that matching perceived actions to our own action repertoire allows us to exploit such predictive mechanisms in our motor system in order to predict the future consequences of others' actions. The obvious functional advantage of this type of action simulation is that no separate perceptual prediction mechanisms are needed for predicting the outcomes of others' actions.
In the rest of this chapter I will provide an overview of empirical studies that provide converging evidence for the claims of the common coding theory and action simulation theories. The main claim of the common coding theory is that perceived actions are matched to one's own action repertoire. Theories of action simulation add that this match activates predictive motor mechanisms that allow one to predict the future outcome of others' actions. These predictions, in turn, might help to stabilize (Wilson & Knoblich, 2005) and to temporally structure (Thornton & Knoblich, 2006) perception. There are several routes to testing these claims. I will focus on results from four different lines of empirical research. The first line of research shows that motor laws that hold in action execution also hold in action perception and motor imagery. The logic of this research is that if perception and action both rely on a common coding system, one would expect that principles that govern action execution should also govern one's perception of others' actions and the way one imagines one's own actions. The second line of research demonstrates that acquiring expertise in a certain action domain profoundly affects the perception of the corresponding actions and their effects. The logic of this research is that acquiring new motor skills leads to the acquisition of new motor representations or to the modification of existing ones. According to the common coding principle, such changes in the motor repertoire should affect action perception. In particular, people should resonate more when they observe actions they can perform well than when they observe actions they cannot perform or cannot perform well. The third line of research suggests that one's own previous actions are a special object of perception, because they maximally activate common representations for perception and action.
This leads to self-identification and more accurate predictions for self-produced actions. Finally, the fourth line of research provides evidence that the ability to sense the periphery of the body is one of the necessary conditions for some forms of action simulation.
Motor Laws and Action Perception

If perception and action both rely on a common coding system, one would expect that the principles that constrain the production of one's own movements also constrain one's perception of others' movements and their effects. Research on the two-thirds power law
(Lacquaniti, Terzuolo, & Viviani, 1983), on the apparent motion of the human body (Shiffrar & Freyd, 1990, 1993), and on Fitts's law (Fitts, 1954) provides strong evidence in support of this claim.

Two-Thirds Power Law

The two-thirds power law (Viviani, 2002; Viviani, Baud-Bovy, & Redolfi, 1997; Viviani & Stucchi, 1989, 1992) describes a lawful relationship between the velocity of a movement and the curvature of its trajectory. I will illustrate the underlying principle with a simple example. Imagine repeatedly drawing a horizontally elongated ellipse on a piece of paper, as fast as possible. In the middle of the ellipse the curvature is low (almost a straight line), but at both ends the curvature increases until a direction change occurs (from left to right or from right to left). The two-thirds power law states that, as the curvature increases, one needs to systematically decelerate one's movement. Conversely, to the extent that the curvature decreases, one is able to systematically speed up again. The amount of deceleration or acceleration is directly proportional to the change in curvature. The two-thirds power law holds for most types of human movement. For instance, tracking studies show that people cannot accurately track the movement of a target when it deviates from the two-thirds power law. This is true for manual tracking (Viviani, Campadelli, & Mounoud, 1987; Viviani & Mounoud, 1990) as well as for tracking a target with one's eyes (DeSperati & Viviani, 1997). Most important in the present context is the finding that this motor law also constrains the way we perceive motion. Viviani and Stucchi (1989) asked participants to estimate the eccentricity of ellipsoidal movements they observed. These estimates were biased towards the velocity profile predicted by the two-thirds power law.
In another study, Viviani and Stucchi (1992) asked participants to adjust the velocity of a randomly moving dot (tracing scribbles) to be constant. Surprisingly, participants perceived the dot as moving with constant velocity when it actually accelerated and decelerated according to the two-thirds power law. Thus perception was clearly shifted towards perceiving a dot that moved according to the constraints of the human motor system as moving with constant velocity. Similar effects have been observed in the kinesthetic modality, where a robot moved the participants' arms along different elliptical trajectories (Viviani, Baud-Bovy, & Redolfi, 1997).
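The relationship can be made concrete with a small numerical sketch (my illustration, not taken from the studies above; the gain constant K and the ellipse dimensions are arbitrary assumptions). The law can be written as tangential velocity v = K·R^(1/3), where R is the radius of curvature, which is equivalent to the angular-velocity form A = K·C^(2/3):

```python
import math

def ellipse_curvature(theta, a, b):
    """Curvature of the ellipse x = a*cos(theta), y = b*sin(theta)."""
    return (a * b) / (a**2 * math.sin(theta)**2 + b**2 * math.cos(theta)**2) ** 1.5

def two_thirds_velocity(theta, a, b, K=1.0):
    """Tangential velocity predicted by the two-thirds power law:
    v = K * R**(1/3), with R = 1/curvature (radius of curvature).
    K is an arbitrary gain constant for illustration."""
    R = 1.0 / ellipse_curvature(theta, a, b)
    return K * R ** (1.0 / 3.0)

# Horizontally elongated ellipse (a > b): curvature is high at the
# ends (theta = 0, pi) and low along the flat middle (theta = pi/2).
a, b = 4.0, 1.0
v_end = two_thirds_velocity(0.0, a, b)            # sharp turn -> slow
v_middle = two_thirds_velocity(math.pi / 2, a, b) # flat stretch -> fast
```

With a > b, the predicted velocity is lowest at the sharply curved ends of the ellipse and highest along its flat middle, matching the deceleration into direction changes described above.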
Further studies show that people's ability to predict the future course of a handwriting trajectory (Kandel, Orliaguet, & Boe, 2000) breaks down when the observed writing trajectory is manipulated so that it no longer corresponds to the two-thirds power law (Kandel, Orliaguet, & Viviani, 2000). Finally, Flach and colleagues (2004a) demonstrated that forward displacements in perceived movement direction (representational momentum, cf. Hubbard, 1995, 2005; Kerzel, Jordan, & Muesseler, 2001) are reduced when the movement follows the rules of the two-thirds power law. Thus it seems to be easier to anticipate the future course of a movement when its velocity profile changes according to "human" characteristics. Taken together, the findings on the two-thirds power law provide overwhelming evidence that motor constraints can profoundly affect movement perception.

Apparent Motion of the Human Body

Further support for the claim that bodily and motor constraints affect action perception comes from research on apparent body motion (Shiffrar, this volume; Shiffrar & Freyd, 1990, 1993; Shiffrar & Pinto, 2002; Stevens, Fonlupt, Shiffrar, & Decety, 2000). These findings show that one perceives an anatomically plausible movement path in an apparent motion display of body movements (e.g., a hand going around the head), although according to the classical laws of apparent motion (Korte, 1915) one should perceive the shortest, but anatomically implausible, path (e.g., the hand going through the head). However, this is only true for movement speeds that lie within a range a human actor could achieve. At fast movement speeds (short SOAs) the shortest path is perceived. For movements of nonhuman objects one always perceives the shortest path, regardless of whether physical constraints are violated and regardless of the perceived movement speed.
One important part of the explanation for this finding is that a multimodal body schema provides important contributions to human body perception (Funk, Shiffrar, & Brugger, 2005; cf. Knoblich, Thornton, Grosjean, & Shiffrar, 2006). However, this assumption does not fully explain why the anatomically possible path is only perceived within a time range that corresponds to movement speeds that are actually possible for humans to achieve. Thus it is possible that another part of the explanation is that perception of the
anatomically possible path is partly driven by contributions from the motor system. The observer might covertly simulate performing the observed movement (Wilson & Knoblich, 2005). Such a simulation might be a precondition for perceiving the anatomically plausible movement path.

Fitts's Law

As a final example, a recent study by Grosjean, Shiffrar, and Knoblich (2007) provides direct evidence that Fitts's law (Fitts, 1954) holds in action perception. It could be argued that Fitts's law is the best-studied and most robust principle of human motor performance (Plamondon & Alimi, 1997). It states that the time it takes to move as fast as possible between two targets is determined by their width and the distance between them. As the target size increases, one is able to move faster without missing the target. As the distance between the targets increases, one needs longer to move between them without missing them. Thus there is a trade-off that is often referred to as the speed-accuracy trade-off. Fitts's law describes this trade-off as:

MT = a + b·ID,

where MT is movement time, ID is the index of difficulty of the movement, and a and b are empirical constants. The critical variable is the index of difficulty, which relates the amplitude (A) of the movement to the width (W) of the targets:

ID = log2(2·A/W).

Thus different combinations of amplitude and target width can yield the same index of difficulty, and accordingly, the same movement time. For instance, Fitts's law predicts the same movement times for targets that are 2 cm wide and 8 cm apart and for targets that are 8 cm wide and 32 cm apart (both have ID = 3). It has been demonstrated that Fitts's law holds for different types of movement (discrete and cyclical), different effectors (finger, arm, and head), and different contexts (under a microscope and under water).
Moreover, Fitts's law holds not only when one actually executes movements, but also when one merely imagines performing them (Decety & Jeannerod, 1995).
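The worked example above (the same ID = 3 for both target configurations) can be checked directly. A minimal sketch: the intercept a and slope b are empirical constants, so the values below are arbitrary placeholders.

```python
import math

def index_of_difficulty(amplitude, width):
    """Fitts's index of difficulty: ID = log2(2*A / W)."""
    return math.log2(2 * amplitude / width)

def movement_time(amplitude, width, a=0.1, b=0.2):
    """Predicted movement time MT = a + b * ID.
    The constants a and b are empirically fitted per task;
    the defaults here are placeholders for illustration."""
    return a + b * index_of_difficulty(amplitude, width)

# Different amplitude/width combinations, same index of difficulty:
id_narrow = index_of_difficulty(8, 2)   # targets 2 cm wide, 8 cm apart
id_wide = index_of_difficulty(32, 8)    # targets 8 cm wide, 32 cm apart
# Both give ID = 3, so Fitts's law predicts equal movement times.
```

Because ID depends only on the ratio 2·A/W, scaling amplitude and width together leaves the predicted movement time unchanged, which is exactly the equivalence exploited in the text's 2 cm/8 cm versus 8 cm/32 cm example.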
In order to determine whether Fitts's law holds in action perception, Grosjean and colleagues (2007) asked participants to judge two alternating pictures of a person moving at various speeds between two targets. The targets varied in amplitude and width. There were three different amplitude/width combinations for each of the three IDs studied (2, 3, 4). Participants could watch these displays at leisure until they felt ready to report whether the observed person could perform such movements without missing the targets. Alternating pictures were chosen instead of videos to avoid any influence of movement trajectory information, which is not addressed by Fitts's law. Perceived movement times were defined in terms of the speeds at which participants provided an equal proportion of "possible" and "impossible" judgments. The results showed a perfect linear relationship (r² = .96) between the index of difficulty and the movement time that was perceived as just possible for the observed person. This implies that the perceived movement time did not vary as a function of the target width or the movement amplitude (distance between targets) alone. Rather, the same speed-accuracy trade-off that is present in action production and motor imagery also governed action perception. Such a result is very hard to explain in purely perceptual terms. Rather, it provides strong evidence for motor contributions to action perception.

Expertise and Action Perception

The claim that action shapes perception implies that acquiring new motor skills should affect one's perception of others' actions that require the same skill to be performed. Thus as one makes progress in learning to play the piano, one's perception of piano playing should become increasingly linked to the action representations that govern one's own piano playing. Likewise, learning new dance movements or becoming a ski expert should affect one's perception of dancing and skiing.
Note that this does not necessarily imply that the action system remains silent when nonexperts observe the actions of highly skilled experts. Even hardboiled couch potatoes watching a soccer or basketball game will at times have been forced to carry out the general types of actions the observed players are performing (e.g., running, kicking, throwing), and should therefore resonate with the observed actions. However, experts in a certain domain who watch other experts should show a higher degree of resonance, because
the action knowledge they can apply to the observed actions will be more elaborate. Several recent studies of musicians, dancers, and athletes suggest that expertise with particular motor skills does indeed result in closer links between perception and action. In this section I will discuss studies that support two predictions arising from the claim that motor skills can affect perception: (1) watching actions that one is an expert in performing should, compared to nonexpert actions, lead to higher activation of brain areas that are related to action planning and motor performance, and (2) the increasing resonance of perceived actions with one's own action repertoire that results from the acquisition of new motor skills should alter the perception of actions which require these skills to be performed (Wilson & Knoblich, 2005).

Haueisen and Knoesche (2001; see also Bangert, Parlitz, & Altenmueller, 1999), using magnetoencephalography (MEG), found that pianists who listened to recordings of piano pieces showed much higher activation of areas in primary motor cortex than musically trained nonpianists (choir singers). This demonstrates that listening to the perceptual consequences of highly trained actions activates the corresponding motor programs in experts. In a recent functional brain imaging study, Haslinger and colleagues (2005) have shown that when an expert pianist observed another pianist performing finger movements related to piano playing, brain networks that support action planning and action execution were activated. This activation did not occur in novices and was not observed for finger movements that were unrelated to piano playing. Even when expert pianists watched silent piano playing, auditory sensory areas were activated. This suggests that the observation of finger movements led them to recover the auditory consequences of the visually observed actions.
Highly skilled pianists seem to perceive melodies when observing others silently playing the piano.

The role of expertise in action perception has been further addressed in an elegant brain imaging study on ballet dancers and capoeira dancers (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005). Capoeira is an Afro-Brazilian martial art dance that can be highly artistic and requires skills that are quite different from the ones required for ballet dancing. For instance, whereas ballet dancers often hardly touch the ground, extensive "groundwork" is key in capoeira dancing. The kinematics of the performed movements differ between the two dance styles. Whereas ballet dancers most often perform elegant, swift, and airy movements, capoeira dancers prefer sweeps, kicks, and head-bangs. Calvo-Merino and her colleagues asked whether the two groups of dance experts would show higher activation of action-related brain areas (the human mirror system) when observing their own dance style. Ballet dancers observed videos of ballet dancing and capoeira dancing, and capoeira dancers observed videos of ballet dancing and capoeira dancing. As predicted, capoeira dancers showed higher activation of the mirror system (premotor cortex, intraparietal sulcus, and superior temporal sulcus) when observing capoeira dancing as compared to ballet dancing. In contrast, ballet dancers showed higher activation in the same areas when observing videos of ballet dancing. It should be noted that in both groups of dancers the mirror system was activated during the observation of both dance styles. Thus the dancers' mirror system also responded to some extent when they observed a different dancing style. Together, the results suggest that being an expert in a particular domain has very specific influences on action perception. The more similar the observed actions are to the actions one is an expert in performing, the higher the resonance of the action system with the observed movement.

One possible function of this higher resonance could be to better identify the intentions underlying observed actions. Testing this hypothesis, Sebanz, Zisa, and Shiffrar (2006) asked whether basketball experts are better than novices at deriving deceptive intentions from the movements of their opponent. In particular, they investigated whether basketball players are better able to detect fakes (pretending to pass the ball to a teammate, but actually keeping it) from pictures, videos, and dynamic movement displays.
The static pictures depicted the exact moment at which the ball left or did not leave the observed player's hands. The videos and the dynamic movement displays ended with exactly the same frame that was displayed in the static picture condition. Sebanz and colleagues found that only experts could use the movement information provided in videos to detect whether the observed player was performing a pass or a fake. In this condition the experts performed significantly better than novices. When static pictures were displayed, experts and novices were hardly better than chance, and there was no difference between the groups. A second experiment showed that only experts could identify fakes from dynamic movement displays of a
basketball player, whereas novices' identification did not differ from chance level. Thus basketball experts seem to have an improved ability to derive deceptive intentions from bodily actions.

Direct evidence for the assumption that the acquisition of new skills leads to a close coupling between certain actions and their perceivable effects comes from a single-cell study on monkeys (Kohler et al., 2002). Kohler and colleagues investigated whether mirror neurons respond to auditory stimuli. In addition to "noise-producing" actions that are already in the monkey's repertoire, such as cracking nuts, they also trained monkeys to perform a number of new actions that produced particular noises, such as ripping a piece of paper. They found that premotor neurons generally fired in response to auditory stimuli that reflected the consequences of actions the monkey could perform, regardless of whether the actions belonged to the monkey's natural repertoire or were recently acquired.

Two recent studies have taken the next important step in exploring how action perception changes as a particular person acquires a particular new skill. Casile and Giese (2006) trained students to perform new gait patterns and investigated whether perceptual accuracy for identification of these gait patterns improved. Specifically, they trained their participants to perform "funny" arm movements with a 270° phase relationship while walking (e.g., left arm halfway between front and back, right arm in the back). During normal gait this phase relationship is approximately 180° (for example, left arm in the front and right arm in the back). Performing the funny arm movements with the 270° relationship is impossible without training, but training can lead to quite steep improvement in performance, at least for some people. The participants were blindfolded during training. Only verbal and haptic feedback was provided.
Casile and Giese (2006) performed the training in the dark to rule out any effects of visual familiarity and visual cues during learning. Before and after the training, the participants were asked to judge whether two consecutively presented point-light displays of human gait were the same or different. One of the displays depicted one of three different gait prototypes: a 180° phase relationship (normal gait), a 225° phase relationship (untrained funny gait), or a 270° phase relationship (to-be-trained/trained funny gait). The second display was either identical to the prototype presented first or differed slightly from it. The main dependent variable was the accuracy of the same-different judgments.
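The phase manipulation can be sketched with an idealized sinusoidal model of arm swing (my illustration; the unit amplitude and frequency are arbitrary assumptions, not parameters from the study):

```python
import math

def arm_angles(t, phase_deg, freq_hz=1.0):
    """Left and right arm swing angles (arbitrary units) at time t,
    with the right arm shifted by the given phase relationship.
    Normal gait is roughly 180 deg (anti-phase); the trained
    "funny" gait in Casile and Giese (2006) used 270 deg."""
    w = 2 * math.pi * freq_hz
    left = math.sin(w * t)
    right = math.sin(w * t + math.radians(phase_deg))
    return left, right

# At 180 deg the arms are in strict anti-phase: when the left arm is
# maximally in front, the right arm is maximally in back.
l180, r180 = arm_angles(0.25, 180)
# At 270 deg the right arm lags a further quarter cycle: when the left
# arm is at its extreme, the right arm is at mid-swing -- the unusual
# coordination pattern participants had to learn.
l270, r270 = arm_angles(0.25, 270)
```

The 180° case yields mirror-image arm positions at every instant, whereas the 270° case decouples the arms from that familiar anti-phase pattern, which is what makes the trained gait look and feel "funny."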
Before the training, participants were quite accurate for displays of normal gait (180° displays), less accurate for the 225° funny gait displays, and even less accurate for the 270° funny gait displays. After participants had received training in performing the 270° gait pattern in the dark, they selectively improved in their visual same-different judgments for the 270° displays, but not for the other types of displays. Thus there was a very specific effect of motor training on visual perception that was not mediated by visual cues. Moreover, the degree of improvement in the visual task was highly correlated with participants' improvement during training. The more successful participants had been in acquiring the funny gait pattern, the larger was their improvement in the visual task (after training compared to prior to training).

These results are quite surprising. Why should visual perception change through training during which visual cues are absent? Theories that postulate common coding of perception and action (Prinz, 1997) suggest that as one's action repertoire changes, perception should be affected in a reciprocal way. But how exactly could this work? Thornton and Knoblich (2006) suggested that timing might be the key to explaining why learning a new motor task can affect visual perception. Learning to perform the 270° arm movements in Casile and Giese's experiment mainly requires temporally coordinating familiar action components in a new manner. Each of these components would already be linked to its visual consequences. Thus, learning a new temporal coordination pattern between the action components most likely improved participants' ability to temporally parse the visual elements that were already linked to the action components. This would explain the higher sensitivity to slight temporal deviations when observing point-light displays of the newly acquired 270° gait pattern.
A second study that directly addressed the establishment of close perception/action links during the acquisition of motor skills was recently performed by Cross, Hamilton, and Grafton (2006). They monitored changes in brain activation in expert dancers of a modern dance ensemble as the dancers learned and rehearsed new dance sequences. The dancers practiced these sequences for about five hours per week for five weeks. During this five-week period each of the dancers was scanned with fMRI once per week. The dancers watched videos of sequences that they were currently practicing as well as videos of relatively similar dance sequences that they were currently not
practicing. After watching a video, they imagined themselves performing the observed movement and then rated their own ability to perform that particular movement. As in many previous studies (e.g., Blakemore & Decety, 2001; Grezes & Decety, 2001; Jeannerod, 2001; Rizzolatti & Craighero, 2004), action observation and action imagery led to activation of the human mirror system, including premotor cortex and inferior parietal sulcus. The first critical finding was that activation was stronger for action sequences the dancers were currently rehearsing than for the control sequences they were not currently rehearsing. The second critical finding was that activation in these areas was highly correlated with the dancers' judgments about how well they were able to perform the observed sequences. This is a clear demonstration that activation in the human mirror system flexibly changes when new motor skills are acquired.

Actor Identity, Action Prediction, and Action Coordination

The experiments described in the previous section suggest that the proficiency with which one is able to perform particular actions affects the perception of similar actions in others. However, there is a particular class of actions every single person is an expert in performing: the actions in one's own action repertoire (Knoblich & Flach, 2003). Accordingly, perceiving one's own actions, for instance, when watching a video of oneself dancing or listening to a recording of one's own clapping, should maximize the activation of common representations for perception and action. The reason is that there should be a high degree of similarity between performed and perceived actions in this case, if the same representations that generated a particular action become activated through observation.
As a consequence, one should be able to identify one's own previous actions, to better predict the outcomes of such actions, and to better coordinate new actions with previous ones.
Action Identification

The first experimental psychologist who addressed the identification of self-generated actions was probably the German Gestalt psychologist Werner Wolff (1931). He was interested in the question of whether people "involuntarily express themselves" through their movements. The participants in his experiment were filmed while they walked up and down in a room and carried out a number of actions. They were all dressed in the same loose clothing. This served to remove static anatomical cues. In addition, each film was manipulated to disguise the filmed person's face, thus removing facial cues to recognition. This was an attempt to isolate movement kinematics. When the participants watched these films a few days later, they could recognize themselves much better than they could recognize the other persons, whom they all knew well. Wolff concluded from these results that people are able to recognize their own "individual gait characteristics." However, there are many alternative explanations for this early result. For instance, it is unlikely that the loose clothing effectively removed all anatomical characteristics, such as the size of a person, the width of his or her shoulders, and so on.

More than 40 years later, Cutting and Kozlowski (1977) came up with a modern version of Wolff's self-recognition of gait paradigm that used the point-light technique developed by Johansson (1973). This technique allows one to effectively isolate the movement (kinematic) information from form information. Light sources are attached to the main joints of a person. The person is completely dressed in black and filmed with a video camera in front of a black background. At high contrast the resulting displays show a number of moving dots that give a stunningly vivid impression of human movement and allow one to derive various attributes of the observed actor and the observed actions (for overviews see Casile & Giese, 2005; Thornton, 2006).
These attributes include a person's gender (Cutting, Proffitt, & Kozlowski, 1978), emotions (Dittrich, Troscianko, Lea, & Morgan, 1996; Pollick, Paterson, Bruderlin, & Sanford, 2001), and expectations (Runeson & Frykholm, 1983). Furthermore, the properties of invisible target objects of actions, such as weight, can also be derived (Hamilton, Joyce, Flanagan, Frith, & Wolpert, 2007; Runeson & Frykholm, 1981).

Cutting and Kozlowski asked whether people would be able to recognize themselves and their friends from point-light displays of gait. In contrast to Wolff's earlier results, they found that their participants were not better able to recognize themselves than their friends. In a similar study, Beardsworth and Buckner (1981) found a small advantage for self-recognition over friend-recognition from
point-light displays of gait. It should be noted, though, that the recognition rates for self and friends in both studies were hardly above chance. At first glance it seems as if these results contradict the assumption of close perception/action links. However, recent experiments that looked at self-recognition in different types of bodily motion have provided clear evidence that people are actually quite accurate in identifying themselves from point-light displays. In support of close perception/action links, they also demonstrated that one is much more accurate at identifying oneself than at identifying one's friends (Loula, Prasad, Harber, & Shiffrar, 2005; see also Shiffrar, this volume). A further interesting aspect of these findings is that, whereas people were quite accurate in identifying themselves from point-light displays of dancing or boxing, the recognition rates were hardly above chance for actions like walking and running, just as in previous studies. This suggests that walking was the wrong place to start looking for self-identification effects. But why is it hardly possible to identify one's own walking and running? In hindsight the explanation seems simple. Walking and running are motion patterns that are highly biomechanically constrained. Thus there are hardly any movement cues that would allow one to distinguish one's own walking pattern from a stranger's walking pattern, or a friend's walking pattern from a stranger's walking pattern. Movements like dancing or boxing allow for more individualistic styles and thus provide rich kinematic cues for self-recognition.

All of the studies reviewed so far used bodily movements of human actors as stimuli.
However, the common coding theory (Prinz, 1997, 2002) and recent findings on the mirror system (Kohler et al., 2002) suggest that any perceivable effect of an action can result in resonance, or activation of representations that are used to produce the action. Thus people should be able to identify not only bodily movements as self-generated, but also the visual and auditory effects of different types of actions. Take handwriting as an example. The visual effects of writing and drawing can be described as a simple trajectory of a moving dot with two spatial dimensions and one temporal dimension. Nevertheless, writing and drawing are complex skills that everybody is familiar with (Van Sommers, 1984). Can people identify point-light displays of their writing and drawing? Knoblich and Prinz (2001) addressed this question in a series of experiments. Although it seems quite obvious that people are able
to identify their writing when they are confronted with a finished product (e.g., a page from their diary), it is less clear whether they can identify their writing from a point-light display that provides only movement information (think of writing with a laser pointer on a white wall). The participants came for two individual sessions that were at least one week apart. During the first session, they produced writing samples of a number of familiar symbols (numbers and letters from the Latin script) and unfamiliar symbols (e.g., letters from Thai and Mongolian scripts) on a writing pad. The kinematics of their writing was recorded (sampling rate was 100 Hz). During the whole recording session the participants' writing hand was screened from view. No visual feedback about the emerging trajectory was provided. In addition, the participants were required to follow a certain stroke sequence and stroke direction for each letter. For instance, in order to produce the letter "P" they were required to start with a down-stroke, lift the pen, and produce the curved stroke from top to bottom. This was required because otherwise stroke sequence could have been used as a further potential cue to self-recognition (Flores d'Arcais, 1994). In the second session, participants were asked to identify their own writing. They observed two point-light displays of the production of the same symbol. One display reflected the kinematics of the participant's own writing and the other display reflected another participant's writing. Self- and other-produced displays appeared in random order. The task was simply to decide whether the first or the second display was self-produced. The participants received no feedback about whether their judgments were correct, in order to avoid effects of perceptual learning during the experiment.
The result of a first experiment demonstrated that participants could indeed recognize their own writing based on the minimal information provided by a single moving dot. The same results were observed when self- and other-produced displays were scaled to have the same size and overall duration. Thus, these two potential cues were not crucial for self-identification. However, in an experiment where the dot moved with constant velocity, participants were not able to identify their own writing. The result that particular changes in velocity were crucial for self-identification was further supported by a post-hoc analysis of the characters that led to the highest self-identification rates. It turned out that self-identification was more accurate for characters that required large velocity changes during
production (e.g., characters having corners that lead to a pattern of rapid deceleration followed by rapid acceleration). Interestingly, the accuracy of self–other judgments was not higher for writing samples reproducing familiar symbols. Taken together, these results show that one is able to discriminate one's own handwriting from someone else's on the basis of a single moving dot. Importantly, velocity changes seemed to be crucial for identifying one's own handwriting, whereas the familiarity of the symbol produced did not affect self-identification. This supports the assumption that self-identification is informed by one's own action system. Velocity changes are clearly an action-related parameter, whereas effects of familiarity would have suggested that identification might be based on mere visual experience. In a sense, what participants seem to have recognized is the "rhythm" of their writing (technically speaking, the invariant relative timing of their writing). This seems to suggest that a similar self-identification advantage for the effects of one's own actions should be present in the auditory domain. Thus, Flach, Knoblich, and Prinz (2004a) conducted another series of studies that explored the identification of one's own clapping. In contrast to trajectories of handwriting, it is possible to remove all spatial information from the sounds of clapping. What remains is pure temporal and acoustic information. Repp (1987) reported some evidence that musically trained participants were able to identify their own clapping from a recording. He suggested that his participants used systematic differences in the acoustical patterns to derive information about their individual hand configurations. Flach and colleagues' study aimed to determine whether action-related timing information also provides a cue to self-identification. Again, there were two experimental sessions separated by a week.
In the first session, participants were recorded while they clapped rhythmic patterns of varying complexity. In the second session, participants listened to a recording of clapping and were asked to indicate whether it reflected their own clapping or somebody else's clapping. In this study, participants were assigned to pairs for the recognition session. The two participants in each pair provided judgments for exactly the same recordings. Half of the recordings reflected a participant's own clapping and the other half reflected the other participant's clapping. The same recording that needed to be judged as self-produced by one participant needed to be judged
as other-produced by the other participant. Thus, in this design self-identification cannot be explained by stimulus differences. The results of a first experiment provided clear evidence that participants were able to identify original recordings of their own clapping. Self-identification was not affected by the rhythmic complexity of the clapping pattern. A second experiment assessed whether self-identification was still possible when one listens to a sequence of simple tones (beeps) that reproduce the temporal intervals between the maximum amplitudes of two consecutive clapping sounds. Although the beep sequences retained the general tempo and the relative timing of the original recording, all other acoustic differences were removed. Thus these sequences did not allow the participants to derive their relative hand orientations using acoustical cues (different hand orientations during clapping systematically produce different sounds; Repp, 1987). Surprisingly, participants were almost as accurate in identifying their own clapping from such beep sequences as from the original recordings. Tempo and rhythmic information provided sufficient cues for self-identification. A further experiment assessed the contributions of overall tempo and rhythmic information (relative timing) to self-identification. The participants listened to beep sequences that retained the original relative timing of consecutive intervals between claps, but were replayed in the tempo the other participant had chosen for the same rhythmic sequence. In this experiment, participants were not able to identify their own clapping. This result shows that general tempo as well as relative timing information was used for self-identification. If participants had only used general tempo, they should have mistaken the other participant's clapping for their own.
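The distinction between general tempo and relative timing can be made concrete. The sketch below is a hypothetical illustration, not the authors' analysis code, and all interval values are invented: relative timing is each inter-clap interval divided by the sequence's total duration, and replaying a sequence at a different overall tempo leaves that profile unchanged.

```python
def relative_timing(intervals):
    """Normalize inter-clap intervals (ms) by their total duration."""
    total = sum(intervals)
    return [i / total for i in intervals]

def replay_at_tempo(intervals, new_total):
    """Rescale a sequence to a different overall tempo (ms), preserving
    its relative timing."""
    total = sum(intervals)
    return [i * new_total / total for i in intervals]

# A clapper's rhythm, then the same rhythm replayed at a slower tempo
# (as when a beep sequence is replayed in the other participant's tempo):
own = [300, 150, 150, 600]            # total duration 1200 ms
other_tempo = replay_at_tempo(own, 1500)

# Relative timing is identical, so only general tempo distinguishes them.
assert relative_timing(own) == relative_timing(other_tempo)
```

On this formulation, the tempo-shifted sequences in the third experiment removed exactly one of the two cues while leaving the other intact.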
If they had only used relative timing information, they should have been as accurate as for the beep sequences that retained their own tempo. Taken together, the results show that one can identify one's own actions based on timing information. As in the earlier handwriting study, it is likely that the main cue for self-recognition was a higher activation of participants' own action knowledge when they listened to recordings of their own clapping. Other explanations seem implausible. No cues about relative hand orientation can be derived from beep sequences. The recognition session took place one or two weeks after the recording session, making it highly unlikely that recognition was based on episodic memories of the production session. Also, if episodic memory were crucial, the
original recordings of clapping should have been easier to identify than the beep sequences. If nonmusicians are able to recognize their own clapping based on rhythmic cues, one would expect that musicians excel at identifying their own performances. And, indeed, most expert musicians are convinced that they are able to tell apart their own performance of a particular piece from somebody else's. But exactly which cues allow, say, an expert pianist to identify her or his own performance? An experiment by Repp and Knoblich (2004) addressed this question. In a first session, 12 expert pianists, who were either graduate students at the Yale School of Music or took lessons with its piano faculty, performed 12 excerpts selected from the standard classical piano literature (Bach, Mozart, and Beethoven). The excerpts had durations of 15 to 20 s. A practice session that included a metronome made sure that the performances had roughly the same tempo and that they were largely error-free. Each pianist received auditory feedback for half of the pieces and no auditory feedback for the other half. In addition, the pianists indicated for each piece how familiar it was to them (whether they had heard or played it before). The recognition session took place approximately two months after the recording session. The pianists listened to 12 performances of each piece, 11 performed by other pianists and one self-performed. For each piece they indicated on a 5-point scale the likelihood that it was their own performance. The pianists knew that only one of the 12 performances was their own. The results showed that the pianists were very good at identifying their own performances. In fact, their average rating for their own performances was higher than the highest rating for any of the remaining 11 pianists. Interestingly, self-identification was equally good for pieces that were played with and without auditory feedback in the recording session.
In addition, self-identification was equally good for familiar and unfamiliar pieces. This suggests that expert pianists were able to identify their style of playing even if they were not familiar with a piece and had actually never heard themselves perform it! So what allows them to identify their own playing? To address this question, an additional recognition session was conducted (roughly two to three months after the first recognition session). In this session participants listened to recordings from which all dynamic nuances (expressive dynamics) had been removed, leaving only articulation and expressive timing as cues for self-identification. The pianists recognized these edited recordings as well as the original recordings. Thus, expressive timing and articulation seem to be the main cues for self-identification in expert pianists. Again, the results suggest that self-identification is based on a stronger resonance of the action system with self-generated auditory effects of actions.

Prediction

The previously described studies on action identification show that people can explicitly judge from a recording whether the observed action reflects their previous performance. It was suggested that this ability reflects a more extensive involvement of the motor system in action perception when people observe their own actions. But what is the underlying mechanism? One possibility is that people can "sense" the higher activation of common representations for perception and action that results from the high similarity between perceived actions and the underlying action knowledge (Knoblich & Prinz, 2001). Another possibility is that the higher activation of common representations results in better predictions of the perceptual consequences of actions. As a consequence, observing others' actions leads to larger discrepancies between what is predicted and what is observed (Repp & Knoblich, 2004; Wilson & Knoblich, 2005). One way to test the latter assumption is to test whether people are able to make more accurate predictions when they observe recordings of their own actions than when they observe recordings of somebody else's actions. To address this issue we investigated whether people are better able to predict the landing position of a dart on a target board when they observe a video of their own throwing movement (Knoblich & Flach, 2001).
In the recording session participants were asked to throw darts at the upper, middle, and lower third of a target board until 10 video samples had been collected where the participant intended to hit and actually did hit the upper, middle, and lower third, respectively. After a week, participants returned for a second session where they watched video clips of themselves or somebody else throwing darts. Two participants formed a pair and watched exactly the same stimuli. Each clip showed a side view of a person throwing a dart and ended at the video frame at which the dart had just left the person's hand (the target board was also visible). Participants were asked to predict where on the target board the dart would land.
Participants could predict the landing position of the dart quite well. In initial trials, the predictions were equally accurate for self and other. Only in later trials did the predictions become significantly more accurate for self, although no feedback was provided. Presumably, it took participants some time to adjust to the unfamiliar situation of watching themselves from a third-person perspective. Further experiments varied the amount of information provided about the throwing person. In one of these experiments only the upper body and the throwing arm were visible (the head was hidden behind an occluder to remove cues of gaze direction). In another experiment the whole body of the person except the throwing arm was occluded. Although these manipulations successively decreased the overall accuracy of the predictions, the same pattern of results as in the first experiment was observed for the self–other manipulation. The predictions were equally accurate during the initial trials, and the accuracy selectively increased for self-generated throws in later trials. A possible reason for the initial lack of a self–other difference is that a certain time is needed to adjust the predictions to an unfamiliar perspective. After this adjustment, particular aspects of individual throwing seem to have informed and increased the accuracy of the prediction of the outcome of self-generated dart throwing movements. For a further test of the hypothesis that people are able to more accurately predict the consequences of their own actions, we turned again to the domain of handwriting (Knoblich, Seigerschmidt, Flach, & Prinz, 2002). Participants were asked to write different versions of the digit "2" on a writing pad.
In addition, they produced the first stroke of the digit "2" in isolation (two strokes are needed to produce the digit "2": a curved one that ends at the lower left corner and a consecutive straight one). The kinematics of their writing was recorded. The writing hand was screened from view, so that no visual feedback about the emerging trace was provided. After a week, participants returned for a second session. They observed kinematic displays of curved strokes. These strokes were either produced in isolation or produced in the context of writing the complete digit "2." The latter strokes were obtained by separating the curved stroke from the straight stroke in the kinematic trace. (This was easy to achieve because there is a clear velocity minimum at the transition from the curved stroke to the straight stroke.) A single moving dot reproduced the movement of
the pen tip without painting a static form trace on the monitor. Half of the strokes reflected the participant's own writing and the other half reflected another participant's writing. The task was to decide whether the stroke had been produced as part of the digit "2" or in isolation. Thus participants needed to predict whether the observed stroke was followed by another stroke in the original recording. The results showed that the predictions were at chance for other-produced strokes, but clearly above chance for self-produced strokes. In a further experiment participants were asked to fit their writing within horizontal and vertical auxiliary lines during the production session. The reasoning behind this manipulation was that it should constrain interindividual differences in production (this is why exercise books for first graders use auxiliary lines). Because production would then reflect general laws of movement that govern everybody's performance, predictions should be equally accurate for self and other. And indeed, the results showed that participants' predictions were above chance and equally accurate for self-generated and other-generated strokes. Thus, when production was unconstrained, there was a large variability in production between different persons' actions, and predictions were only accurate if participants observed their own productions. When production was highly constrained, all productions reflected general invariants of human performance (Kandel, Orliaguet, & Boe, 2000), leading to accurate predictions for self and other.
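The velocity minimum used to separate the two strokes of the digit "2" can be illustrated with a short sketch. This is a hypothetical reconstruction, not the study's actual code; the trajectory data and function names are invented. Pen position sampled at 100 Hz yields a tangential speed profile, and the interior minimum of that profile marks the stroke transition.

```python
import math

DT = 0.01  # sampling interval in seconds (100 Hz, as in the study)

def speed_profile(points):
    """Tangential speed between consecutive (x, y) samples."""
    return [math.dist(p, q) / DT for p, q in zip(points, points[1:])]

def stroke_boundary(points):
    """Index of the interior speed minimum, taken as the segment where
    the curved stroke ends and the straight stroke begins."""
    v = speed_profile(points)
    interior = v[1:-1]
    return 1 + interior.index(min(interior))

# Toy trajectory: the pen decelerates into a corner, then accelerates out.
traj = [(0, 0), (0, 2.0), (0, 3.2), (0, 3.6), (0, 3.7), (0.8, 3.7), (2.3, 3.7)]
assert stroke_boundary(traj) == 3  # slowest segment sits at the corner
```

Splitting the trace at this index would separate the two strokes in the way the study describes.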
Coordination

The finding that people are better able to predict future outcomes of their own actions suggests that the higher resonance between perception and action during the observation of self-generated actions also supports predictive mechanisms. Most likely, these mechanisms are similar to the ones that predict the perceptual consequences of one's own actions when one is currently performing them. This raises the question of whether such predictive mechanisms can also help to temporally coordinate one's own actions with those of other people. In order to achieve this, one would often have to synchronize the predicted consequences of one's own action with the predicted consequences of a partner's action. Is such coordination more successful when one is coordinating one's current actions with one's own previous actions? If you are a good dancer, would you be
your best dance partner? More realistically, do expert pianists duet better when they play with themselves? We investigated this question in a recent study (Keller, Knoblich, & Repp, 2007). In a first session we recorded nine expert pianists performing parts of three duet pieces (upper part or lower part, also known as primo and secondo) that were unknown to all pianists (two duets by Carl Maria von Weber and one by Edvard Grieg). Their playing was recorded in MIDI format. In the second session, which took place a couple of months later, the pianists were asked to perform the duet with a recording of their own playing or somebody else's playing (performing the secondo with a recorded primo or vice versa). The variable of interest in this study was the accuracy of synchronization with the recording. We predicted that the temporal synchronization error for notes that are nominally simultaneous in the score would be lower when pianists duet with their own recordings. This is what we found. Furthermore, the pianists were able to identify their own performances after they had performed the duet. In fact, there was a high correlation between self-identification and synchronization error. The lower the synchronization error, the more confident the pianists were that they had performed with their own recording. This suggests that the pianists might have used accuracy of synchronization as a cue to self-identification. Note, however, that the previously discussed study on piano experts (Repp & Knoblich, 2004) showed that pianists can recognize their own playing when they just listen to their own performance. Thus, it is unlikely that accuracy of synchronization was the main cue for self-recognition.
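One plausible way to quantify the synchronization error described here is the mean absolute asynchrony between note onsets that are nominally simultaneous in the score. The sketch below is a hedged illustration, not the study's actual analysis, and all onset times are invented.

```python
def mean_asynchrony(live_onsets, recorded_onsets):
    """Mean absolute timing difference (ms) between paired note onsets
    that are nominally simultaneous in the score."""
    diffs = [abs(a - b) for a, b in zip(live_onsets, recorded_onsets)]
    return sum(diffs) / len(diffs)

live = [0, 510, 1005, 1498]   # pianist playing along with the recording (ms)
own = [0, 500, 1000, 1500]    # onsets in the pianist's own earlier recording
other = [0, 530, 980, 1540]   # onsets in another pianist's recording

# A lower error when playing with one's own recording is the pattern
# the study predicted and found.
assert mean_asynchrony(live, own) < mean_asynchrony(live, other)
```

A better match between predicted and heard onsets would show up directly as a smaller value of this measure.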
A further study of nonexperts explored whether one can synchronize one's finger taps more accurately to visual events that correspond to one's own writing (Flach, Knoblich, & Prinz, 2004b). In a first session, participants were asked to draw zigzag line patterns with constant or alternating amplitudes on a writing pad. In a second session, participants observed a moving point-light display reproducing their own or somebody else's writing patterns. They were instructed to press a button at the exact moment in time at which the dot changed its direction at corners. In order to perform well in this task, one needs to time one's own action based on a temporal prediction of the next turning point in the trajectory. Initially, the task was not easy for the participants, but they learned to perform well in later parts of the experiment. Timing errors between the time of their tap and the time of the turning points in
the visual movement trajectory decreased. After participants had reached an asymptote in their general task performance, differences in timing error were observed between self- and other-generated trajectories. If the trajectories were irregular (changing amplitudes), this error was lower when participants synchronized with a self-produced trajectory. For regular trajectories no self–other differences in timing error were observed. In other words, participants could better coordinate the timing of their actions with self-generated visual trajectories when they had sufficient practice with the synchronization task and when the production of the trajectories was relatively unconstrained.
Action Perception and Body Sense

So far, the discussion has focused on how the ability to perform certain actions influences how one perceives others (or one's earlier self) performing these actions. The results demonstrate that perception and recognition of others' actions involve a direct, similarity-based matching of perceptual representations of observed actions onto action representations in the observer. Furthermore, the results on prediction and coordination suggest that this match can result in a simulation of the future consequences of the observed action that is based on internal models capturing contingencies between certain movements and the perceptual consequences they produce in the world, given a particular context (cf. Hamilton, Wolpert, & Frith, 2004). However, it is not clear whether peripheral sensation of one's own body is a prerequisite for being able to run such simulations. Do we need continuous input from our tactile and proprioceptive sensors in order to engage in action simulation? In other words, do we need to sense our body in order to fully understand others' actions? This question was addressed in a study of action perception and action understanding in two individuals who live with the extremely rare condition of selective and complete haptic deafferentation due to a sensory neuronopathy (Bosbach, Cole, Prinz, & Knoblich, 2005). These individuals have completely lost their senses of cutaneous touch and proprioception. Thus they do not have any peripheral information from their bodies below the neck (IW; see Cole, 1995) or below the nose (GL; see Cole & Paillard, 1995). Bosbach and colleagues hypothesized that these patients should have
deficits in action understanding if peripheral sensory information about one's body is required for action simulation. They tested this hypothesis using Runeson and Frykholm's (1981; see also Grezes, Frith, & Passingham, 2004) box-lifting task. In one condition the two deafferented individuals and age-matched controls observed videos of healthy actors lifting boxes of different weights. The actors had been informed about the true weight of the box before they lifted it. The task was to estimate the weight of the box lifted by the actor. In a second condition, the deafferented individuals and the controls watched videos of healthy actors who had either been told the true weight of the box to be lifted or had been deceived about it (e.g., they were told the box was heavy when it was actually light). Replicating earlier findings (Cole & Sedgwick, 1992; Fleury et al., 1995), the results showed that, for weight judgments, there were no differences in accuracy between the two deafferented patients and controls. However, whereas controls performed quite well in inferring the observed actor's expectation about the weight (correct or deceived?), the two deafferented individuals were hardly able to perform this task (their judgments were close to chance). In fact, both patients' judgments were clearly less accurate than those of the least accurate person in an age-matched control group of 12. A follow-up study showed that neither IW himself nor control participants were able to derive expectations about weight when they observed IW lifting the box. How can these results be explained? Both the weight task and the expectation task require deriving a hidden state from the kinematics of an observed action (Runeson & Frykholm, 1981). However, deriving the weight of the box does not seem to require simulating the observed action.
In fact, the duration of the lifting phase provided a simple kinematic cue to derive the weight of the box. However, deriving the actor's expectation from the observed action seems to require action simulation that depends on peripheral sensory information about one's body. This seems to be a prerequisite for being able to make use of the more complex kinematic cues that underlie judgments of an actor's expectation (the duration of the lift phase relative to the overall duration of the movement). These results imply that the internal models that are used in action simulation are not fully functional when peripheral information about one's own body is missing.
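The contrast between the two kinematic cues can be written down directly. In this hypothetical sketch (all durations invented, not data from the study), the simple weight cue is the absolute lift duration, while the expectation cue is the lift duration relative to the overall movement duration:

```python
def expectation_cue(lift_ms, total_ms):
    """Complex cue: proportion of the whole movement spent lifting."""
    return lift_ms / total_ms

# Two invented trials with the same absolute lift duration (so the simple
# weight cue is identical) but different overall movement durations:
informed = {"lift_ms": 400, "total_ms": 1600}   # actor knew the true weight
deceived = {"lift_ms": 400, "total_ms": 1000}   # actor was deceived about it

# The simple cue (absolute lift duration) cannot separate the two trials...
assert informed["lift_ms"] == deceived["lift_ms"]
# ...but the relative cue can.
assert expectation_cue(**informed) != expectation_cue(**deceived)
```

On this reading, the deafferented patients could still read off the simple cue, while the relative cue required a simulation they could not run.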
Conclusions

Converging evidence supports the view that the perception of others' actions is constrained and informed by perceivers' body schema and their ability to perform the perceived actions: The same laws that govern performance constrain what is perceived as doable in others. Becoming an expert in a particular action domain sharpens the perception of corresponding actions. Perception of one's previous actions informs self-identification and leads to more accurate predictions. The lack of peripheral sensory information updating one's body schema can result in difficulties with using complex cues in action perception. All of these results suggest close links between perception and action. These links are governed by two functional principles. First, representations of perceptual events that can be caused by one's actions ("common codes") provide a medium in which perception and action are commensurable (Prinz, 1997). This assumption allows one to define similarity relations between perception and action. Second, when common codes are activated, simulation mechanisms in the motor system will predict which events are likely to be perceived next. Such simulations use internal models capturing the contingencies between one's body, one's movements, and the environment. In this way "motor knowledge" is used to provide a context for the perception of future events (cf. Wilson & Knoblich, 2005). Acknowledging the intricate bonds between perception and action does not leave the study of "higher" cognitive processes unaffected (Pecher & Zwaan, 2005). Language researchers discuss how language is grounded in perception and action (Barsalou et al., 2003; Glenberg & Kaschak, 2002; MacWhinney, this volume). Research in social cognition explores how perception–action links serve to align interacting individuals (cf. Sebanz, Bekkering, & Knoblich, 2006).
Thus new research on perception and action will likely lead to a new understanding of how people make sense of the world and of how they interact with their conspecifics.
References

Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111, 1036–1060.
Bangert, M., Parlitz, D., & Altenmueller, E. (1999). An interface for complex auditory–sensorimotor integration: Where the pianist's cortex maps perception to action. NeuroImage, 9, S419.
Barresi, J., & Moore, C. (1996). Intentional relations and social understanding. Behavioral & Brain Sciences, 19, 107–154.
Barsalou, L. W., Simmons, W. K., Barbey, A., & Wilson, C. D. (2003). Grounding conceptual knowledge in modality-specific systems. Trends in Cognitive Sciences, 7, 84–91.
Beardsworth, T., & Buckner, T. (1981). The ability to recognize oneself from a video recording of one's movements without seeing one's body. Bulletin of the Psychonomic Society, 18, 19–22.
Blakemore, S. J., & Decety, J. (2001). From the perception of action to the understanding of intention. Nature Reviews Neuroscience, 2, 561–567.
Bosbach, S., Cole, J., Prinz, W., & Knoblich, G. (2005). Understanding another's expectation from action: The role of peripheral sensation. Nature Neuroscience, 8, 1295–1297.
Calvo-Merino, B., Glaser, D. E., Grezes, J., Passingham, R. E., & Haggard, P. (2005). Action observation and acquired motor skills: An fMRI study with expert dancers. Cerebral Cortex, 15, 1243–1249.
Casile, A., & Giese, M. A. (2005). Critical features for the recognition of biological motion. Journal of Vision, 5, 348–360.
Casile, A., & Giese, M. A. (2006). Non-visual motor learning influences the recognition of biological motion. Current Biology, 16, 69–74.
Clark, A. (1997). Being there: Putting brain, body and world together again. Cambridge, MA: MIT Press.
Cole, J. (1995). Pride and a daily marathon. Cambridge, MA: MIT Press.
Cole, J. D., & Paillard, J. (1995). Living without touch and peripheral information about body position and movement: Studies upon deafferented subjects. In J. Bermúdez, A. Marcel, & N. Eilan (Eds.), The body and the self (pp. 245–266). Cambridge, MA: MIT Press.
Cole, J. D., & Sedgwick, E. M. (1992).
The perceptions of force and of movement in a ma n without large myelinated sensory afferents below the neck. Journal of Physiology, 449, 503–515. Cross, E. S., Hamilton, A., & Gr afton, S. T. (2006). Building a motor simulation de n ovo: Obs ervation of d ance by d ancers. NeuroImage, 31, 1257–1267. Cutting, J. E., & Kozlowski, L. T. (1977). Recognizing friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356. Cutting, J. E ., P roffitt, D. R ., & K ozlowski, L . T. (1978) A b iomechanical invariant for g ait p erception. Journal of E xperimental P sychology: Human Perception and Performance, 4, 357–372.
Bodily and Motor Contributions to Action Perception
Decety, J., & Grezes, J. (1999). Neural mechanisms subserving the perception of human actions. Trends in Cognitive Sciences, 3, 172–178.
Decety, J., & Jeannerod, M. (1995). Mentally simulated movements in virtual reality: Does Fitts' law hold in motor imagery? Behavioral Brain Research, 72, 127–134.
De'Sperati, C., & Viviani, P. (1997). The relationship between curvature and velocity in two-dimensional smooth pursuit eye movements. Journal of Neuroscience, 17, 3932–3945.
Dittrich, W. H., Troscianko, T., Lea, S. E. G., & Morgan, D. (1996). Perception of emotion from dynamic point-light displays represented in dance. Perception, 25, 727–738.
Dokic, J., & Proust, J. (2002). Simulation and knowledge of action. Amsterdam: John Benjamins.
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381–391.
Flach, R., Knoblich, G., & Prinz, W. (2003). Off-line authorship effects in action perception. Brain and Cognition, 53, 503–513.
Flach, R., Knoblich, G., & Prinz, W. (2004a). The two-thirds power law in motion perception: When do motor anticipations come into play? Visual Cognition, 11, 461–481.
Flach, R., Knoblich, G., & Prinz, W. (2004b). Recognizing one's own clapping: The role of temporal cues in self-recognition. Psychological Research, 11, 147–156.
Fleury, M., Bard, C., Teasdale, N., Paillard, J., Cole, J., Lajoie, Y., & Lamarre, Y. (1995). Weight judgement: The discrimination capacity of a deafferented patient. Brain, 118, 101–108.
Flores d'Arcais, J.-P. (1994). Order of strokes: Writing as a cue for retrieval in reading Chinese characters. European Journal of Cognitive Psychology, 6, 337–355.
Fogassi, L., Ferrari, P. F., Gesierich, B., Rozzi, S., Chersi, F., & Rizzolatti, G. (2005). Parietal lobe: From action organization to intention understanding. Science, 308, 662–667.
Funk, M., Shiffrar, M., & Brugger, P. (2005). Hand movement observation by individuals born without hands: Phantom limb experience constrains visual limb perception. Experimental Brain Research, 164, 341–346.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593–609.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin & Review, 9, 558–565.
Goldman, A. (2006). Simulating minds: The philosophy, psychology and neuroscience of mindreading. New York: Oxford University Press.
Greenwald, A. G. (1970). Sensory feedback mechanisms in performance control: With special reference to the ideo-motor mechanism. Psychological Review, 77, 73–99.
Grezes, J., & Decety, J. (2001). Functional anatomy of execution, mental simulation, observation, and verb generation of actions: A meta-analysis. Human Brain Mapping, 12, 1–19.
Grèzes, J., Frith, C. D., & Passingham, R. E. (2004). Inferring false beliefs from the actions of oneself and others: An fMRI study. NeuroImage, 21, 744–750.
Grosjean, M., Shiffrar, M., & Knoblich, G. (2007). Fitts's law holds in action perception. Psychological Science, 18, 95–99.
Grush, R. (2004). The emulation theory of representation: Motor control, imagery, and perception. Behavioral and Brain Sciences, 27, 377–442.
Hamilton, A., Joyce, D. W., Flanagan, J. R., Frith, C. D., & Wolpert, D. M. (2007). Kinematic cues in perceptual weight judgment and their origins in box lifting. Psychological Research, 71, 13–21.
Hamilton, A., Wolpert, D. M., & Frith, C. D. (2004). Your own action influences how you perceive another person's action. Current Biology, 14, 493–498.
Harris, P. (1995). From simulation to folk psychology: The case for development. In M. Davies & T. Stone (Eds.), Folk psychology (pp. 207–231). Oxford: Blackwell.
Haslinger, B., Erhard, P., Altenmueller, E., Schroeder, U., Boecker, H., & Ceballos-Baumann, A. O. (2005). Transmodal sensorimotor networks during action observation in professional pianists. Journal of Cognitive Neuroscience, 17, 282–293.
Haueisen, J., & Knoesche, T. R. (2001). Involuntary motor activity in pianists evoked by music perception. Journal of Cognitive Neuroscience, 13, 786–792.
Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001). The theory of event coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences, 24, 849–937.
Hubbard, T. L. (1995). Environmental invariants in the representation of motion: Implied dynamics and representational momentum, gravity, friction, and centripetal force. Psychonomic Bulletin & Review, 2, 322–338.
Hubbard, T. L. (2005). Representational momentum and related displacements in spatial memory: A review of the findings. Psychonomic Bulletin & Review, 12, 822–851.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
James, W. (1890). The principles of psychology (2 vols.). New York: Holt.
Jeannerod, M. (2001). Neural simulation of action: A unifying mechanism for motor cognition. NeuroImage, 14, S103–S109.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics, 14, 201–211.
Kandel, S., Orliaguet, J.-P., & Boe, L.-J. (2000). Detecting anticipatory events in handwriting movements. Perception, 29, 953–964.
Kandel, S., Orliaguet, J.-P., & Viviani, P. (2000). Perceptual anticipation in handwriting: The role of implicit motor competence. Perception & Psychophysics, 62, 706–716.
Keller, P. E., Knoblich, G., & Repp, B. H. (2007). Pianists duet better when they play with themselves. Consciousness and Cognition, 16, 102–111.
Kerzel, D., Jordan, J. S., & Muesseler, J. (2001). The role of perceptual anticipation in the localization of the final position of a moving target. Journal of Experimental Psychology: Human Perception and Performance, 27, 829–840.
Kieras, D., & Meyer, D. E. (1997). An overview of the EPIC architecture for cognition and performance with application to human-computer interaction. Human-Computer Interaction, 12, 391–438.
Knoblich, G., & Flach, R. (2001). Predicting the effects of actions: Interactions of perception and action. Psychological Science, 12, 467–472.
Knoblich, G., & Flach, R. (2003). Action identity: Evidence from self-recognition, prediction, and coordination. Consciousness and Cognition, 12, 620–632.
Knoblich, G., & Prinz, W. (2001). Recognition of self-generated actions from kinematic displays of drawing. Journal of Experimental Psychology: Human Perception and Performance, 27, 456–465.
Knoblich, G., Seigerschmidt, E., Flach, R., & Prinz, W. (2002). Authorship effects in the prediction of handwriting strokes. Quarterly Journal of Experimental Psychology, 55A, 1027–1046.
Knoblich, G., Thornton, I., Grosjean, M., & Shiffrar, M. (Eds.). (2006). Perception of the human body. New York: Oxford University Press.
Kohler, E., Keysers, C., Umiltà, M., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002). Hearing sounds, understanding actions: Action representation in mirror neurons. Science, 297, 846–848.
Korte, A. (1915). Kinematoskopische Untersuchungen. Zeitschrift für Psychologie, 72, 194–296.
Koski, L., Wohlschläger, A., Bekkering, H., Woods, R. P., Dubeau, M.-C., Mazziotta, J. C., & Iacoboni, M. (2002). Modulation of motor and premotor activity during imitation of target-directed actions. Cerebral Cortex, 12, 847–855.
Liberman, A. M., & Whalen, D. H. (2000). On the relation of speech to language. Trends in Cognitive Sciences, 4, 187–196.
Loula, F., Prasad, S., Harber, K., & Shiffrar, M. (2004). Recognizing people from their movements. Journal of Experimental Psychology: Human Perception & Performance, 31, 210–220.
Merleau-Ponty, M. (1945). Phénoménologie de la perception. Paris: Éditions Gallimard.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Newell, A., Shaw, J. C., & Simon, H. A. (1958). Chess-playing programs and the problem of complexity. IBM Journal of Research and Development, 2, 320–335.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Pecher, D., & Zwaan, R. A. (Eds.). (2005). Grounding cognition: The role of perception and action in memory, language, and thinking. Cambridge, UK: Cambridge University Press.
Piaget, J. (1969). The mechanisms of perception. London: Routledge & Kegan Paul.
Plamondon, R., & Alimi, A. M. (1997). Speed/accuracy trade-offs in target-directed movements. Behavioral and Brain Sciences, 20, 279–349.
Pollick, F. E., Paterson, H., Bruderlin, A., & Sanford, A. J. (2001). Perceiving affect from arm movement. Cognition, 82, B51–B61.
Prinz, W. (1997). Perception and action planning. European Journal of Cognitive Psychology, 9, 129–154.
Prinz, W. (2002). Experimental approaches to imitation. In A. N. Meltzoff & W. Prinz (Eds.), The imitative mind: Development, evolution, and brain bases (pp. 143–162). New York: Cambridge University Press.
Repp, B. H. (1987). The sound of two hands clapping: An exploratory study. Journal of the Acoustical Society of America, 81, 1100–1109.
Repp, B. H., & Knoblich, G. (2004). Perceiving action identity: How pianists recognize their own performances. Psychological Science, 15, 604–609.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Runeson, S., & Frykholm, G. (1983). Kinematic specification of dynamics as an informational basis for person-and-action perception: Expectation, gender recognition, and deceptive intention. Journal of Experimental Psychology: General, 112, 585–615.
Sebanz, N., Bekkering, H., & Knoblich, G. (2006). Joint action: Bodies and minds moving together. Trends in Cognitive Sciences, 10, 70–76.
Sebanz, N., Zisa, J., & Shiffrar, M. (2006). Bluffing bodies: Inferring intentions from observed actions. Journal of Cognitive Neuroscience (Suppl.).
Shiffrar, M., & Freyd, J. (1990). Apparent motion of the human body. Psychological Science, 1, 257–264.
Shiffrar, M., & Freyd, J. (1993). Timing and apparent motion path choice with human body photographs. Psychological Science, 4, 379–384.
Shiffrar, M., & Pinto, J. (2002). The visual analysis of bodily motion. In W. Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance (Vol. 19, pp. 381–399). Oxford: Oxford University Press.
Spivey, M., Grosjean, M., & Knoblich, G. (2005). Continuous attraction toward phonological competitors: Thinking with your hands. Proceedings of the National Academy of Sciences, 102, 10393–10398.
Stevens, J. A., Fonlupt, P., Shiffrar, M., & Decety, J. (2000). New aspects of motion perception: Selective neural encoding of apparent human movements. NeuroReport, 11, 109–115.
Thornton, I. M. (2006). Point light walkers and beyond. In G. Knoblich, I. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out. New York: Oxford University Press.
Thornton, I. M., & Knoblich, G. (2006). Action perception: Seeing the world through a moving body. Current Biology, 16, R27–R29.
Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, L., Fadiga, L., Keysers, C., & Rizzolatti, G. (2001). I know what you are doing: A neurophysiological study. Neuron, 31, 155–165.
Van Sommers, P. (1984). Drawing and cognition: Descriptive and experimental studies of graphic production processes. Cambridge: Cambridge University Press.
Viviani, P. (2002). Motor competence in the perception of dynamic events. In W. Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance (Vol. 19, pp. 406–442). Oxford: Oxford University Press.
Viviani, P., Baud-Bovy, G., & Redolfi, M. (1997). Perceiving and tracking kinesthetic stimuli: Further evidence of motor-perceptual interactions. Journal of Experimental Psychology: Human Perception and Performance, 23, 1232–1252.
Viviani, P., Campadelli, P., & Mounoud, P. (1987). Visuo-manual pursuit tracking of human two-dimensional movements. Journal of Experimental Psychology: Human Perception and Performance, 13, 62–78.
Viviani, P., & Mounoud, P. (1990). Perceptuomotor compatibility in pursuit tracking of two-dimensional movements. Journal of Motor Behavior, 22, 407–443.
Viviani, P., & Stucchi, N. (1989). The effect of movement velocity on form perception: Geometric illusions in dynamic displays. Perception & Psychophysics, 46, 266–274.
Viviani, P., & Stucchi, N. (1992). Biological movements look uniform: Evidence of motor-perceptual interactions. Journal of Experimental Psychology: Human Perception and Performance, 18, 603–623.
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin & Review, 9, 625–636.
Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131, 460–473.
Wolff, W. (1931). Zuordnung individueller Gangmerkmale zur Individual-Charakteristik [Assignment of individual attributes of gait to the individual characteristic]. Beihefte zur Zeitschrift für angewandte Psychologie, 58, 108–122.
Wolpert, D. M., Doya, K., & Kawato, M. (2003). A unifying computational framework for motor control and social interaction. Philosophical Transactions of the Royal Society, 358, 593–602.
Wolpert, D. M., & Kawato, M. (1998). Multiple paired forward and inverse models for motor control. Neural Networks, 11, 1317–1329.
Wray, R. E., & Jones, R. M. (2005). An introduction to Soar as an agent architecture. In R. Sun (Ed.), Cognition and multi-agent interaction: From cognitive modeling to social simulation (pp. 53–78). Cambridge, UK: Cambridge University Press.
3
The Social Dance
On-Line Body Perception in the Context of Others
Catherine L. Reed and Daniel N. McIntosh
Perception of Others Is Inherently Social

Humans have evolved to act in the world—to do things. Further, many of the most important and complicated things we do, from passing the cream for the coffee at the breakfast table to surviving a cocktail party, we do in the context of interacting with others. As social animals, one of the key issues each individual must resolve is how to operate in the context of other humans. Our perceptions of others help us, for example, to work cooperatively with others for successful adaptation to the environment, mate, negotiate social hierarchy, detect cheaters, and enhance our group's status and resources.
Overview

In this chapter, we take an embodied approach to social perception, emphasizing that interacting with other people is much more complicated than interacting with unmoving, inanimate objects.
Human perceptual systems have evolved to capitalize on the functional importance of perceiving others as people and understanding what they are going to do next. We make the case that the key to reducing these perceptual and cognitive complexities is the creation of self-other correspondences between the perceiver and the person viewed, via the use of specialized body processing mechanisms. Specifically, we argue that our own bodies help us organize information from others. We combine research on body perception with research on embodied emotional perception and suggest that these specialized body processing mechanisms may be fundamental perceptual mechanisms on which social-emotional processes are based. Further, disruption of these mechanisms may lead to social-emotional deficits such as those observed in autism.
The Importance of Social Perception

Essential to functioning in a social environment is the accurate perception of others. To do so, people use verbal and nonverbal information, the latter including facial, vocal, and postural behaviors. It is obvious that humans communicate through facial expressions, and the majority of research has focused on the role of faces in social perception and communication. Unfortunately, the very salience of faces has caused researchers to deemphasize a perhaps more primal method of social perception, namely body posture recognition. Without denying the significance of the face, we focus here on the body as an example of more general person perception. Others' body postures give us information about who they are, what they are doing, what they intend to do, and, most significant to us, whether and how we should respond to what they are doing.
There is substantial evidence that body postures have meaning to humans and other animals. Consider the male gorilla display of pounding the chest or similar human displays on the football field. Animals integrate visual, movement, and postural information for the purpose of environmental assessment (especially risk assessment) and the determination of subsequent action (Blanchard, Blanchard, & Hori, 1988). For example, fish use postures to communicate dominance and territory (e.g., Desjardins, Hazelden, Van der Kraak, & Balshine, 2006; Martin & Redka, 2006), as do other animals that do not have the facial musculature to use facial expression (Blanchard,
Blanchard, Rogers, & Weiss, 1990). In addition, animals' neural systems support posture discrimination in other animals. Single-cell extracellular recordings from sheep's temporal cortical neurons indicate preferential neural responses to bipedal human postures that disappear when humans adopt quadrupedal postures, when human body parts alone are presented, or when human faces are presented (Kendrick & Baldwin, 1989). Thus, body posture processing appears to be a basic mechanism in animals. This importance of body posture processing across species suggests that humans may have evolved specialized processing for this function.
Proponents of embodied cognition emphasize that the body acts in the world. Here we argue that embodied cognition must also emphasize that the body acts and reacts in a social world. Indeed, the centrality of the group and of collective action to humans suggests that effective social functioning is fundamental to individual and species success. Thus, the existence and influence of specialized body-based processes and representations on perception have even greater relevance if they are used in the context of perceiving others. These processes are most relevant to social interaction when they occur in real time, so that we know what others are currently doing and feeling—and so we know how to respond appropriately. The importance of fast and accurate body posture recognition makes it likely that people possess specialized processes for coordinating perceptions of ourselves and others. Thus, social perception emphasizes the need for body-specific representations and processes to organize information from the self and others in the service of social action.
The Added Complexities of Social Perception for Embodied Cognition

For over a century, researchers have acknowledged that our bodies play a critical role in our cognitive development, functional capabilities, and emotions (Darwin, 1889; Gibson, 1977; James, 1884; Piaget, 1954). Theories of embodied cognition emphasize the importance of sensorimotor experience and the interaction of the body in the world (cf. Harnad, 1990). Human cognition, including social-emotional perception, has sensorimotor roots. Humans' current cognitive capacities evolved from those of our primal ancestors, whose neural resources were dedicated primarily to motor and perceptual
processing because immediate, on-line interactions with the environment were most relevant for survival (Wilson, 2002). Developmentally, as human infants learn to control their own movements and perform certain actions, they develop an understanding of their own basic perceptual and motor-based abilities, which provide the fundamental bases for acquiring more complex cognitive processes (Piaget, 1954; Thelen, 1995; Thelen & Smith, 1994). According to this view, cognition depends on the kinds of experiences that come from having a body with particular perceptual and motor capacities that are inseparably linked. Such bodily experiences in the form of affordances, knowledge, and goals combine to create the basis for memory, emotion, language, and all other aspects of life (Glenberg, 1997). These previous sensorimotor experiences are reinstantiated and simulated to aid and constrain abstract thought (Barsalou, 2002, 2003; Glenberg, 1997; Glenberg & Kaschak, 2002).
Theories of embodied cognition also emphasize the importance of the context in which actions take place. Nonetheless, it is important to consider the goal of the activity and the objects at which bodily actions are directed (Moore, 2006). If an organism actively constructs a sensorimotor representation based on a set of environmental features relevant to the action it is performing, that same environmental space is perceived differently if the goal of the action changes, thereby changing the relative salience of different features. For instance, a chair would be perceived differently if one's goal was to put books on the chair or to sit in it.
However, when bodily actions are considered within the context of other people, the agency and the intentionality of the other person necessarily change processing by adding self-other relationships.
Planning actions to affect a chair for sitting is fundamentally less complex than planning actions to affect a chair of a department. Others' intentions and ability to perform their own actions increase the processing loads and constraints relevant to our own actions in the environment. It is not merely that an object moves. It is the agency of another person that increases the number of potential environmental changes exponentially and necessitates efficient, on-line, real-time processing. Not only do social interactions require one to assess the current state of the other person, they also require one to predict what the other person might do and plan how to respond. Moreover, the social object of perception changes in response to each of our actions, both adding information as to their intent and necessitating
changes in our own reactions. The timing of these social actions and reactions is essential for the seamless, fluid communication that is observed in typical interactions between people. The apparent ease of this complex interpersonal dance makes it almost invisible, until one sees the functional consequences for those individuals who are unable to automatically join the dance. Indeed, it appears to be this kind of real-time prediction-action-reaction "dance" that individuals with social deficits such as autism are unable to do.
Thus, successful social interaction requires an enormous amount of information about the intentional relations of the self and other to be processed. This is a significant challenge for the comprehension of the nature of human social understanding and theory of mind (Barresi & Moore, 1996). Our cognitive systems must have developed an efficient way for self-other correspondence to be constructed. One way to establish the commonalities between the self and other may be through processes attuned to human body structure and biomechanics (Wilson & Knoblich, 2005; Wilson, 2001, 2005). In this chapter, we examine how body-specific representations and processes organize information from the self and others and how they play a significant role in social-emotional perception.

Body Perception: Creating Self-Other Correspondences

The prioritization of perceptual processing associated with conspecifics is clearly evident in humans. Other humans capture attention and elicit complex behaviors (Downing, Bray, Rogers, & Childs, 2004; Ro, Russell, & Lavie, 2001).
Slaughter and Heron (2004) have summarized research documenting that from early on, infants react to humans differently from the way that they react to other multifaceted, attention-grabbing stimuli: infants move and vocalize differently to people and objects (Legerstee, 1991, 1994; Legerstee, Pomerleau, Malcuit, & Feider, 1987; Trevarthen, 1979, 1993); infants favor human faces and voices relative to other stimuli (Fantz, 1963; Johnson & Morton, 1991; Kuhl, 1987); infants are able to recognize and identify specific people before they can distinguish specific nonhuman objects (Bonatti, Frot, Zangl, & Mehler, 2002; Field, Cohen, Garcia, & Greenberg, 1984); and infants categorize humans and other objects differentially (Quinn & Eimas, 1998). Empirical evidence supports the idea that infants treat other humans as "special."
Thus, developing perceptual systems appear to be primed for "like my species" information.
Much of social perception requires an assessment of how much "like me" another person may be. Although person perception is more complex than object perception, we also are equipped with unique templates to use in this task. Unlike the majority of objects that we encounter in our lives, visually perceived body postures, actions, and facial expressions of other people can be mapped onto and reproduced by our own body and face. We can identify other humans, as humans, because they possess both a human face and a human body that are similar to ours and that can make similar expressions, postures, and movements. Research in body perception, in particular, emphasizes the importance of self-other correspondences in the perception of others.
Structural similarity between the self and other provides information regarding commonalities in spatial layout. It also permits inferences regarding the body's biomechanics, with which one can determine whether one's own body could perform similar movements (Rizzolatti & Craighero, 2004; Wilson, 2001). Recent evidence suggests that the brain provides a special status to perceptual stimuli that correspond to one's own body (Blakemore, 2006; Downing, Jiang, Shuman, & Kanwisher, 2001; Grossman, 2006; Saxe, 2006). The similarities between another person's body and our own permit multimodal inputs from both bodies to be represented in a common representation.

Specialized Body Representations

At the core of self-other mapping, there must be a representation of the body that contains the basic spatial layout and biomechanics of the human body to help organize perceptual inputs of other bodies and objects.
Developmental, behavioral, and neuropsychological studies have all provided evidence for a long-term spatial representation specific to the body.1 This representation of the body specifies the relations among the body parts. It is spatially organized, supramodal, and used for representing other bodies as well as one's own (Buxbaum & Coslett, 2001; Gallagher, 2005; Reed, 2002; Reed & Farah, 1995; Schwoebel, Buxbaum, & Coslett, 2004). The adult neuropsychological literature provides additional evidence for a long-term
spatial representation of the body that is distinct from other object representations. Patients with autotopagnosia cannot locate body parts on themselves or others despite demonstrating knowledge of bodies, naming of body parts, and relatively intact spatial abilities (DeRenzi & Scotti, 1970; Ogden, 1985). Deficits in spatial body representations are also not limited to the visual modality in some patients. One patient was unable to locate his body parts by touch or by vision (Ogden, 1985).
Developmental researchers have investigated what aspects of human faces and bodies allow infants to distinguish them from other objects. For faces, the relative location of facial features appears to be important. Infants within the first few months show a strong preference for stimuli that resemble human faces over comparably complex, high-contrast patterns (see Maurer, 1985 for a review). Young infants prefer to look at typical faces with the features in their canonical positions relative to scrambled faces with the features in noncanonical positions (Johnson & Morton, 1991). Given that preferences for human facial patterns exist at birth, it seems likely that the newborn visual system is somehow tuned, either by an innate template or a perceptual bias for high-contrast top-bounded patterns, to recognize, track, and fixate on facelike patterns (Johnson, 1997; Johnson & Morton, 1991; Turati, Simion, Milani, & Umiltà, 2002; Valenza, Simion, Macchi Cassia, & Umiltà, 1996).
A sensitivity to human bodies also appears early in development. Some evidence from phantom limb patients points to the existence of a primitive spatial body representation that may be hard-wired in the brain (but see Price, 2002 for a discussion).
Despite a lack of sensory input and experience using limbs, individuals with aplasia (i.e., born without limbs) often experience phantom limb sensations (Weinstein & Sersen, 1961) and perceive movements of other people's limbs no differently from typically developing individuals (Funk, Shiffrar, & Brugger, 2005). In typically developing children, a sensitivity to the configuration of body parts appears sometime after the first year of life. Slaughter and colleagues (Slaughter & Heron, 2004; Slaughter, Heron, & Sim, 2002) showed typical and scrambled human body and face images to infants between the ages of 12 and 18 months. Based on looking preference measures, infants younger than 18 months of age clearly distinguished between typical and scrambled images of faces but not bodies. However, by 18 months of age infants could discriminate typical from scrambled bodies. Although infants'
perceptual expectations about typical human faces develop earlier than those about human bodies, it is clear that a human body template is shared among infants.
During the first year of life, infants also acquire a sensitivity, or a template, for the biomechanics of how humans move and the organization of the human body (Fox & McDaniel, 1982; see Pinto, 2005 for a review). Three-month-old infants prefer point-light walker displays of upright walking humans relative to inverted walking humans (Fox & McDaniel, 1982). By 5 months of age they are able to distinguish global form as long as it provides a context in which the features are salient. Pinto (2005) argues that infants' visual sensitivity to the structure of the human body corresponds with the infants' acquisition of a basic set of motor skills during the first 18 months.
Thus, sensitivity for human bodies combines visual perception and motor production. Researchers have proposed that infants use a supramodal body scheme to integrate and process stimuli across sensory modalities, and specifically across vision and proprioception. In a number of experiments, newborns demonstrate an ability to copy gestures including mouth opening, tongue protrusion, and lip protrusion when adults model those gestures (e.g., Abravanel & DeYong, 1991; Meltzoff & Moore, 1977, 1983, 1989, 1992, 1995, 1997). One study also reports that neonates can imitate sequential finger movements (Meltzoff & Moore, 1977). The ability of infants to view another person's movements and reproduce them with their own bodies indicates that infants have a representational system of the body that links the actions of the self (proprioception) to the actions of another (vision) via supramodal or cross-modal integration (Meltzoff & Moore, 1995).
Using One’s Own Body to Organize Information From Others

Across the life-span, the activation of a multimodal spatial body representation contributes to the development of self-other relationships. Beyond early development, though, how do one’s own body actions contribute to the understanding of others? To address the question of whether we use our own body representations to interpret the actions of others, researchers have examined how our own postures and actions change our visual perception of other people’s postures and actions. For example, in Reed and Farah (1995), participants were asked to view and compare two sequentially presented postures of another person. Participants were also asked to move their limbs while performing the comparative task. Performance in the posture memory task depended on participant movement. If participants moved their arms, their memory for other people’s arm postures was selectively improved. When participants moved their legs, their memory for other people’s leg postures selectively improved. Importantly, the interaction between body movement and memory was specific to the body. When the primary memory task was changed to remembering the positions of upper and lower regions of an abstract object, movement had no effect on memory. Further, these facilitatory effects could not be attributed to imitation. When participants matched one part of their body to the remembered position and moved the other part, body part memory selectively improved for the moving parts, not the imitating parts. A follow-up study (Reed & McGoldrick, 2007) confirmed that participant movement was also critical: no effect was found if participants watched another person move. Thus, it appears that the same body representation is used to encode the body positions of the self and others. Our own bodies selectively affect how we perceive the bodies of others.

Specialized Body Processing

The fact that what we do with our own bodies influences the perceptions of other people’s bodies suggests that there is something distinctive about the way we view the human body. Do humans have specialized processing mechanisms for recognizing the body postures of others? Previous research has compared the perception of the human body to inanimate objects (Heptulla-Chatterjee, Freyd, & Shiffrar, 1996; Shiffrar, 2006; Shiffrar & Freyd, 1990, 1993).
However, human bodies differ from inanimate objects along a number of dimensions, including shape, rigidness, movement characteristics, and intentionality (Shiffrar, 2006). Animate objects may be a much better comparison group, and human faces may be the best (Slaughter, Stone, & Reed, 2004). Humans can configure their faces and body postures into positions that convey emotions, intentions, and other meaningful information. Given that faces and bodies are usually attached, there are good reasons to believe that there are similarities in face and body perception. Both are important social stimuli for which recognition has evolutionary importance. Both are viewed extensively from birth (Slaughter & Heron, 2004). Both are identified at subordinate categorical levels (Tanaka & Gauthier, 1997).

Reed and colleagues (McGoldrick & Reed, 2007; Reed, Stone, Grubb, & McGoldrick, 2006a; Reed, Stone, & McGoldrick, 2006b; Reed, Stone, Bozova, & Tanaka, 2003) have investigated similarities and differences in the way that bodies and faces are perceived. Many researchers have argued that the perception of faces is “special” because, unlike that of other objects, it depends on configural processing (e.g., Maurer, Le Grand, & Mondloch, 2002). In other words, one cannot distinguish two typical faces based on the presence or absence of their parts. Instead, one needs precise metric information about the shape of the features and the distances among them. A similar argument can be made for distinguishing between two body postures. Researchers argue that configural processing develops for fast and accurate fine-level distinctions within an object class (Tanaka & Gauthier, 1997). As a result, Reed and colleagues investigated whether configural processing was also used for body postures.

To think about specialized processes used for human body perception, it helps to consider the configural processing continuum (e.g., Reed et al., 2006a, b). At one end of the continuum, objects (e.g., houses) are recognized on the basis of their parts, largely independent of where the parts are located or in what orientation the object is situated. At the other end of the continuum, objects (e.g., faces) are recognized holistically via templates in which parts are not explicitly represented (Tanaka & Farah, 1993). Thus, faces have been proposed to be recognized by a different perceptual mechanism than most other objects.
The recognition of faces requires the relative spatial relationships among the features as well as their individual shapes and metric distances from each other. Faces also differ from other objects in that their recognition is highly sensitive to orientation: it is much more difficult to recognize faces upside down than upright. The question is whether human body postures are more like houses or more like faces.

One of the most widely used empirical indicators of configural processing is the face inversion effect (Yin, 1969): the recognition of upright faces is faster and more accurate than that of inverted faces. Inversion disrupts the spatial relations among features. To investigate whether configural processing was used for body posture recognition, Reed and colleagues (Reed et al., 2003; Reed et al., 2006a) used a typical inversion paradigm (see Figure 3.1). Participants viewed two
[Figure 3.1 appears here: two example trials, one upright (top) and one inverted (bottom). Each trial: fixation cross (250 ms), picture #1 (250 ms), blank screen (1000 ms), then picture #2 with a same/different judgment.]
Figure 3.1 A typical body inversion paradigm. The viewer is presented with two stimuli presented sequentially and determines whether the two stimuli are the same or different postures. Stimuli are presented in either upright (top) or inverted (bottom) positions. The correct response to both example trials is “different.”
sequentially presented stimuli in the same orientation (i.e., either both upright or both inverted); they determined whether the two stimuli were the same or different. When the recognition of abstract (i.e., meaningless) body postures was compared to that of houses, strong inversion effects were found for bodies but not houses. To determine whether body postures were more like faces, faces and body postures were evaluated. Comparable, strong inversion effects were found for both faces and body postures (Reed, Stone et al., 2003). These results suggest that both faces and bodies are processed configurally. Additional evidence for the configural processing of bodies was provided by Stekelenburg and de Gelder’s (2004) EEG study, which demonstrated that the N170 component, a component that has been associated with the configural processing of faces, was relatively enhanced and delayed for inverted bodies compared to upright bodies.

Once it was determined that body postures were recognized by configural mechanisms, Reed and colleagues used this information to determine what body information defined a human body. To find out what body information was necessary for the visual system to treat the body as a body, they examined which body stimulus manipulations led to the breakdown of configural processing and loss of the inversion effect (Reed et al., 2006a). After replicating the body inversion effect for whole bodies, they then eliminated and manipulated parts of the body stimulus. When half bodies were tested in which bodies were divided along the vertical axis, separating left from right, the inversion effect was maintained. It appears that the visual system was able to reconstruct the body from the long-term body representation because the body is largely symmetrical along its vertical axis. Then half bodies were tested in which the bodies were divided at the waist, along the horizontal axis. It was possible that the body inversion effect derived largely from head, trunk, and arm positions because that portion of the body provides the greatest amount of variation in posture. However, the inversion effect was lost for both upper and lower portions of the body. When examining body parts such as the arms, for which local configural information could be used, no inversion effects were found. Of interest, however, was that both upright and inverted body parts were highly discriminable. Last, when body parts were scrambled within the context of the body (i.e., arms put in the head position, legs in arms, etc.), no inversion effect was found and discriminability among postures was exceptionally low. When limbs are placed on the trunk in locations where they are not supposed to be, not only does the visual system not treat the stimulus as a body, but its ability to discriminate also plummets. When the spatial locations of limbs on the trunk violate the spatial body representation, body posture perception is impaired.

Thus, the way we view the human body is distinct from the way we view most other objects. Like faces, bodies show inversion effects, but houses do not. This suggests that the configuration of body parts is important for recognition. Until now, faces were the primary class of objects to consistently produce a significant inversion effect for the untrained, average viewer.
These findings suggest that a different mechanism is used to recognize human body postures than is used for inanimate objects such as houses. Last, the body appears to be defined by its hierarchical structure. Configural processing appears to interact with the spatial body representation for posture recognition.
Sources of Specialized Body Processing

Where does this specialized processing of the human body come from? Some researchers have claimed that configural processing is really the result of expertise: because we interact with other humans via their faces and body postures, we have a great deal of expertise in recognizing other people. We are experts in face and body posture recognition. Is it possible that expertise from having a body can influence body perception? By investigating body perception, two possible sources of expertise can be examined: visual expertise and embodiment expertise (Reed, Nyberg, & Grubb, 2007).

Visual expertise suggests that sufficient experience viewing a particular class of objects can lead to a shift from parts-based recognition to configural recognition (e.g., Tanaka & Gauthier, 1997). For example, Diamond and Carey (1986) found the inversion effect in dog show judges to be much larger for breeds in their domain of expertise as compared to other breeds. Gauthier and Tarr (1997) showed that trained individuals could exhibit configural processing for the recognition of “greebles,” a novel object class that untrained individuals recognized by their parts. Another source of expertise, embodiment, comes from having a body and knowing how to use it. In other words, motor, proprioceptive, and kinesthetic experiences associated with living in a body and knowing how it works can influence our visual perception.

To investigate these two types of expertise, Reed and colleagues (Reed et al., 2007) had participants discriminate common and rare postures of people and dogs using a typical inversion paradigm. These participants were average viewers with no exceptional expertise in dog training or dog show judging. Dogs were selected as comparison stimuli because they are animate and are among the most frequently seen animals in our everyday lives. Our intent was to equate viewing frequency with humans as much as possible. Based on pretesting that determined typicality ratings for the most humanlike and most doglike postures, the highest-rated postures for humans and dogs were used.
The postures that were most typical for humans were also rated least typical for dogs, and vice versa. Rare stimuli were created by placing humans in the dog postures and dogs in the human postures.

This crossing of animal by viewing frequency led to different predictions for the two types of expertise. For visual expertise, greater configural processing was expected for humans than for dogs because humans see more humans. Also, effects of frequency were expected for both humans and dogs, in that there should be greater inversion effects for common postures because common postures are viewed more frequently than rare ones. For embodied expertise, greater configural processing was expected for humans than for dogs, but for a different reason, namely that humans have more experience interacting in a human body. Further, effects of viewing frequency were expected only for dogs because humans know how to get into all biomechanically possible postures. For dog postures, the close mapping between what humans do and the rare dog stimuli could permit embodiment expertise to be applied to the recognition of those nonhuman stimuli. The only postures in this group that humans could not map their own bodies onto were common dog postures.

Results confirmed greater inversion effects for human compared to dog stimuli. Viewing frequency did not influence the robust inversion effects found for human postures: for accuracy data, comparable inversion effects were found for both common and rare human postures. However, viewing frequency did influence inversion effects for dog postures. Only rare dog postures produced an inversion effect. These rare postures of dogs in human positions depicted postures for which the mapping of the observer’s body was easily made to the dog’s body. This pattern of results supports the idea that embodiment expertise contributes to visual body processing. Thus, both embodiment expertise and visual expertise contribute to the configural processing of bodies. This conclusion is consistent with data from biological motion studies in which observers most accurately recognize movements from their own bodies, and are better at recognizing the movements of friends than strangers (Loula, Prasad, Harber, & Shiffrar, 2005). It is also consistent with other findings demonstrating graded body-specific perceptual effects as the perceived stimulus grows increasingly different from the human body configuration (Cohen, 2002).
Configural processing mechanisms used for recognition in our own domains of expertise can be co-opted when people use their own body representations to interpret the visual world.
Summary

Humans are embodied and optimized for processing “like me” information. Perceptual processes integrate information about the observer’s own body and others’ bodies. Processing changes if stimulus features resemble information in the observer’s own body representation. Specialized body representations and mechanisms permit processing efficiencies that help organize information from other people’s bodies.

What good are these mechanisms? The significance of interpersonal action suggests that a key function of these mechanisms is social perception. People’s perceptual goals are functionally based around what other people are currently doing; knowing what another is about to do increases the chances that one’s own actions will be appropriate. Because emotions involve response-coordination packages (Atkinson & Adolph, 2005; Panksepp, 1998), knowing the emotional state of others is a critical and effective way to understand and predict their actions. For example, after returning a midterm exam, faculty do well to distinguish an approaching angry student from a sad or frightened one, and to make this distinction quickly, without waiting for verbal signals. The general importance of such situations suggests that social-emotional perception is an obvious place to look for the role of embodied processes. Thus, we next discuss how these specialized body-perception processes influence emotional perception.

Emotional Body Perception

When on the subway, dating, or giving a job talk, two of the most important things to know about relevant others are whether they are feeling positively or negatively, and what they are likely to do next. Moreover, a group that can quickly and accurately transmit emotional information among its members has a collective survival advantage. For example, a prairie dog will express fear by drawing attention to itself and sounding an alarm, thus allowing the others in its prairie dog community to dive underground. Given the adaptive importance of knowing others’ emotional states, we focus here on how the aforementioned body-specific mechanisms can facilitate the processing of interpersonal emotional stimuli.
This functional approach to social perception provides an explanation for why we might have specialized representations, processes, and neural networks that help us determine what we should do next. Theory of mind, emotional contagion, and social-emotional perception all contribute to telling us what others are going to do and how we should respond to their actions.
Perceiving and interpreting other people’s emotional states is critical for effective social interaction. Emotions are widely regarded as evolutionary adaptations. They evoke behaviors that improve an animal’s chances for survival and procreation in that they enable animals to cope with threats and opportunities presented by their physical and social environments (Atkinson & Adolph, 2005; Lazarus, 1991). The production and perception of emotional signals (Darwin, 1872/1998) may have evolved as response-coordination packages to meet particular environmental challenges such as avoiding physical harm (fear) and contaminants (disgust) (Atkinson & Adolph, 2005; Panksepp, 1998). In social interactions, the necessity of processing speed and the generation of specific behaviors makes specialized cognitive and neural systems beneficial for the processing of certain socially and emotionally relevant information. Given that the perception of intentional agents adds significantly to perceptual processing loads, it would make sense to co-opt specialized body representations and processes in the service of social perception, especially when facial expressions and body postures provide relevant information.

Understanding Others through One’s Own Facial Expressions and Body Postures

Understanding the actions, intentions, and emotions of others is vital for social functioning. There are undoubtedly multiple processes that underlie such knowledge of others; nonetheless, facial expressions and body postures appear to be basic sources of insight into the emotions and intentions of others (Hatfield, Cacioppo, & Rapson, 1994). People must often interpret the meaning of a social interaction from the subtle nuances of a facial expression or body gesture. We often acquire information about the internal emotional states of others from the slump of the shoulders or the tilt of a head.
These are subtle facial and body posture cues that most people perceive quite naturally, without effort. The ease with which we are able to understand these cues and match our own bodies to others displaying them suggests that we have specialized processes supporting the correspondence between perceived emotions and bodily configurations in ourselves and others. As described in the previous section, knowledge of one’s own body may assist in this process by providing a cross-modal template upon which one can come to understand others (Meltzoff & Moore, 1995). Further, our own actions can influence how readily we are able to perceive another body configuration (Reed & Farah, 1995; Reed & McGoldrick, 2007).

The recognition of another person’s emotion is undeniably an important factor in the perception of others. A large body of research has documented that facial expressions can unambiguously convey certain emotions such as happiness, sadness, fear, anger, and disgust. More recently, body postures have been shown to unambiguously convey emotions such as pride (Tracy & Robins, 2004), and touch has been shown to unambiguously convey emotions such as sympathy or love (Hertenstein, Keltner, App, Bulleit, & Jaskolka, 2006). It is also important to be able to tell the subtle differences in body posture between emotional and nonemotional actions. Recognizing whether someone is reaching over their head to strike you or to scratch behind an ear could be critical for survival. Discerning these subtle nuances of body position or action is something that people tend to do well, even from static body postures or with degraded information. For example, people can easily distinguish the emotional tone of movements in point-light displays (Atkinson, Dittrich, Gemmel, & Young, 2004).

Correspondences between one’s own body and the body of another are likely to play a role in our understanding of their emotional state. There appear to be rapid, automatic processes that lead to matching the facial expressions, vocal tones, postures, and movements of others (Hatfield et al., 1994; Moody & McIntosh, 2006). This mirroring or mimicry of others may provide a means by which we use our own body and face to gain information about another person.
Although, as evident above, much of the research on self-other correspondences has focused on the body, the majority of research on mimicry has focused on facial expressions. Despite the relative emphasis on the face, the processes described below are likely to be the same for the body; work that includes the body suggests that this assumed similarity is accurate.

A number of studies together diagram a process by which mimicry can influence our knowledge of others’ emotions. Mimicry is theorized to lead to emotional contagion, when one person’s emotion generates a similar emotion in an observer (Hatfield et al., 1994; McIntosh, 2006; McIntosh, Druckman, & Zajonc, 1994). The matched facial or bodily position or movement may cause internal feedback that can create a change in felt emotional experience: if you smile, you feel happier, and if you slump, you feel sadder (Duclos, Laird, Schneider, Sexter, Stern, & Van Lighten, 1989; McIntosh, 1996; Riskind, 1984). Because facial expressions can initiate or modify emotional feelings in a person (see reviews of the facial feedback hypothesis by Adelman & Zajonc, 1989; McIntosh, 1996), a mimetic assumption of another’s emotional facial expression, posture, or vocal tone may in turn cause the observer to feel what the observed person is feeling.

Facial postures can induce emotional states even when the manipulation is not an emotional one. For example, after participants repeatedly utter sounds that place the facial muscles in a scowl (“ü”) or a smile (“ee”), their mood changes to match the analogous expression (McIntosh, Zajonc, Vig, & Emerick, 1997). In addition, when participants held pens in their teeth, creating a partial smile, or in their lips, creating a partial frown, the facial postures induced the corresponding emotional states within the participants (Berkowitz & Troccoli, 1990; Ohira & Kurono, 1993; Larsen, Kasimatis, & Frey, 1992; Strack, Stepper, & Martin, 1988). Furthermore, the facial postures also influenced participants’ emotional assessments of visual stimuli. For example, cartoons were judged to be more amusing when the pen was held in the teeth than in the lips.

That the effect of facial movement on affective experience is relevant in true interpersonal situations has been demonstrated by McIntosh (2006). Observers watching others’ natural, spontaneous emotional responses to positive and negative videos mimicked their facial expressions, and the observers’ own emotions changed in association with the models’ emotions.
That people mimic live, dynamic facial expressions supports the naturalistic importance of mimicry. Further, this process appears directly relevant to social understanding, as mimicry influences the perception and interpretation of others’ facial expressions of emotion (Niedenthal, Brauer, Halberstadt, & Innes-Ker, 2001).

Building on the research demonstrating the importance of facial action in emotion, researchers have shown that body postures also influence one’s subjective experience of emotion (Duclos et al., 1989; Riskind, 1984; Stepper & Strack, 1993). Individuals induced to assume postures characteristic of certain emotions reported feelings that corresponded to those postures (Duclos et al., 1989). For example, individuals who slumped tended to feel sad, while those who sat forward with a clenched fist tended to feel anger (Duclos et al., 1989). One’s own emotional state may then allow for a greater understanding of the other person via a shared emotional experience. Part of this shared emotional experience appears to come from self-other correspondences facilitated by body-specific processing.

Further, the emotional quality of body postures may alter the perceived relationship between oneself and others even when the task has nothing to do with emotional assessment. Wilbarger, Reed, and McIntosh (2007) investigated whether the perception of another’s posture was influenced by an interaction between the emotional quality of one’s own posture and a model’s posture. Participants assumed a posture per the experimenter’s instructions, maintained the posture for several seconds, and then resumed a neutral position (i.e., legs together, arms at side). They then viewed a model in a posture and determined whether the model’s posture differed from the one they had just assumed. Postures were affectively positive, affectively negative, abstract/meaningless, or meaningful but not emotional. Results indicated a clear difference between emotional and nonemotional postures, whether they were meaningful or not. When participants’ own postures matched the viewed postures, participants were selectively impaired in the speed and accuracy with which they recognized emotional postures.

One explanation for this pattern of results is that the assumption of emotional body postures automatically activates tightly defined emotional categories and states specific to that person.
The specificity of the observer’s own emotional bodily experience may create a more specific criterion for judging the viewed posture of another as the “same.” Thus, self-other correspondences not only aid social perception by emphasizing similarities between the self and others, but also help to distinguish fine-level differences, which helps to maintain one’s own emotional state within a social context.

Deficits in Understanding Others: Autism

We have argued that social perception is facilitated by the perceiver matching the movements, postures, and facial expressions of others. Support for the importance of self-other correspondences can be found by examining the effects of breakdowns in this process. Below we demonstrate, through deficits in social perception for people with autism, that these specialized face and body processing mechanisms are essential components of typical social-emotional functioning. In other words, autism provides a window into these mechanisms because individuals with autism produce outcomes that would be expected if these mechanisms were disrupted.

Some people appear not to perceive or respond typically to the social milieu. Deficits in social perception are associated with difficulties in social adjustment. In fact, a hallmark of autism is an apparent lack of awareness and perception of other people’s emotions and social contexts. In their review of the autism literature, Volkmar, Chawarska, and Klin (2005) conclude that the research points to the existence of fundamental deficits in the earliest social processes in autism, with disturbances of affective contact that then influence many other areas of development. We believe that this impaired social perception may, in part, be a result of deficits in basic face and body processing mechanisms that create self-other correspondences. If physically matching the emotional facial or postural position of another is critical to understanding their emotions, then one would predict that those who do not or cannot match others would have difficulty in emotional perception. Indeed, this is the pattern we see among people with autism.

Recent evidence shows that people with autism do not quickly and automatically mimic emotional expressions. Typically developing people automatically, unintentionally, and quickly match the movements of an observed model (Dimberg, 1982; Dimberg, Thunberg, & Elmehed, 2000).
When adolescents and adults with autism were shown emotional facial expressions, however, electromyography showed that they did not display the mimicry seen in a comparison group matched on age, gender, and verbal intelligence (McIntosh, Reichmann-Decker, Winkielman, & Wilbarger, 2006). Neither group had difficulties matching facial expressions of others when asked to do so. These data indicate that people with autism have a specific deficit in the automatic matching of emotional facial expressions. Such a deficit would be predicted to impair the experience of self-other correspondences, alter their perception of others, and impair understanding of others’ emotional states.

Just as predicted by an embodied approach to social-emotional perception, people with autism do not perceive others in a typical manner. Most generally, autism has been associated with atypical face and configural processing, as indicated by the lack of a face inversion effect (i.e., the typical advantage for recognizing upright over inverted faces) (Dawson, Webb, & McPartland, 2005; Hobson, Ouston, & Lee, 1988; Langdell, 1978). Significantly, this impairment is not limited to facial perception but is also evident in body perception. Reed, Beall, Stone, Kopelioff, Pulham, and Hepburn (2007) found that high-functioning individuals with autism were insensitive to other people’s body postures. They used an inversion paradigm that compared the recognition of upright and inverted faces, body postures, and houses. In contrast to typically developing adults, who demonstrated inversion effects for both faces and body postures, adults with autism demonstrated only a face inversion effect. Because these adults were high-functioning and had all participated in social skills classes that emphasized face awareness, they may have acquired face recognition expertise, albeit atypically, that could be used for configural face processing. However, this face expertise was not used for body posture processing.

This difficulty in perceiving others has consequences for social-emotional perception in people with autism. Consistent with the above findings, they appear to focus on individual facial features rather than configurations when perceiving emotional expressions. Rutherford and McIntosh (2007) found that individuals with autism use facial features in a rule-based approach to emotional perception, rather than the template-based strategy used by typically developing people. Participants were shown pairs of stylized color face images of anger, disgust, happiness, fear, sadness, and surprise. They were asked to indicate which of two faces expressing a single emotion that varied in intensity was the more realistic depiction of the specified emotion.
For all but surprise, those with autism were more likely to accept the most exaggerated images as most realistic. These extremely exaggerated faces were created by quadrupling the average displacements of six facial features for typical expressions of these emotions, based on pretesting and norming. Because rule-based strategies are more tolerant of extreme stimuli than are template-based ones, this is the pattern that would be expected if people with autism do not automatically learn to read the emotions of others, but instead have learned explicit rules to decipher others’ emotions. We believe that the absence of the self as a referent may require the use of explicit rules in judging expressions, and thus make perception of others’ emotional states less automatic and less
connected to one’s own experiences, leading to errors or inefficiencies in such perception. This rule-based approach is likely to be slower than the configural, template-based processes we argue are critical for smooth social interactions. Evidence for the predicted impairment in quickly extracting emotional information from others is provided by Fazendeiro, Winkielman, and McIntosh (2007). When stimuli were presented very quickly, people with autism were specifically impaired in the ability to accurately determine the emotional valence of facial expressions, but were similar to typically developing and reading-disabled controls in their ability to extract major nonaffective features (e.g., gender) from faces or to distinguish between nonaffective and nonfacial stimuli. These data support the notion that individuals with autism have a specific impairment in early extraction of emotion—just what would be predicted if automatic processes are impaired in this group. Because of the importance of quickly perceiving others’ emotions during dynamic social encounters, our line of thought would predict that people with autism should have difficulty in perceiving emotion from movement. Indeed, they appear to be impaired in the assessment of dynamic emotional cues from bodily action. Hobson and colleagues (Hobson, 1995; Moore, Hobson, & Lee, 1997) found that typical, as well as developmentally delayed, adults, adolescents, and children could accurately describe the emotion conveyed in point-light displays, but individuals with autism who demonstrate deficits in emotion processing had more difficulty with the task (Ozonoff, Pennington, & Rogers, 1990). Thus, unlike typically developing individuals for whom specialized face and body processing develops naturally, individuals with autism appear to lack the predisposition for these specialized processing mechanisms.
As a result, they are missing a fundamental component of social perception. Most strikingly, this deficit appears to affect their ability to engage in real-time social interactions that require the precise timing of responses to others’ social cues.
Conclusions and Future Directions

The introduction of specialized body processing mechanisms as an essential part of emotional processing adds to our understanding of
social perception. The idea that emotional perception is embodied is not new. However, what is new is our emphasis on the fundamental perceptual processes from which embodied emotional processes derive. Social psychologists are embracing the tenets of embodied cognition as a means to explain how we understand the emotions of others. They emphasize how the reinstantiation of previous sensorimotor experience during emotional and social information processing is an essential process for understanding others’ emotions (Niedenthal, Barsalou, Winkielman, Krauth-Gruber, & Ric, 2005). Further, they propose that our bodily states that arise during social information processing, or our own simulation of emotional events, provide insights into another person’s emotional states. We argue that these models of embodied emotion are currently missing the necessary body-processing mechanisms from which the simulations of emotional experience operate; a full understanding of social-emotional perception requires incorporation of such mechanisms. If one cannot create the basic correspondences between another person’s body and one’s own, as may be the case in autism, then one cannot engage in the appropriate simulation process that leads to emotional understanding. Our integration of the body perception and social perception literatures also points to some future directions for research in face perception. Theories of body perception often link the importance of perception and action and emphasize the correspondences between the self and others. In contrast, these ideas are often implicit or not addressed at all in theories of face perception and emotion. Given the value of self-other correspondences for social perception, a promising line of research would be to take a more functional approach to the actions that facial expressions perform in social contexts.
Specifically, studies of face and body perception should explicitly investigate whether self-other correspondence leads to faster, more accurate processing of emotions, as well as a better understanding of what the other person is feeling or is likely to do next. Similarly, given that the basic processes of self-other correspondence and body perception are likely to be altered by the purpose and context in which the perception is taking place, research on the basic mechanisms should expand to investigate such influences. It is only by placing the mechanisms within the contexts in which they function that we can gain a more dynamic understanding of how social perception operates.
In conclusion, humans have bodies that perform important actions in the world. Other humans are also acting in the world and toward each other, which adds to the complexity of the perceptual processing necessary to figure out what one should do next. We never need on-line, body-specific processing more than when we are dealing with social stimuli. Social interactions are similar to waltzing—one anticipates the placement of a partner’s foot for the placement of one’s own, a partner’s push on the hand leads to the give of one’s own, and together the social dance turns. To meet these social needs, specialized representations, processes, and neural networks have evolved to permit fast and accurate calculations that allow us to know what comes next. Perhaps more importantly, they help us coordinate social interactions. One of the primary purposes of embodied perceptual processes is to facilitate our functioning in dyads or groups, whether it involves sharing a limited resource, establishing an alliance, learning the actions of the group, or communication rituals. There is increasing evidence that some of these specialized body mechanisms, including body posture recognition, face recognition, and emotional recognition, are part of what appears to be a larger neural network optimized for social information (Blakemore, 2006; Grossman, 2006; Haxby, Hoffman, & Gobbini, 2000; Saxe, 2006). We are just beginning to understand how embodied perceptual processes interact with the communication and understanding of other people’s emotions and other evolutionarily important functions.

Acknowledgments

Some of the research reported in this chapter was supported by a grant to DNM from the National Alliance for Autism Research. The authors appreciate the feedback on an earlier draft of this chapter from Marlene Behrmann and Iris Mauss.
Note

1. In the body perception literature, the terms body schema and body image have both been used to refer to the long-term, body-specific spatial representation. Both have also been used to refer to the on-line
immediate body representation. Given this confusion, these terms will not be used in this chapter. Instead, we will refer to long-term spatial body representations.
References

Abravanel, E., & De Yong, N. G. (1991). Does object modeling elicit imitative-like gestures from young infants? Journal of Experimental Child Psychology, 52, 22–40.
Adelman, P. K., & Zajonc, R. B. (1989). Facial efference and the experience of emotion. Annual Review of Psychology, 40, 249–280.
Atkinson, A. P., & Adolphs, R. (2005). Visual emotion perception: Mechanisms and processes. In L. F. Barrett, P. M. Niedenthal, & P. Winkielman (Eds.), Emotion and consciousness (pp. 150–182). New York: Guilford.
Atkinson, A. P., Dittrich, W. H., Gemmell, A. J., & Young, A. W. (2004). Emotion perception from dynamic and static body expressions in point-light and full-light displays. Perception, 33, 717–746.
Barresi, J., & Moore, C. (1996). Intentional relations and social understanding. Behavioral and Brain Sciences, 19, 107–122.
Barsalou, L. W. (2002). Being there conceptually: Simulating categories in preparation for situated action. In N. L. Stein, P. J. Bauer, & M. Rabinowitz (Eds.), Representation, memory, and development: Essays in honor of Jean Mandler (pp. 1–19). Mahwah, NJ: Erlbaum.
Barsalou, L. W. (2003). Situated simulation in the human conceptual system. Language and Cognitive Processes, 18, 513–562. [Reprinted in Moss, H., & Hampton, J. (2003). (Eds.). Conceptual representation (pp. 513–566). East Sussex, UK: Psychology Press.]
Berkowitz, L., & Troccoli, B. (1990). Feelings, direction of attention, and expressed evaluations of others. Cognition and Emotion, 4, 305–325.
Blakemore, S. (2006). When the other influences the self: Interference between perception and action. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 413–426). New York: Oxford University Press.
Blanchard, R. J., Blanchard, D. C., & Hori, K. (1988). An ethoexperimental approach to the study of defense. In R. J. Blanchard, P. F. Brain, D. C. Blanchard, & S. Parmigiani (Eds.), Ethoexperimental approaches to the study of behavior, NATO ASI Series (pp. 114–136). Dordrecht: Kluwer.
Blanchard, R. J., Blanchard, D. C., Rodgers, J., & Weiss, S. (1990). The characterization and modeling of antipredator defensive behavior. Neuroscience and Biobehavioral Reviews, 14, 463–472.
Bonatti, L., Frot, E., Zangl, R., & Mehler, J. (2002). The human first hypothesis: Identification of conspecifics and individuation of objects in the young infant. Cognitive Psychology, 44, 388–426.
Buxbaum, L., & Coslett, H. B. (2001). Specialized structural descriptions for human body parts: Evidence from autotopagnosia. Cognitive Neuropsychology, 18, 289–306.
Carey, S. (1992). Becoming a face expert. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 335, 95–103.
Cohen, L. R. (2002). The role of experience in the perception of biological motion. Dissertation Abstracts International, Section B: Sciences and Engineering, 63, 3049.
Darwin, C. (1889). The expressions of the emotions in man and animals. New York: D. Appleton.
Dawson, G., Webb, S. J., & McPartland, J. (2005). Understanding the nature of face processing impairment in autism: Insights from behavioral and electrophysiological studies. Developmental Neuropsychology, 27, 403–424.
DeRenzi, E., & Scotti, G. (1970). Autotopagnosia: Fiction or reality? Report of a case. Archives of Neurology, 23, 221–227.
Desjardins, J. K., Hazelden, M. R., Van der Kraak, G. J., & Balshine, S. (2006). Male and female cooperatively breeding fish provide support for the “Challenge Hypothesis.” Behavioral Ecology, 17, 149–154.
Diamond, R., & Carey, S. (1986). Why faces are and are not special: An effect of expertise. Journal of Experimental Psychology: General, 115, 107–117.
Dimberg, U. (1982). Facial reactions to facial expressions. Psychophysiology, 19, 643–647.
Dimberg, U., Thunberg, M., & Elmehed, K. (2000). Unconscious facial reactions to emotional facial expressions. Psychological Science, 11, 86–89.
Downing, P., Bray, D., Rogers, J., & Childs, C. (2004). Bodies capture attention when nothing is expected. Cognition, 93, B27–B38.
Downing, P., Jiang, Y., Shuman, M., & Kanwisher, N. (2001). A cortical area selective for visual processing of the human body. Science, 293, 2470–2473.
Duclos, S. E., Laird, J. D., Schneider, E., Sexter, M., Stern, L., & Van Lighten, O. (1989). Emotion-specific effects of facial expression and postures on emotional experience. Journal of Personality and Social Psychology, 57, 100–108.
Fantz, R. L. (1963). Pattern vision in newborn infants. Science, 140, 296–297.
Fazendeiro, T., Winkielman, P., & McIntosh, D. N. (2007). The automatic extraction of facial emotion in individuals with autism: A study of core affective processing. Manuscript in preparation.
Field, T., Cohen, D., Garcia, R., & Greenberg, R. (1984). Mother-stranger face discrimination by the newborn. Infant Behavior and Development, 7, 19–27.
Fox, R., & McDaniel, C. (1982). The perception of biological motion by human infants. Science, 218, 486–487.
Funk, M., Shiffrar, M., & Brugger, P. (2005). Hand movement observation by individuals born without hands: Phantom limb experience constrains visual limb perception. Experimental Brain Research, 164, 341–346.
Gallagher, S. (2005). How the body shapes the mind. Oxford: Oxford University Press.
Gauthier, I., & Tarr, M. (1997). Becoming a “greeble expert”: Exploring mechanisms for face recognition. Vision Research, 37, 1673–1682.
Gauthier, I., & Tarr, M. J. (2002). Unraveling mechanisms for expert object recognition: Bridging brain activity and behavior. Journal of Experimental Psychology: Human Perception & Performance, 28, 431–446.
Gibson, J. J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, acting and knowing (pp. 67–82). Hillsdale, NJ: Erlbaum.
Glenberg, A. M. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1–55.
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin & Review, 9, 558–565.
Grossman, E. D. (2006). Evidence for a network of brain areas involved in the perception of biological motion. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 361–386). New York: Oxford University Press.
Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335–346.
Hatfield, E., Cacioppo, J. T., & Rapson, R. L. (1994). Emotional contagion. Cambridge, UK: Cambridge University Press.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4, 223–233.
Heptulla-Chatterjee, S., Freyd, J. J., & Shiffrar, M. (1996). Configural processing in the perception of apparent biological motion. Journal of Experimental Psychology: Human Perception and Performance, 22, 916–929.
Hertenstein, M. J., Keltner, D., App, B., Bulleit, B., & Jaskolka, A. (2006). Touch communicates discrete emotions. Emotion, 6, 528–533.
Hobson, P. R. (1995). Apprehending attitudes and actions: Separable abilities in early development? Development and Psychopathology, 7, 171–182.
Hobson, R. P., Ouston, J., & Lee, A. (1988). What’s in a face? The case of autism. British Journal of Psychology, 79, 441–453.
James, W. (1884). What is an emotion? Mind, 9, 188–205.
Johnson, M. H. (1997). Developmental cognitive neuroscience. Cambridge, MA: Blackwell.
Johnson, M. H., & Morton, J. (1991). Biology and cognitive development: The case of face recognition. Oxford: Blackwell.
Kendrick, K. M., & Baldwin, B. A. (1989). Visual responses of sheep temporal cortex cells to moving and stationary human images. Neuroscience Letters, 100, 193–197.
Kuhl, P. (1987). The special-mechanisms debate in speech research: Categorization tests on animals and infants. In S. Harnad (Ed.), Categorical perception: The groundwork of cognition (pp. 355–386). New York: Cambridge University Press.
Langdell, T. (1978). Recognition of faces: An approach to the study of autism. Journal of Child Psychology & Psychiatry, 19, 255–268.
Larsen, R. J., Kasimatis, M., & Frey, K. (1992). Facilitating the furrowed brow: An unobtrusive test of the facial feedback hypothesis applied to unpleasant affect. Cognition and Emotion, 6, 321–338.
Lazarus, R. S. (1991). Cognition and motivation in emotion. American Psychologist, 46, 352–367.
Legerstee, M. (1991). The role of person and object in eliciting early imitation. Journal of Experimental Child Psychology, 5, 423–433.
Legerstee, M. (1994). Patterns of 4-month-old infant responses to hidden silent and sounding people and objects. Early Development and Parenting, 20, 71–80.
Legerstee, M., Pomerleau, R., Malcuit, G., & Feider, H. (1987). The development of infants’ responses to people and a doll: Implications for research in communication. Infant Behavior and Development, 10, 81–95.
Loula, F., Prasad, S., Harber, K., & Shiffrar, M. (2005). Recognizing people from their movements. Journal of Experimental Psychology: Human Perception & Performance, 31, 210–220.
Martin, R. A., & Redka, E. (2006). “Hedgehog”: A novel defensive posture in juvenile Amblyraja radiata. Journal of Fish Biology, 68, 613.
Maurer, D. (1985). Infants’ perception of facedness. In T. Field & N. Fox (Eds.), Social perception in infants (pp. 73–100). Norwood, NJ: Ablex.
Maurer, D., LeGrand, R., & Mondloch, C. J. (2002). The many faces of configural processing. Trends in Cognitive Sciences, 6, 255–260.
McGoldrick, J. E., & Reed, C. L. (2007). Configural but not holistic processing of body postures: Bodies are not exactly like faces. Manuscript submitted for publication.
McIntosh, D. N. (1996). Facial feedback hypothesis: Evidence, implications and directions. Motivation and Emotion, 20, 121–147.
McIntosh, D. N. (2006). Spontaneous facial mimicry, liking and emotional contagion. Polish Psychological Bulletin, 37, 31–42.
McIntosh, D. N., Druckman, D., & Zajonc, R. B. (1994). Socially induced affect. In D. Druckman & R. A. Bjork (Eds.), Learning, remembering, believing: Enhancing human performance (pp. 251–276, 364–371). Washington, DC: National Academy Press.
McIntosh, D. N., Reichmann-Decker, A., Winkielman, P., & Wilbarger, J. L. (2006). When the social mirror breaks: Deficits in automatic, but not voluntary mimicry of emotional facial expressions in autism. Developmental Science, 9, 295–302.
McIntosh, D. N., Zajonc, R. B., Vig, P. S., & Emerick, S. W. (1997). Facial movement, breathing, temperature, and affect: Implications of the vascular theory of emotional efference. Cognition and Emotion, 11, 171–195.
Meltzoff, A. N., & Moore, K. M. (1977). Imitation of facial and manual gestures by human neonates. Science, 198, 75–78.
Meltzoff, A. N., & Moore, K. M. (1983). Newborn infants imitate adult facial gestures. Child Development, 54, 702–709.
Meltzoff, A. N., & Moore, K. M. (1989). Imitation in newborn infants: Exploring the range of gestures imitated and the underlying mechanisms. Developmental Psychology, 25, 954–962.
Meltzoff, A. N., & Moore, K. M. (1992). Early imitation within a functional framework: The importance of person identity, movement and development. Infant Behavior and Development, 15, 479–505.
Meltzoff, A. N., & Moore, K. M. (1995). Infants’ understanding of people and things: From body imitation to folk psychology. In J. Bermúdez, A. Marcel, & N. Eilan (Eds.), The body and the self (pp. 43–69). Cambridge, MA: MIT/Bradford Press.
Meltzoff, A. N., & Moore, K. M. (1997). Explaining facial imitation: A theoretical model. Early Development and Parenting, 6, 179–192.
Moody, E. J., & McIntosh, D. N. (2006). Mimicry and autism: Bases and consequences of rapid, automatic matching behavior. In S. J. Rogers & J. Williams (Eds.), Imitation and the social mind: Autism and typical development (pp. 71–95). New York: Guilford.
Moore, C. (2006). Representing intentional relationships and acting intentionally in infancy: Current insights and open questions. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 427–442). New York: Oxford University Press.
Moore, D. G., Hobson, P. R., & Lee, A. (1997). Components of person perception: An investigation with autistic, nonautistic retarded and typically developing children and adolescents. British Journal of Developmental Psychology, 15, 401–423.
Niedenthal, P. M., Barsalou, L. W., Winkielman, P., Krauth-Gruber, S., & Ric, F. (2005). Embodiment in attitudes, social perception, and emotion. Personality and Social Psychology Review, 9, 184–211.
Niedenthal, P. M., Brauer, M., Halberstadt, J. B., & Innes-Ker, A. H. (2001). When did her smile drop? Facial mimicry and the influences of emotional state on the detection of change in emotional expression. Cognition and Emotion, 15, 853–864.
Niedenthal, P. M., Halberstadt, J. B., & Innes-Ker, A. H. (1999). Emotional response categorization. Psychological Review, 106, 337–361.
Ogden, J. A. (1985). Autotopagnosia: Occurrence in a patient without nominal aphasia and with an intact ability to point to parts of animals and objects. Brain, 108, 1009–1022.
Ohira, H., & Kurono, K. (1993). Facial feedback effects on impression formation. Perceptual and Motor Skills, 77, 1251–1258.
Ozonoff, S., Pennington, B., & Rogers, S. (1990). Are there emotion perception deficits in young autistic children? Journal of Child Psychology and Psychiatry, 31, 343–361.
Panksepp, J. (1998). Affective neuroscience: The foundations of human and animal emotions. Oxford: Oxford University Press.
Piaget, J. (1954). Construction of reality in the child. New York: Basic Books.
Pinto, J. (2006). Developing body representations: A review of infants’ responses to biological motion displays. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 305–322). New York: Oxford University Press.
Price, E. H. (2002). A critical review of reports and theories of phantom limbs amongst congenitally limb-deficient subjects and a proposed theory for the developmental origins of body image. Behavioral and Brain Sciences. (Electronic version. Retrieved August 23, 2006, http://www.bbsonline.org/documents/a/00/00/12/50/index.html)
Quinn, P., & Eimas, P. (1998). Evidence for global categorical representation of humans by young infants. Journal of Experimental Child Psychology, 69, 151–174.
Reed, C. L. (2002). What is the body schema? In W. Prinz & A. Meltzoff (Eds.), The imitative mind: Development, evolution, and brain bases (pp. 233–243). Cambridge, UK: Cambridge University Press.
Reed, C. L., Beall, P. M., Stone, V. E., Kopelioff, L., Pulham, D., & Hepburn, S. L. (2007). Brief report: Perception of body posture—what individuals with autism might be missing. Journal of Autism and Developmental Disorders, 37, 1576–1584.
Reed, C. L., & Farah, M. J. (1995). The psychological reality of the body schema: A test with normal participants. Journal of Experimental Psychology: Human Perception and Performance, 21, 334–343.
Reed, C. L., & McGoldrick, J. E. (2007). Action during perception affects memory: Changing interference to facilitation via processing time. Social Neuroscience, 2, 134–149.
Reed, C. L., Nyberg, A., & Grubb, J. (2007). The embodiment of perceptual expertise and configural processing. Manuscript submitted for publication.
Reed, C. L., Stone, V. E., Bozova, S., & Tanaka, J. (2003). The body inversion effect. Psychological Science, 14, 302–308.
Reed, C. L., Stone, V. E., Grubb, J. D., & McGoldrick, J. E. (2006a). Turning configural processing upside down: Part- and whole-body postures. Journal of Experimental Psychology: Human Perception and Performance, 32, 73–87.
Reed, C. L., Stone, V. E., & McGoldrick, J. E. (2006b). Not just posturing: Configural processing of the human body. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 229–258). New York: Oxford University Press.
Riskind, J. (1984). They stoop to conquer: Guiding and self-regulatory functions of physical posture after success and failure. Journal of Personality and Social Psychology, 47, 479–493.
Rizzolatti, G., & Craighero, L. (2004). The mirror neuron system. Annual Review of Neuroscience, 27, 169–192.
Ro, T., Russell, C., & Lavie, N. (2001). Changing faces: A detection advantage in the flicker paradigm. Psychological Science, 12, 94–99.
Rutherford, M. D., & McIntosh, D. N. (2007). Rules versus prototype matching: Strategies of perception of emotional facial expressions in the autism spectrum. Journal of Autism and Developmental Disorders, 37, 187–196.
Saxe, R. (2006). Uniquely human social cognition. Current Opinion in Neurobiology, 16, 235–239.
Schwoebel, J., Buxbaum, L. J., & Coslett, H. B. (2004). Representations of the human body in the production and imitation of complex movements. Cognitive Neuropsychology, 21, 285–298.
Shiffrar, M., & Freyd, J. J. (1990). Apparent motion of the human body. Psychological Science, 1, 257–264.
Shiffrar, M., & Freyd, J. J. (1993). Timing and apparent motion path choice with human body photographs. Psychological Science, 4, 379–384.
Slaughter, V., & Heron, M. (2004). Origins and early development of human body knowledge. Monographs of the Society for Research in Child Development, 69, 1–102.
Slaughter, V., Heron, M., & Sim, S. (2002). Development of preferences for the human body shape in infancy. Cognition, 85, B71–B81.
Slaughter, V., Stone, V. E., & Reed, C. L. (2004). Perception of faces and bodies: Similar or different? Current Directions in Psychological Science, 6, 219–223.
Stekelenburg, J. J., & de Gelder, B. (2004). The neural correlates of perceiving human bodies: An ERP study on the body-inversion effect. Neuroreport, 15, 777–780.
Stepper, S., & Strack, F. (1993). Proprioceptive determinants of emotional and nonemotional feelings. Journal of Personality and Social Psychology, 64, 211–220.
Strack, F., Stepper, S., & Martin, L. L. (1988). Inhibiting and facilitating conditions of the human smile: A nonobtrusive test of the facial feedback hypothesis. Journal of Personality & Social Psychology, 54, 768–777.
Tanaka, J. W., & Gauthier, I. (1997). Expertise in object and face recognition. In R. L. Goldstone, P. G. Schyns, & D. L. Medin (Eds.), Psychology of learning and motivation (pp. 83–125). San Diego, CA: Academic Press.
Thelen, E. (1995). Time-scale dynamics in the development of an embodied cognition. In R. Port & T. van Gelder (Eds.), Mind in motion (pp. 69–100). Cambridge, MA: MIT Press.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press/Bradford Books.
Tracy, J. L., & Robins, R. W. (2004). Show your pride: Evidence for a discrete emotion expression. Psychological Science, 15, 194–197.
Trevarthen, C. (1979). Communication and cooperation in infancy: A description of primary intersubjectivity. In M. Bullowa (Ed.), Before speech: The beginning of interpersonal communication (pp. 321–347). Cambridge, UK: Cambridge University Press.
Trevarthen, C. (1993). The self born in intersubjectivity: The psychology of an infant communicating. In U. Neisser (Ed.), The perceived self: Ecological and interpersonal sources of self-knowledge (pp. 121–173). New York: Cambridge University Press.
Turati, C., Simion, F., Milani, I., & Umiltà, C. (2002). Newborns’ preference for faces: What is crucial? Developmental Psychology, 38, 875–882.
Valenza, E., Simion, F., Macchi Cassia, V., & Umiltà, C. (1996). Face preference at birth. Journal of Experimental Psychology: Human Perception and Performance, 22, 892–903.
Volkmar, F., Chawarska, K., & Klin, A. (2005). Autism in infancy and early childhood. Annual Review of Psychology, 56, 315–336.
Weinstein, S., & Sersen, E. A. (1961). Phantoms in cases of congenital absence of limbs. Neurology, 11, 905–911.
Wilbarger, J., Reed, C. L., & McIntosh, D. N. (2007). You can’t show me how I feel: Implicit influences of our affective postures on the perception of others. Manuscript in preparation.
Wilson, M. (2001). Perceiving imitatable stimuli: Consequences of isomorphism between input and output. Psychological Bulletin, 127, 543–553.
Wilson, M. (2002). Six views of embodied cognition. Psychonomic Bulletin and Review, 9, 625–636.
Wilson, M. (2005). Covert imitation: How the body schema acts as a prediction device. In G. Knoblich, I. M. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 211–228). New York: Oxford University Press.
Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131, 460–473.
Yin, R. K. (1969). Looking at upside-down faces. Journal of Experimental Psychology, 81, 141–145.
4
Embodied Motion Perception
Psychophysical Studies of the Factors Defining Visual Sensitivity to Self- and Other-Generated Actions
Maggie Shiffrar
Introduction

Traditionally, the visual system has been understood as a general-purpose processor that analyzes all classes of visual images in the same way (e.g., Marr, 1982; Shepard, 1984). According to this perspective, the same visual processes are employed when observers view objects and people. This is not unrelated to the idea that the visual system is a module (Pylyshyn, 1999) that is “encapsulated” unto itself (Fodor, 1983). While such an approach has produced a plethora of scientific discoveries, it is necessarily limited. The purpose of this chapter is to confront this modular understanding of the visual system in two steps. The first section will challenge the hypothesis that all visual images are analyzed by the same menu of perceptual processes. This challenge will come from psychophysical studies focusing on the visual analysis of human motion. Human action is often the most frequent, the most psychologically
meaningful, and the most potentially life-altering motion in normal human environments. As such, studies of action perception provide a means to understand how the human visual system analyzes a fundamentally important category of motion stimuli. To that end, psychophysical studies will be reviewed that indicate the existence of profound differences between the visual perception of human motion and object motion.
The second section will focus on the question of why; that is, why does the visual perception of human motion differ from the visual perception of object motion? Three possible reasons will be considered. First, human motion is the only category of visual motion that observers can both produce and perceive. As a result, motor processes may selectively contribute to the analysis of and thus selectively increase perceptual sensitivity to human motion. Second, as essentially social animals, human observers have a lifetime of experience watching other people move. From this perspective, human observers may exhibit enhanced perceptual sensitivity to human motion simply because they see so much of it. Finally, human motion carries more social-emotional information than any other category of visual motion. Thus, social-emotional processes might contribute to and facilitate the perception of human movement. Psychophysical tests will be used to investigate each of these possibilities in turn. The take-home message from these studies will be that the visual system cannot be understood as an isolated system. Instead, the visual analysis of human movement depends upon a convergence of motor processes, perceptual learning, and social-emotional processes. But first, does the perception of human motion differ from the perception of object motion?
Comparing the Perception of Human Motion and Object Motion

Motion is an inherently spatial-temporal phenomenon, as it involves the simultaneous change of information over space and time. To perceive movement, our visual system must therefore integrate dynamic changes across space and across time. While each of these processes cannot be understood without the other, researchers traditionally use different techniques to examine each subprocess. That approach will be employed here to compare and contrast the visual integration of human and object motions over space and over time.
Embodied Motion Perception
Motion Integration across Space

Why does the perception of visual motion require the integration of visual information over space? A primary reason comes from the structure of the visual system itself.

The Aperture Problem

Neurons in early stages of the visual system have relatively small receptive fields that measure luminance changes within very small image regions (e.g., Hubel & Wiesel, 1968). Small measurement areas mean that each neuron can only respond to a tiny subregion of an image. These local measurements must be combined to compute the motion of whole objects. A complication to this combinatorial process results from the fact that the local motion measurements obtained by individual neurons provide only ambiguous information. This ambiguity, illustrated in Figure 4.1, is commonly referred to as the aperture problem. To understand this problem from a spatial perspective, first consider that the motion of any luminance edge can be decomposed into the portion of motion that is parallel to the edge’s orientation and the portion that is perpendicular to the edge’s orientation. Because a neuron cannot track or respond to the ends of that edge if those ends fall outside of its receptive field, the neuron cannot measure any of the motion that is parallel to the edge. Instead, each motion-sensitive neuron can only detect the component of motion that is perpendicular to the orientation of an edge. Because only this perpendicular component of motion can be measured, all motions having the same perpendicular component of motion will appear to be identical even when they differ significantly in their parallel
(Figure panels: a translating line at time T, and the same line at time T + ∆t)
Figure 4.1 The aperture problem. Whenever a translating line is viewed through a relatively small receptive field, only the component of motion perpendicular to the line’s orientation can be measured. As a result, an infinitely large family of different translations that all share the same perpendicular component of motion (illustrated here by the 5 arrows) cannot be distinguished from one another.
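The ambiguity in Figure 4.1 can be made concrete with a short numerical sketch: a unit whose receptive field excludes the line’s endpoints measures only the projection of the true velocity onto the edge’s unit normal, so two quite different translations can yield identical measurements. (The particular vectors below are illustrative, not taken from the original studies.)

```python
import numpy as np

# Edge oriented at 45 degrees; its unit normal (perpendicular) direction.
theta = np.deg2rad(45)
normal = np.array([np.sin(theta), -np.cos(theta)])  # unit vector perpendicular to the edge

# Two very different true edge velocities (in, say, pixels per frame).
v1 = np.array([2.0, 0.0])                                  # pure rightward translation
v2 = v1 + 3.0 * np.array([np.cos(theta), np.sin(theta)])   # same motion plus a large component along the edge

# A receptive field that cannot see the edge's endpoints measures only v . n.
m1 = float(v1 @ normal)
m2 = float(v2 @ normal)
print(m1, m2)  # identical: the two motions are indistinguishable through the aperture
```

Any velocity of the form v1 plus a multiple of the edge direction produces exactly the same measurement, which is the infinite family of arrows in the figure.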
components of motion. As a result, a directionally selective neuron will give the same response to many different motions. Because all known visual systems, whether biological or computational, have neurons with receptive fields that are limited in size, this measurement ambiguity has been extensively studied (e.g., Hildreth, 1984; Shiffrar & Pavel, 1991; Wallach, 1976).

How does the visual system compute the motions of whole objects from local measurements that are inherently ambiguous? While local motion measurements are ambiguous, motion measurements from two differently oriented and rigidly connected luminance edges can be unambiguously interpreted (Adelson & Movshon, 1982). When differently oriented edges belong to the same solid object, the integration of their motion signals is appropriate. However, when differently oriented edges belong to different objects or to the same nonrigid object, their motion signals should not be integrated but rather segmented, or analyzed separately. Indeed, the integration of motion measurements across different objects could have disastrous consequences. Imagine, for example, that you want to cross a street on which two cars are traveling toward each other at equal speeds. If your visual system combined motion measurements across these two cars, then these measurements would cancel each other out (because they are equal and opposite). In this case, your visual motion system would conclude that there is no motion in the street and, as a result, you might step out to cross it. Obviously, people having visual systems that work in such a manner are no longer with us. So how does the visual system solve this aperture problem?

The visual system can overcome the ambiguity of local motion measurements by picking image solutions that are local or global in their levels of analysis.
At the local level, the visual system can uniquely interpret ambiguous edge motion by relying on visible edge discontinuities. Objects and people have boundary discontinuities, such as endpoints (e.g., fingertips and pencil erasers) and regions of high curvature (e.g., elbows and corners), that indicate where one object ends and the next object begins. Motion processes use these local form cues to strike the correct balance between motion integration within individual objects and motion segmentation across different objects. A global solution to the aperture problem involves integrating local motion signals across larger, spatially disconnected image regions. Models of this global integration process include the “intersection of constraints” and vector averaging (e.g., Adelson & Movshon, 1982; H. Wilson, Ferrera, & Yo, 1992).
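One way to picture how rigidly connected, differently oriented edges disambiguate one another is to treat each edge’s measurable normal speed as a linear constraint on the object’s velocity; solving the constraints jointly recovers the true motion. The sketch below illustrates this “intersection of constraints” idea (the function name and diamond geometry are my own illustration, not the published models):

```python
import numpy as np

def intersection_of_constraints(normals, normal_speeds):
    """Recover a rigid object's 2-D velocity from ambiguous edge measurements.

    Each edge i constrains the velocity v by  normals[i] . v = normal_speeds[i].
    With two (or more) differently oriented edges the system has a unique
    (least-squares) solution -- the 'intersection of constraints'.
    """
    N = np.asarray(normals, dtype=float)
    c = np.asarray(normal_speeds, dtype=float)
    v, *_ = np.linalg.lstsq(N, c, rcond=None)
    return v

# A diamond translating rightward at (3, 0): simulate the measurements its
# two edge orientations (45 and 135 degrees) would produce...
true_v = np.array([3.0, 0.0])
n1 = np.array([np.sin(np.pi / 4), -np.cos(np.pi / 4)])
n2 = np.array([np.sin(3 * np.pi / 4), -np.cos(3 * np.pi / 4)])
speeds = [n1 @ true_v, n2 @ true_v]
# ...and recover the full velocity from them.
print(intersection_of_constraints([n1, n2], speeds))  # approximately (3, 0)
```

A single constraint leaves a whole line of candidate velocities (the aperture problem); adding a second, differently oriented constraint reduces that family to a single point.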
How does the visual system select the correct level of analysis when dynamic images have different local and global interpretations? In one series of psychophysical studies that examined this question, simple translating and rotating objects were viewed through multiple apertures. Local motion analyses would lead to the interpretation of each visible edge moving independently of the other edges. Global analyses would involve the integration of motion signals across the edges and lead to the interpretation of a coherent moving object instead of the interpretation of multiple edges moving independently (Figure 4.2). The results of these studies demonstrate that the visual system tends to default to local analyses even when local solutions conflict with an observer’s prior knowledge of the underlying object’s shape (Shiffrar & Lorenceau, 1996; Shiffrar & Pavel, 1991). The same default to local analyses is found when observers view relatively complex nonrigid objects, such as cars and scissors, through apertures (Shiffrar, Lichtey, & Heptulla-Chatterjee, 1997).

Figure 4.2 Two solutions to the aperture problem. (A) A diamond translates to the right and is viewed through four apertures. The motion measurement within each aperture is ambiguous. (B) In a local interpretation, the motion within each aperture is interpreted independently of the other apertures. As a result, in this case, each line segment appears to translate in the direction perpendicular to its orientation. (C) In a global interpretation, motion signals are integrated across apertures so that all line segments appear to translate in the same, veridical direction.
But something entirely different happens when observers view human motion through multiple apertures. In this case, the visual system defaults to global image interpretations. For example, when a stick figure rendition of a walking person is viewed through apertures, observers readily and accurately interpret the motions of the visible line segments as a coherent, global whole. Typical descriptions of such stimuli include: “a walker,” “a man walking,” and “someone moving.” Conversely, nonrigid object motion, such as a pair of scissors opening and closing, is perceived as globally incoherent when viewed through apertures. Typical descriptions of moving objects seen through apertures include “wormlike things that get longer,” “undulating lines,” and “a bunch of lines.” This pattern of results suggests that the processes underlying the integration of visual motion signals across space differ for human motion and object motion.

Is the integration of human motion signals over space always different from the integration of object motion over space? Psychophysical evidence suggests that only physically possible human actions are more globally integrated. For example, if a person walks impossibly fast or impossibly slow behind a set of apertures, observers default to local interpretations (Shiffrar et al., 1997). If observers view an upside-down person walking behind apertures, they interpret the display locally and hence do not integrate motion information across the line segments. Thus, only physically possible human movement appears to be integrated over larger spatial extents than object motion. The implications of this finding will become clear during the discussion of the impact of motor experience and visual experience on action perception later in this chapter.
Point-Light Displays

Point-light displays represent another technique that is commonly used to examine motion integration across discontinuous regions of space. This technique was originally developed by Etienne Jules Marey for his studies of human gait in the 1890s (Marey, 1895/1972). In the 1970s, Gunnar Johansson introduced this technique to the vision sciences. In it, small markers or point-lights are attached to the major joints of moving actors, as illustrated in Figure 4.3A. The actors are filmed so that only the point-lights are visible in the resultant displays (see Figure 4.3C). Even though a vast amount of information is removed from the original stimuli, observers of the
Figure 4.3 Point-light walker displays. (A) Markers are placed on the main joints and head of a walking person viewed from a sagittal perspective. (B) An egocentric or allocentric view of a point-light walker. (C) In the experimental displays, only the motions of the point-lights are visible. (D) Point-light walkers can be masked with additional points moving with the same trajectories.
resultant point-light displays readily perceive human motion (e.g., Johansson, 1973, 1976). Indeed, from point-light displays alone, observers can accurately determine an actor’s gender (Pollick, Key, Heim, & Stringer, 2005), emotional state (Clarke, Bradshaw, Field, Hampson, & Rose, 2005), and deceptive intent (Runeson & Frykholm, 1983).
The results of studies using point-light displays similarly support the hypothesis that the visual perception of human movement depends upon a mechanism that globally integrates motion signals across space (e.g., Ahlström, Blake, & Ahlström, 1997; Bertenthal & Pinto, 1994; Cutting, Moore, & Morrison, 1988). One approach to this issue involves the presentation of point-light walkers within point-light masks (Figure 4.3D). A point-light mask can be constructed by redistributing the spatial locations of each point from one or more point-light walkers. The size, luminance, and velocity of the points remain unchanged. Thus, the motion of each point in the mask is identical to the motion of one of the points defining the walker. As a result, only the spatially global configuration of the points distinguishes the walker from the mask.

The finding that subjects are able to detect the presence as well as the gait direction of an upright point-light walker hidden within a point-light mask indicates that the mechanism underlying the perception of human movement operates over large spatial scales (Bertenthal & Pinto, 1994). When the same masking technique is used with nonhuman motions, such as arbitrary figures (Hiris, Krebeck, Edmonds, & Stout, 2005), walking dogs and seals (Cohen, 2002), and horses (Pinto & Shiffrar, 2007), significant decrements are found in observers’ ability to detect these nonhuman objects. These results add further support for the hypothesis that observers are better able to integrate human motion than nonhuman motion across disconnected regions of space.

Motion Integration across Time

Psychophysical researchers have traditionally used the phenomenon of apparent motion to investigate the temporal nature of visual motion processes. In classic demonstrations of apparent motion, two spatially separated objects are sequentially presented so that they give rise to the perception of a single moving object.
Early studies demonstrated that apparent motion percepts depend critically upon the relationship between the temporal and spatial separations of the displays (Korte, 1915; Wertheimer, 1912). Indeed, these early studies triggered the establishment of Gestalt psychology by demonstrating that perception differs from the summation of stimulus attributes (Ash, 1995).
In all apparent motion displays, the figure(s) shown in each frame can be connected by an infinite number of possible paths. Observers typically report seeing only the shortest possible path of apparent motion (e.g., Burt & Sperling, 1981) even when that shortest path is physically impossible. This phenomenon is commonly referred to as the shortest path constraint. An example can be found in old Western movies showing horse-drawn wagons in motion. Interestingly, the wagon wheels sometimes appear to rotate rapidly in the wrong direction (Shiffrar, 2001). This perceptual illusion is an example of the shortest path constraint. Because the continuous rotational motion of the wheel spokes is depicted via discontinuous movie frames, the wheel spokes can physically rotate farther between frames than the interspoke distance. When this happens, the shortest distance between spokes can be backwards rather than forwards. As a result, observers perceive backward wagon wheel rotation. Even though such motion is physically impossible, observers nonetheless see it clearly. Thus, observers perceive the shortest possible paths of apparent object motion even when those paths are physically impossible.

An interesting violation of this shortest path constraint is found with human motion. When humans move, their limbs follow curvilinear trajectories. As a result, the shortest, rectilinear path connecting any two limb positions is inconsistent with the biomechanical limitations of human movement. Given the visual system’s shortest-path bias, this raises the question of whether observers of human movement perceive paths of apparent human movement that traverse the shortest possible distance or paths that are consistent with the movement limitations of the human body.
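The backward wagon-wheel rotation described above follows directly from temporal sampling plus the shortest path constraint, and the arithmetic is simple enough to sketch (the spoke count and rotation rates below are illustrative, not from the original demonstrations):

```python
def perceived_spoke_rotation(deg_per_frame, n_spokes):
    """Shortest-path interpretation of a sampled rotating wheel.

    Spokes are indistinguishable every (360 / n_spokes) degrees, so a
    shortest-path visual system attributes to each frame pair the smallest
    rotation consistent with the two images -- which can point backwards.
    """
    spacing = 360.0 / n_spokes
    # Wrap the true per-frame rotation into the range (-spacing/2, +spacing/2).
    return deg_per_frame - spacing * round(deg_per_frame / spacing)

# An 8-spoke wheel has indistinguishable spoke positions every 45 degrees.
# A true rotation of 40 degrees per frame is closer to 45 than to 0, so the
# shortest consistent path is 5 degrees backwards:
print(perceived_spoke_rotation(40.0, 8))   # -5.0 (illusory backward rotation)
print(perceived_spoke_rotation(10.0, 8))   # 10.0 (slow rotation seen veridically)
```

Whenever the wheel rotates more than half the interspoke spacing per frame, the shortest path reverses sign, which is exactly the backward rotation seen in the movies.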
This question has been tested in studies of apparent motion perception with stimuli consisting of photographs of a human model in different poses. The poses were selected so that biomechanically possible paths of apparent human motion conflicted with the shortest possible paths (Shiffrar & Freyd, 1990, 1993). For example, one stimulus consisted of two photographs of a standing woman with her right arm positioned on either side of her head (Figure 4.4). The shortest path connecting these two arm positions would require the arm to pass through the head, while a biomechanically plausible path would require the arm to move around the head. When subjects viewed such stimuli, their perceived paths of motion changed with the stimulus onset asynchrony (SOA), or the amount of time between the onset of
Figure 4.4 Apparent human motion displays. Two frames depict a woman positioning her hand in front of and behind her head. At shorter SOAs, her hand appears to translate through her head. As SOAs increase, her hand increasingly appears to move around her head.
one photograph and the onset of the next photograph. At shorter SOAs, subjects perceived the shortest, physically impossible motion path. With increasing SOAs, observers were increasingly likely to see apparent motion paths consistent with the biomechanical constraints on human movement (Shiffrar & Freyd, 1990). Conversely, when viewing photographs of inanimate control objects, subjects consistently perceived the shortest path of apparent object motion at all SOAs. Importantly, when viewing photographs of a human model positioned so that a short path is biomechanically plausible, observers always reported seeing this short path (Shiffrar & Freyd, 1993). Thus, subjects do not simply report the perception of longer paths with longer SOAs. Moreover, observers can perceive apparent motion of nonbiological objects in a manner similar to apparent motion of human bodies. However, these objects must contain a global hierarchy of orientation and position cues resembling the entire human form before subjects perceive humanlike paths (Heptulla-Chatterjee, Freyd, & Shiffrar, 1996). This pattern of results suggests that human movement is analyzed by motion processes that operate over relatively large temporal windows and that take into account the biomechanical limitations of the human body.

This conclusion is further supported by studies of point-light walkers. When observers are asked to detect point-light walkers in a mask, walker detection performance is above chance even when significant temporal gaps are inserted between the frames (Thornton, Pinto, & Shiffrar, 1998).
Since the perceptual interpretation of point-light displays requires spatially extended motion processes, and since apparent motion displays require temporally extended motion integration, this result suggests that observers can integrate human motion, but not object motion, over unusually large spatiotemporal extents.
The studies described above depended upon different methodologies. Nonetheless, the results of these behavioral studies converge with brain imaging data (e.g., Virji-Babul, Cheung, Weeks, Kerns, & Shiffrar, 2007) to suggest the same conclusion; namely, that the visual analysis of human movement differs from the visual analysis of object movement. This difference appears to be profound, since it affects early visual processes such as the integration of motion information over discontinuous spatial and temporal extents. One implication of this difference is that the visual perception of human motion can tolerate more noise than the visual perception of object motion. Such robust perceptual analyses of human action allow observers to extract copious information from highly degraded depictions of human action. The goal of the next section of this chapter is to examine some possible factors that might give rise to this impressive perceptual ability.

Why Do Action Perception and Object Perception Differ?

The previous section outlined some of the evidence suggesting that the visual analysis of human motion differs fundamentally from the visual analysis of object motion. This section will address three possible reasons for this difference. First, human motion is the only category of visual motion that human observers can both produce and perceive. Human observers have an action control system that can reproduce the movements of other people, but not the movements of crashing waves or wind-blown trees. As a result, input from an observer’s own motor system might selectively enhance the perceptual analysis of human action (see Knoblich chapter for more discussion on this topic). Second, as inherently social animals, human observers have a lifetime of experience watching other people move.
Thus, extensive visual experience with human action might account for differences between the visual analysis of object and human motion. Finally, human movement carries more socially relevant information than object motion. This raises the question of whether social-emotional processes might contribute to the visual analysis of human motion and thereby differentiate human motion perception from object motion perception. Each of these factors is considered below.
Motor Expertise

Does the human visual system take advantage of the wealth of information available in the observer’s own motor system during the perceptual analysis of other people’s actions? If motor processes contribute to the visual analysis of human movement, then motor activity should be found during the perceptual analysis of human movement but not object movement. Research on mirror neurons in macaques (e.g., Rizzolatti, Fogassi, & Gallese, 2001) and humans (e.g., Iacoboni, Woods et al., 1999) supports this prediction. Mirror neurons, first discovered in the ventral premotor cortex of the macaque, respond both when an observer performs an action and when that observer watches someone else perform the same action (Rizzolatti et al., 2001). That is, watching another individual perform some action triggers activation of the observer’s motor representation of that action. Increasing evidence suggests that the perception, interpretation, and identification of other people’s actions depend upon activation of the observer’s motor planning system (e.g., Blake & Shiffrar, 2007; Prinz, 1997; Wilson, 2001).

Other imaging work has directly compared the perception of human motion and object motion. In one such study, PET activity was recorded while subjects viewed apparent motion sequences of human and object movement (Stevens, Fonlupt, Shiffrar, & Decety, 2000). As before (Shiffrar & Freyd, 1990, 1993), this study used two types of apparent motion stimuli. Human action picture pairs showed a human model in different positions in which the biomechanically possible paths of movement conflicted with the shortest, physically impossible paths (see Figure 4.4). The second set of picture pairs consisted of nonliving objects positioned so that the perception of the shortest path of apparent motion would require one solid object to pass through another solid object.
When the human picture pairs were presented slowly (with SOAs of 400 ms or more), subjects perceived biomechanically possible paths of apparent human motion. Under these conditions, PET scans indicated significant bilateral activity in observers’ primary motor cortex and cerebellum. However, when these same picture pairs were presented more rapidly (with SOAs less than 300 ms), subjects then perceived the shortest and physically impossible paths of human movement, and selective motor system activity was no longer found (Stevens et al., 2000). Conversely, when the pictures of objects were presented
at either fast or slow SOAs, no motor system activation was indicated. Thus, the observation of physically possible actions triggers activation of the observer’s action control system. This conclusion is consistent with common coding theory (Prinz, 1997) in suggesting that perceptual and motor systems share representations for the same actions. Indeed, much evidence indicates that common motor areas are active during the observation and the planning of movement (e.g., Decety & Grezes, 1999). Since motor system activation does not occur during the observation of biomechanically impossible actions (Stevens et al., 2000), it appears that the ability to plan an observed action is critical (Wilson, 2001).

The above neurophysiological findings are not immune to an alternative interpretation. That is, does motor system activation during action perception actually alter perceptual processes? Or does it reflect some automatic planning of motor responses to the observed actions? Psychophysical studies indicate that motor processes significantly impact perceptual processes and that this perceptual-motor interaction differentiates human motion perception from other categories of visual motion perception. Studies of the two-thirds power law provide a clear example (e.g., Viviani & Stucchi, 1992). This law describes the algebraic relationship between the instantaneous velocity and radius of curvature for trajectories produced by unconstrained human movements. An extensive series of psychophysical studies has indicated that visual perception is optimal for movements that are consistent with the two-thirds power law. Movements that violate this fundamental principle of human movement are not accurately perceived (Viviani, 2002). Thus, it can be argued that the human visual system is optimized for the analysis of human-generated movements.
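The two-thirds power law is usually written A(t) = K · C(t)^(2/3), where A is angular velocity and C is the curvature of the movement trajectory (equivalently, tangential velocity scales with the cube root of the radius of curvature). Elliptical drawing movements satisfy this relation exactly, which makes for a quick numerical check (the ellipse parameters below are arbitrary):

```python
import numpy as np

# Elliptical "drawing" movement: x = a*cos(t), y = b*sin(t).
a, b = 3.0, 1.0
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
dx, dy = -a * np.sin(t), b * np.cos(t)        # velocity components
ddx, ddy = -a * np.cos(t), -b * np.sin(t)     # acceleration components

speed = np.hypot(dx, dy)                      # tangential velocity v(t)
curvature = np.abs(dx * ddy - dy * ddx) / speed ** 3
angular_velocity = speed * curvature          # A(t) = v(t) * C(t)

# Two-thirds power law: A(t) = K * C(t)**(2/3).  For this ellipse the
# gain K is constant at (a*b)**(1/3), so the law holds at every point.
K = angular_velocity / curvature ** (2.0 / 3.0)
print(K.min(), K.max())  # both approximately (3*1)**(1/3) ≈ 1.442
```

Movements that violate the law (e.g., constant speed along a trajectory of varying curvature) would instead yield a K that changes along the path.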
This optimization suggests that motor system activation during action perception reflects the impact of motor processes on perceptual processes.

Additional support for the hypothesis that motor processes impact perceptual processes during action perception comes from studies of perception by acting, rather than passive, observers. These studies show that the perception of other people’s actions depends upon the actions being performed by the observer. For example, when observers perform a speed discrimination task that requires them to compare the gait speeds of two point-light walkers, their perceptual sensitivity to gait speed depends upon whether they themselves stand, walk, or ride a bicycle during task performance (Jacobs & Shiffrar, 2005). Walking observers demonstrated the poorest perceptual sensitivity to the speeds of other people’s gaits. This performance decrease likely reflects competing demands for access to shared representations (e.g., Prinz, 1997) that code for both the execution and perception of the same action. Other studies have shown that the perceptual ability to interpret the weight of a box being lifted by another person depends on the weight of the box being lifted by the observer (Hamilton, Wolpert, & Frith, 2004). Thus, moving and stationary observers can perceive human movement very differently. This difference provides further support for the hypothesis that motor processes impact the visual analysis of human action.

Under real-world conditions, observers frequently analyze the movements of other people for the purpose of action coordination. This process requires moving observers to compare their own actions with the actions of other people. Psychophysical research indicates that when observers move, their ability to compare their own actions with the actions of another person depends upon the potential for action coordination. When action coordination is possible, visual analyses of gait speed depend upon the observer’s own gait speed, exertion level, and prior walking experience (Jacobs & Shiffrar, 2005). Conversely, when the same gait speed discriminations are performed under conditions in which action coordination is impossible, gait speed perception is independent of the observer’s gait speed, effort, and prior walking experience. Thus, moving observers perform visual analyses of human movement that are distinct from the visual analyses performed by stationary, noninteractive observers.

Finally, recent research shows that motor learning significantly influences action perception.
For example, observers can improve their perceptual sensitivity to unusual actions by repeatedly executing those actions while blindfolded (Casile & Giese, 2006). Thus, motor learning enhances visual sensitivity to the motor behaviors of other people. Consistent with this, motor system activation is found when ballet and capoeira dancers watch movies of other people performing the dance style that they themselves perform (Calvo-Merino, Glaser, Grezes, Passingham, & Haggard, 2005). Furthermore, studies of patients with disorders of motor behavior support the impact of motor processes on action perception. One such study assessed the visual analysis of human action by children with motor impairments resulting from Down’s syndrome (Virji-Babul, Kerns, Zhou, Kapur, & Shiffrar, 2006). In these studies, children with Down’s syndrome
and matched controls made perceptual judgments of point-light displays of moving people and objects. Children with Down’s syndrome demonstrated significant decrements in their perceptual analyses of point-light actions.

Does perception-action coupling require action performance or simply action representation? Since common motor areas are active during action observation and planning, the ability to plan an action may be sufficient to differentiate action perception and object perception. This issue was addressed when observers born without hands were asked to interpret apparent motion displays of hand rotations (Funk, Shiffrar, & Brugger, 2005). The perception of apparent hand rotation depended upon whether observers had a mental representation or “body schema” of their own hands. An individual who was born without hands, and apparently lacking a hand schema, consistently perceived biomechanically impossible paths of apparent hand rotation at all SOAs. Conversely, another individual who was born without hands but nonetheless had a hand schema (as assessed by the presence of phantom sensations of congenitally missing limbs, among other measures) did not differ from “handed” control observers in her perception of paths of apparent hand rotation. That is, at short SOAs, she and control observers reported the perception of physically impossible paths of apparent hand rotation. At long SOAs, she and control observers reported the perception of biomechanically possible paths of apparent hand rotation. Evidently, the ability to represent executable actions constrains the ability to perceptually interpret similar actions performed by other people (Shiffrar, 2006). Thus, one need not physically execute an action to alter one’s perception of that same action in others. Instead, the ability to represent an action appears to be sufficient.
Visual Expertise

According to Johansson (1973), observers form vivid percepts of human movement from point-light displays because they have extensive prior experience watching, or perceptually “overlearning,” human movements. While Johansson’s theory proposed that the same grouping principles apply to both human and object motion, he nonetheless argued that the vividness with which point-light displays of human action are perceived results from observers’ greater visual experience with human motion.
A more recent study supports the visual experience hypothesis of human motion perception (Bülthoff, Bülthoff, & Sinha, 1998). In this experiment, observers viewed point-light displays of human walkers and rated the degree to which each figure looked human. Displays that retained their normal 2D projection, even when scrambled in depth, were rated as highly human. That is, despite considerable anomalies in three-dimensional structure, observers still perceived the point-light human walkers as human. Such data suggest that visual experience with the human form significantly impacts the perceptual organization of human movement. Indeed, visual experience was strong enough to override substantial depth distortions.

Eleanor Gibson argued that only behaviorally relevant experience influences perceptual sensitivity (Gibson, 1969). Consistent with this, visual experience influences action perception under behaviorally relevant experimental conditions. For example, in one study, observers viewed point-light displays of walking friends (Jacobs, Pinto, & Shiffrar, 2004). Gait type was manipulated such that point-light friends performed commonly occurring gaits and rare gaits. Observers’ ability to report the identity of each point-light walker depended upon the frequency of gait occurrence. Walker identification was significantly better with common gaits than with rare gaits. Since observers presumably have more real-world experience watching their friends walk with common gaits, such data support the hypothesis that visual sensitivity to human movement depends upon visual experience. Since observers have a lifetime of experience watching other people move, such extensive visual experience with human movement might help to differentiate it from the visual perception of object movement.
Consistent with this, imaging data indicate that neural activity in an area known to process human motion, the posterior region of the superior temporal sulcus (e.g., Bonda et al., 1996; Oram & Perrett, 1994), is modulated by visual experience (Grossman & Blake, 2001). Furthermore, computational models have shown that numerous aspects of human motion perception can be explained by visual experience alone (e.g., Giese & Poggio, 2003).
Motor Experience vs. Visual Experience

The above studies suggest that the visual analysis of human movement depends on both visual experience and motor experience.
Embodied Motion Perception
Which type of experience has the larger impact on action perception? One study examined this question by presenting observers with point-light movies of their own movements, the movements of their friends, and the movements of strangers (Loula, Prasad, Harber, & Shiffrar, 2005). Every observer has the greatest motor experience with his or her own actions. Observers have the greatest visual experience with sagittal views of the actions of frequently observed friends. Since observers have neither specific motor nor visual experience with the actions of strangers, stranger motion can serve as a baseline control stimulus. To the extent that motor experience defines the visual analysis of action, observers should be best able to recognize their own movements. If view-dependent visual experience is the primary determinant of visual sensitivity to human movement, then observers should be most sensitive to the movements of their friends. Finally, the relative impact of motor experience and visual experience on the visual analysis of human motion can be assessed by the relative magnitude of these two effects.

To test these predictions, point-light displays were created of participants, their friends, and strangers performing a variety of actions. Participants were recruited so that everyone in each triplet had the same gender and body type, to ensure that neither gender (Pollick et al., 2005) nor weight (Runeson & Frykholm, 1983) could serve as the basis for discrimination. During stimulus construction, participants were told that they were assisting in the creation of stimuli for a study of action, rather than actor, perception. As a result, participants naturally mimicked the action styles modeled by the same experimenter. Two to three months after the point-light displays were created, participants returned to the lab to perform a two-alternative forced-choice identity discrimination task.
Each trial consisted of two short movies depicting two different point-light defined actions (e.g., someone walking in movie 1 and someone jumping in movie 2). On half of the trials, the two movies depicted the same person. This person could have been the observer, the observer's friend, or the observer's matched stranger. On the other half of the trials, the two movies depicted two different people. After viewing both movies, observers reported with a button press whether the two movies depicted the same person or two different people. Observers demonstrated the greatest perceptual sensitivity to point-light displays of their own actions. Since observers have the greatest motor experience with their own movements, this result supports the hypothesis that motor
processes contribute to the visual analysis of human movement (e.g., Prinz, 1997; Shiffrar & Pinto, 2002; Viviani & Stucchi, 1992). Importantly, task performance with the friend stimuli was superior to performance with the stranger stimuli. This result supports the hypothesis that visual sensitivity to human movement depends upon visual experience (e.g., Bülthoff et al., 1998; Giese & Poggio, 2003; Johansson, 1973). Lastly, the relative sizes of the effects indicated that motor experience is a significantly larger contributor to the visual analysis of human movement, at least in the case of identity perception. The results of a subsequent series of control studies suggested that this pattern of results depends upon motion processes, stimulus orientation, and action type (Loula et al., 2005).

The ability to differentiate self-generated from other-generated actions may depend upon an observer's ability to predict the outcome of an observed action. Indeed, observers are better able to predict the outcomes of their own actions. For example, when participants viewed videos of themselves and strangers throwing darts at a target, they were better able to predict the results of their own dart throws than the dart throws of strangers (Knoblich & Flach, 2001). Taken together, these results suggest that motor processes are a major contributor to the visual analysis of human movement.

Controlling for Viewpoint Dependent Visual Experience

While the above findings paint a compelling picture of the importance of motor experience in the perceptual analysis of human action, a potentially important factor muddles this picture. Simply put, motor experience is inherently confounded with visual experience. Every time you gesture or walk down the stairs, you see your own actions.
This raises the question of whether enhanced perceptual sensitivity to one's own actions might result, fully or in part, from the massive observational experience that people have with their own actions.

The frequencies with which one produces and perceives one's own actions are naturally confounded. Viewpoint manipulations offer a means of decoupling them. Observers have a lifetime of experience perceiving their own actions from an egocentric or first-person viewpoint (Figure 4.3B). Conversely, aside from watching oneself in a mirror, observers have little experience perceiving their own actions from an allocentric or third-person viewpoint. Obviously, the reverse pattern holds for the perception of other people's actions
since observers view others, by definition, from a third-person perspective (Figure 4.3A). To the extent that viewpoint-dependent visual experience defines performance in identity perception tasks, observers should show the greatest perceptual sensitivity to first-person views of their own actions. Conversely, to the extent that observers construct representations of themselves with the same neural processes with which they represent other people, observers should show the greatest perceptual sensitivity to third-person views of their own actions and the actions of other people (Jeannerod, 2003).

To test these predictions, participants viewed point-light movies of themselves, friends, and strangers performing various actions from first-person and third-person viewpoints. Performance on the same identity discrimination task described above suggests that, at least for the purpose of identity perception, observers demonstrate significantly greater perceptual sensitivity to their own actions from the third-person view than from the first-person view. Thus, even though observers have the most visual experience with egocentric views of their own actions, self-recognition from those views is very poor (Prasad & Shiffrar, 2008). This result indicates that enhanced self-recognition cannot be attributed to visual experience.

What about Bodily Form?

The proposal that observers use their own motor system to analyze the actions of other people implicitly assumes that observers somehow overlook significant differences between their own bodies and other people's bodies. That is, the ability to map one's own motor experience onto someone else's actions necessitates a matching or alignment of executable and perceived actions. Developmental research suggests that people may come into the world primed for such egocentric body matching (Meltzoff & Moore, 2002).
Patient research suggests that the detection of a correspondence between observed motion patterns and the observer's own body representation triggers motor-based analyses of human motion (e.g., Funk et al., 2005). When no correspondence can be found between an observer's representation of his or her own body and that observer's perception of other people's bodily actions, those actions appear to be analyzed as objects; that is, without the benefit of motor processes (Funk et al., 2005). Similarly, when observers view point-light depictions of a moving actor in which the actor's limbs are repositioned so that they are inconsistent with the normal hierarchical structure
of human bodies, perceptual sensitivity to that motion drops significantly (Pinto & Shiffrar, 1999). Similar results are found with the perception and representation of static body postures (Reed & Farah, 1995; see also the chapter by Reed and colleagues in this volume).

Obviously, different people have differently shaped bodies. If a mechanism exists to find correspondences between an observer's own body schema and percepts of other people's actions, then this mechanism must be able to tolerate commonly occurring variations in people's bodies. While body motion depends upon body shape, it remains to be seen how observers perceive human actions across commonly occurring variations in body shape.

The existence of mirror neurons in macaque monkeys that respond during the monkey's production of an action and during the perception of a human performing that same action (Rizzolatti et al., 2001) suggests that bodily form differences can be dismissed. Macaques and humans differ significantly in body height, body weight, and limb proportions. Yet mirror neurons appear capable of coding action similarities across these body differences. It may be that the system that matches an observer's own body representation with observed actions relies on low spatial frequency cues to global body structure (Heptulla-Chatterjee et al., 1996). If so, this might explain why mirror neurons respond as they do and why, for example, observers can be "fooled" by appropriately positioned rubber hands (Botvinick & Cohen, 1998). This body matching process should fail, in a graded fashion, whenever the low spatial frequency content of an observed body differs substantially from the observer's internal representations of his or her own body (Cohen, 2002; Funk et al., 2005; Pinto & Shiffrar, 1999).
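One crude way to operationalize a low spatial frequency comparison of body structure is to low-pass filter two silhouettes before correlating them. The sketch below is purely illustrative (the 8 × 8 silhouettes are invented, and simple block-averaging stands in for a proper low-pass filter); it shows how coarse structure can match even when fine detail differs.

```python
# Two hypothetical 8x8 binary body silhouettes that share global structure
# (head, arms, torso, legs) but differ in fine detail such as limb thickness.
body_a = [
    [0,0,0,1,1,0,0,0],
    [0,0,1,1,1,1,0,0],
    [0,0,0,1,1,0,0,0],
    [0,1,1,1,1,1,1,0],
    [0,0,0,1,1,0,0,0],
    [0,0,0,1,1,0,0,0],
    [0,0,1,0,0,1,0,0],
    [0,0,1,0,0,1,0,0],
]
body_b = [
    [0,0,0,1,1,0,0,0],
    [0,0,1,1,1,1,0,0],
    [0,0,1,1,1,1,0,0],   # fine detail differs here
    [0,1,1,1,1,1,1,0],
    [0,0,1,1,1,1,0,0],   # ...and here
    [0,0,0,1,1,0,0,0],
    [0,0,1,0,0,1,0,0],
    [0,1,0,0,0,0,1,0],   # ...and here
]

def low_pass(image, block=2):
    """Block-average: a crude low-pass filter discarding high spatial frequencies."""
    n = len(image)
    return [[sum(image[i + di][j + dj] for di in range(block) for dj in range(block))
             / block ** 2
             for j in range(0, n, block)]
            for i in range(0, n, block)]

def correlation(a, b):
    """Pearson correlation between two equal-sized images."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

raw = correlation(body_a, body_b)
coarse = correlation(low_pass(body_a), low_pass(body_b))
# Blurring raises the match: global structure agrees even where detail differs.
assert coarse > raw
```

On these toy silhouettes the correlation rises after blurring, mirroring the idea that a matching process keyed to low spatial frequencies would tolerate commonly occurring differences in bodily detail while still failing, gradually, as global structure diverges.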
If the matching process outlined above actually exists, then observers should be able to overlook bodily differences during the perceptual analysis of human actions. Previous research findings suggest that observers can recognize their own actions in the absence of bodily form cues (e.g., Knoblich & Prinz, 2001) because velocity changes alone may be sufficient for identity perception (e.g., Knoblich & Flach, 2001). If action recognition depends upon an observer's ability to plan the actions that he or she observes, then observers should be able to identify their own actions even when those actions are presented on someone else's body. Sapna Prasad and her colleagues tested this hypothesis with the identity discrimination task described above, but modified such that different bodies were superimposed on the actions of
the observers, their friends, and matched strangers. On each trial of this task, observers viewed a short movie of their own actions, the actions of their friends, or the actions of their assigned strangers. These actions appeared on skeletal bodies (containing no form cues to gender or a specific identity), humanoid bodies (containing form cues to gender but not identity), or character bodies (containing form cues to both gender and identity). After watching each movie, observers reported who they thought had originally produced the action depicted in that movie. Identification performance in this task was found to be independent of body form cues to gender or identity. That is, with all three body types, observers demonstrated the greatest sensitivity to their own actions. Thus, observers can overlook commonly occurring differences in body form as they map representations of their own executable actions onto their perceptions of the actions performed by other people.

Social-Emotional Processes

Do social processes contribute to the visual analysis of human movement? Since human action contains more socially relevant information than any other category of motion stimuli, contributions from social processes to the perception of human action might help to differentiate action perception from object perception.

While the question of whether social processes contribute to action perception has been largely ignored in behavioral studies of visual perception, it is increasingly studied in the rapidly emerging field of social neuroscience. For example, activity in the superior temporal sulcus (STS) is associated with the visual analysis of human movement (e.g., Grossman & Blake, 2002; Hasson, Nir, Levy, Fuhrmann, & Malach, 2004).
The STS also plays an important role in social processes (Iacoboni et al., 2004) such as the inference of other people's mental states (Frith & Frith, 1999; Morris, Pelphrey, & McCarthy, 2005). Furthermore, STS activity has been found during social judgments in the absence of bodily motion (Winston, Strange, O'Doherty, & Dolan, 2002). Thus, the STS is increasingly understood as an area involved in the perceptual analysis of social information (e.g., Allison, Puce, & McCarthy, 2000). Human observers readily extract extensive social information such as intent (Runeson & Frykholm, 1983), social dominance (Montepare & Zebrowitz-McArthur, 1988), emotional state (Clarke et al., 2005; Dittrich, Troscianko, Lea, & Morgan, 1996), gender (Kozlowski & Cutting, 1977; Pollick et al., 2005), and sexual orientation (Ambady, Hallahan, & Conner, 1999) from human motion. When considered together, such findings suggest that social-emotional processes may contribute to the visual analysis of human movement. The two series of psychophysical studies described below tested this hypothesis.

Social Context and Apparent Human Motion

To investigate whether social processes impact the visual analysis of human motion, observers viewed two-frame apparent motion sequences in which the same human actions were presented in social and nonsocial contexts (Chouchourelou & Shiffrar, 2008). This approach is based on the assumption that any differences in motion perception across context variations must be attributable to the contexts, since the action is unchanged. The interactions of two people were filmed, and two-frame apparent motion sequences were created from the resulting movies. These picture pairs were further edited so that everything but the displaced images of one actor was removed from both pictures, as shown in Figure 4.5. From this, four conditions were rendered. In the no-context condition, only the moving actor was displayed. In the human-context condition, a single stationary picture of a social human partner was added. For the single-object condition, a refrigerator appeared in the stationary actor's position. Finally, in the object-specific condition, an object closely related to each specific action was added. Thus, each participant viewed identical human displacements against one of four different contexts. This also provided a test of the hypothesis that the visual analysis of person-directed actions differs from the visual analysis of object-directed actions (Jacobs & Jeannerod, 2003).
If social processes contribute to the visual analysis of human movement, then assessments of apparent motion strength should be context dependent. Naïve observers in this experiment were told that they were participating in a study of computer monitor quality. Participants were informed of the phenomenon of apparent motion. They then viewed pairs of sequentially presented images of human movements across interstimulus intervals ranging from 10 to 600 ms and rated the strength of apparent motion on each trial. The results indicated that the same displacements of apparent human motion are experienced
Figure 4.5 The two images of the actors were taken from movies of natural social interactions. Across the four picture pairs (Human Context, No Context, Fixed Object Context, Object Specific Context), only the image of the actor on the right changes. All other images are stationary. The moving actor appears in a human context in the top row, devoid of context in the second row, in the context of a refrigerator in the third row, and in the context of an action-appropriate punching bag in the bottom row.
very differently as a function of the context. Participants rated human actions directed toward another person as providing more motion than the identical actions directed towards objects or nothing. Since physically identical displacements were perceived differently as a
function of their social context, these results support the hypothesis that social processes significantly impact human action perception (Chouchourelou & Shiffrar, 2008).

In a follow-up control study, apparent motion was assessed with two different sets of actions: person-directed and object-directed. These two sets of actions produced equivalent ratings of apparent motion when shown in isolation. However, when a stationary context was added to each, such that the depicted actions were directed toward people or objects, apparent motion ratings diverged. Person-directed actions received significantly stronger ratings of apparent motion than object-directed actions. These results further support the hypothesis that social processes, per se, facilitate the visual perception of human action.

Perceptual Sensitivity to Emotional Actions

Extensive neurophysiological data point to substantial interconnections between the neural areas involved in the visual analysis of point-light displays of human movement (e.g., STS) and the limbic areas (e.g., amygdala) underlying the analysis of emotion (Brothers, 1997; Puce & Perrett, 2003). These interconnections could serve at least two information processing circuits. First, visual analyses of human action in the STS could be passed on, after they are completed, to the amygdala for subsequent emotional analysis. According to this model, action detection should be independent of emotional processes, since visual processes are completed before emotional processes are initiated. A second possibility is that action analyses in the STS are conducted in interactive collaboration with emotional processes in the amygdala. From this perspective, action detection should be emotion dependent.
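These competing predictions can be stated in signal detection terms: compute detection sensitivity (d′) separately for each portrayed emotion and ask whether the values differ. The sketch below uses the standard d′ formula with invented hit and false alarm counts (not the study's data) purely to show the computation.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false alarm rate), with a log-linear
    correction so rates of exactly 0 or 1 stay finite."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Invented counts for a walker-present / walker-absent detection task:
# (hits, misses, false alarms, correct rejections) per portrayed emotion.
counts = {
    "angry":   (46, 4, 6, 44),
    "happy":   (38, 12, 6, 44),
    "neutral": (36, 14, 6, 44),
}
sensitivity = {emotion: d_prime(*c) for emotion, c in counts.items()}

# With these illustrative counts, the angry walker yields the highest d'.
assert sensitivity["angry"] > sensitivity["happy"] > sensitivity["neutral"]
```

Under the serial model the d′ values should be statistically indistinguishable across emotions; under the interactive model, threatening (angry) walkers should yield the highest d′.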
Given the role of the amygdala in threat detection (e.g., Anderson, Christoff, Panitz, De Rosa, & Gabrieli, 2003), any interdependence of action detection and emotion should be most evident during the perception of threatening actions.

A series of psychophysical studies compared these two hypotheses through an examination of the visual detection of emotional actions (Chouchourelou, Matsuka, Harber, & Shiffrar, 2006). Point-light movies of walking actors portraying different emotions were constructed so that each point-light walker's emotional state was equally recognizable. These stimuli were placed in specially constructed
masks for a walker detection task. On each trial, a point-light walker either was or was not presented within a mask of identically moving points. Participants simply reported whether or not they saw a walker. Emotion was never discussed or judged. Nonetheless, the results of this study indicated that walker detection was significantly modulated by walker emotion, as participants demonstrated the greatest visual sensitivity to the presence of angry walkers. Thus, emotional body expressions can affect the perceptual detection of human action. Such a dependence of action detection on emotion may reflect the existence of an integrated processing circuit between the STS and amygdala. Enhanced detection of threatening actions may represent an important condition under which emotional processes impact perceptual analyses. In sum, emotional processes can define when and how we perceive the actions of other people.

Conclusions

Taken together, the experimental results described above indicate that the visual perception of human movement is a complex phenomenon that depends upon multiple factors, including motor planning, visual experience, and emotional processes. Such a conclusion directly challenges modular views of the visual system, which assume that vision is unaffected by nonvisual processes. Instead, the current results suggest that what we see depends upon what we have seen in the past, how we move, and how people behave socially. These three processes are likely interdependent. For example, visual experience and motor experience naturally covary, as observers most frequently see the same actions that they most commonly perform. Furthermore, the ability to map motor information from our own bodies onto the perceived world likely enables us to become socially attuned beings (see the chapter by Knoblich in this volume).
Indeed, we may come into this world ready and able to search for similarities between our actions and those of other people (Meltzoff & Moore, 2002). If so, then the current results can be understood as suggesting that the human visual system is optimized for the organization and analysis of information that matches the observer's own body. The ultimate result of such a perceptual-motor system is a body-based view of the world (Shiffrar, 2006).
Acknowledgment

This research was supported by NEI grant EY12300.

References

Adelson, E. H., & Movshon, J. A. (1982). Phenomenal coherence of moving visual patterns. Nature, 300, 523–525.
Ahlström, V., Blake, R., & Ahlström, U. (1997). Perception of biological motion. Perception, 26, 1539–1548.
Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: Role of the STS region. Trends in Cognitive Science, 4, 267–278.
Ambady, N., Hallahan, M., & Conner, B. (1999). Accuracy of judgments of sexual orientation from thin slices of behavior. Journal of Personality & Social Psychology, 77, 538–547.
Anderson, A. K., Christoff, K., Panitz, D., De Rosa, E., & Gabrieli, J. D. (2003). Neural correlates of the automatic processing of threat facial signals. Journal of Neuroscience, 23, 5627–5633.
Ash, M. G. (1995). Gestalt psychology in German culture, 1890–1967: Holism and the quest for objectivity. New York: Cambridge University Press.
Bertenthal, B. I., & Pinto, J. (1994). Global processing of biological motions. Psychological Science, 5, 221–225.
Blake, R., & Shiffrar, M. (2007). Perception of human motion. Annual Review of Psychology, 58, 47–74.
Bonda, E., Petrides, M., Ostry, D., & Evans, A. (1996). Specific involvement of human parietal systems and the amygdala in the perception of biological motion. Journal of Neuroscience, 16, 3737–3744.
Botvinick, M., & Cohen, J. (1998). Rubber hands "feel" touch that eyes see. Nature, 391, 756.
Brothers, L. (1997). Friday's footprint: How society shapes the human mind. London: Oxford University Press.
Bülthoff, I., Bülthoff, H., & Sinha, P. (1998). Top-down influences on stereoscopic depth-perception. Nature Neuroscience, 1, 254–257.
Burt, P., & Sperling, G. (1981). Time, distance, and feature trade-offs in visual apparent motion. Psychological Review, 88, 171–195.
Calvo-Merino, B., Glaser, D. E., Grèzes, J., Passingham, R. E., & Haggard, P. (2005). Action observation and acquired motor skills: An fMRI study with expert dancers. Cerebral Cortex, 15, 1243–1249.
Casile, A., & Giese, M. A. (2006). Non-visual motor learning influences the recognition of biological motion. Current Biology, 16, 69–74.
Chouchourelou, A., Matsuka, T., Harber, K., & Shiffrar, M. (2006). The visual analysis of emotional actions. Social Neuroscience, 1, 63–74.
Chouchourelou, A., & Shiffrar, M. (2008). The social visual system. Manuscript under review.
Clarke, T. J., Bradshaw, M. F., Field, D. T., Hampson, S. E., & Rose, D. (2005). The perception of emotion from body movement in point-light displays of interpersonal dialogue. Perception, 34, 1171–1180.
Cohen, L. R. (2002). The role of experience in the perception of biological motion. Unpublished dissertation, Temple University, Philadelphia.
Cutting, J. E., & Kozlowski, L. T. (1977). Recognition of friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356.
Cutting, J. E., Moore, C., & Morrison, R. (1988). Masking the motions of human gait. Perception & Psychophysics, 44, 339–347.
Decety, J., & Grezes, J. (1999). Functional anatomy of execution, mental simulation, observation, and verb generation of actions: A meta-analysis. Human Brain Mapping, 12, 1–19.
Dittrich, W. H., Troscianko, T., Lea, S. E. G., & Morgan, D. (1996). Perception of emotion from dynamic point-light displays represented in dance. Perception, 25, 727–738.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Frith, C. D., & Frith, U. (1999). Interacting minds—A biological basis. Science, 286, 1692–1695.
Funk, M., Shiffrar, M., & Brugger, P. (2005). Hand movement observation by individuals born without hands: Phantom limb experience constrains visual limb perception. Experimental Brain Research, 164, 341–346.
Gibson, E. (1969). Principles of perceptual learning and development. New York: Meredith.
Giese, M. A., & Poggio, T. (2003). Neural mechanisms for the recognition of biological movements. Nature Reviews Neuroscience, 4, 179–192.
Grossman, E. D., & Blake, R. (2002). Brain areas active during visual perception of biological motion. Neuron, 35, 1157–1165.
Hamilton, A., Wolpert, D., & Frith, U. (2004). Your own action influences how you perceive another person's action. Current Biology, 14, 493–498.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004). Intersubject synchronization of cortical activity during natural vision. Science, 303, 1634–1640.
Heptulla-Chatterjee, S., Freyd, J., & Shiffrar, M. (1996). Configural processing in the perception of apparent biological motion. Journal of Experimental Psychology: Human Perception & Performance, 22, 916–929.
Hildreth, E. (1984). The measurement of visual motion. Cambridge, MA: MIT Press.
Hiris, E., Krebeck, A., Edmonds, J., & Stout, A. (2005). What learning to see arbitrary motion tells us about biological motion perception. Journal of Experimental Psychology: Human Perception & Performance, 31, 1096–1106.
Hubel, D., & Wiesel, T. (1968). Receptive fields and functional architecture of the monkey striate cortex. Journal of Physiology, 195, 215–243.
Iacoboni, M., Lieberman, M., Knowlton, B., Molnar-Szakacs, I., Moritz, M., Throop, C. J., et al. (2004). Watching social interactions produces dorsomedial prefrontal and medial parietal BOLD fMRI signal increases compared to a resting baseline. NeuroImage, 21, 1167–1173.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
Jacobs, P., & Jeannerod, M. (2003). Ways of seeing. New York: Oxford University Press.
Jacobs, A., Pinto, J., & Shiffrar, M. (2004). Experience, context, and the visual perception of human movement. Journal of Experimental Psychology: Human Perception & Performance, 30, 822–835.
Jacobs, A., & Shiffrar, M. (2005). Walking perception by walking observers. Journal of Experimental Psychology: Human Perception & Performance, 31, 157–169.
Jeannerod, M. (2003). The mechanism of self-recognition in humans. Behavioural Brain Research, 142, 1–15.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception & Psychophysics, 14, 195–204.
Johansson, G. (1976). Spatio-temporal differentiation and integration in visual motion perception: An experimental and theoretical analysis of calculus-like functions in visual data processing. Psychological Research, 38, 379–393.
Knoblich, G., & Flach, R. (2001). Predicting the effects of actions: Interactions of perception and action. Psychological Science, 12, 467–472.
Knoblich, G., & Prinz, W. (2001). Recognition of self-generated actions from kinematic displays of drawing. Journal of Experimental Psychology: Human Perception & Performance, 27, 456–465.
Korte, A. (1915). Kinematoskopische Untersuchungen. Zeitschrift für Psychologie, 72, 194–296.
Kozlowski, L. T., & Cutting, J. E. (1977). Recognizing the sex of a walker from a dynamic point-light display. Perception & Psychophysics, 21, 575–580.
Loula, F., Prasad, S., Harber, K., & Shiffrar, M. (2005). Recognizing people from their movement. Journal of Experimental Psychology: Human Perception & Performance, 31, 210–220.
Marey, E. J. (1895/1972). Movement. New York: Arno Press/New York Times.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W. H. Freeman.
Meltzoff, A. N., & Moore, M. K. (2002). Imitation, memory, and the representation of persons. Infant Behavior & Development, 25, 39–61.
Montepare, J. M., & Zebrowitz-McArthur, L. A. (1988). Impressions of people created by age-related qualities of their gaits. Journal of Personality and Social Psychology, 55, 547–556.
Morris, J. P., Pelphrey, K., & McCarthy, G. (2005). Regional brain activation evoked when approaching a virtual human on a virtual walk. Journal of Cognitive Neuroscience, 17, 1744–1752.
Oram, M. W., & Perrett, D. I. (1994). Responses of anterior superior temporal polysensory (STPa) neurons to "biological motion" stimuli. Journal of Cognitive Neuroscience, 6, 99–116.
Pinto, J., & Shiffrar, M. (1999). Subconfigurations of the human form in the perception of biological motion displays. Acta Psychologica, 102, 293–318.
Pinto, J., & Shiffrar, M. (2008). A comparison of the visual analysis of point-light displays of human and animal motion. Manuscript under review.
Pollick, F. E., Key, J. W., Heim, K., & Stringer, R. (2005). Gender recognition from point-light walkers. Journal of Experimental Psychology: Human Perception & Performance, 31, 1247–1265.
Prasad, S., & Shiffrar, M. (2008). Viewpoint and the recognition of people from their movements. Journal of Experimental Psychology: Human Perception & Performance.
Prinz, W. (1997). Perception and action planning. European Journal of Cognitive Psychology, 9, 129–154.
Puce, A., & Perrett, D. (2003). Electrophysiology and brain imaging of biological motion. Philosophical Transactions of the Royal Society B: Biological Sciences, 358, 435–445.
Pylyshyn, Z. (1999). Is vision continuous with cognition? The case of impenetrability of visual perception. Behavioral and Brain Sciences, 22, 341–423.
Reed, C. L., & Farah, M. J. (1995). The psychological reality of the body schema: A test with normal participants. Journal of Experimental Psychology: Human Perception & Performance, 21, 334–343.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience, 2, 661–670.
Runeson, S., & Frykholm, G. (1981). Visual perception of lifted weight. Journal of Experimental Psychology: Human Perception & Performance, 7, 733–740.
142
Embodiment, Ego-Space, and Action
Runeson, S., & Frykholm, G. (1983). Kinematic specification of dynamics as an informational bias for person-and-action perception: Expectation, gender recognition, and deceptive intent. Journal of Experimental Psychology: General, 112, 585–615. Shepard, R . N. (1984). E cological c onstraints on internal representation: Resonant kinematics of perceiving, imagining, thinking, and dreaming. Psychological Review, 91, 417–447. Shiff rar, M . ( 2001). M ovement a nd e vent p erception. I n B . G oldstein (Ed.), The Bl ackwell Handbook of P erception (pp. 2 37–272). O xford: Blackwell. Shiff rar, M . ( 2006). B ody-based v iews o f t he w orld. I n G . K noblich, M . Grosjean, I. Thornton, & M. Shiffrar (Eds.), Perception of the human body f rom th e in side out (pp. 1 35–146). O xford: O xford U niversity Press. Shiffrar, M ., & F reyd, J. J. ( 1990). A pparent m otion o f t he h uman b ody. Psychological Science, 1, 257–264. Shiff rar, M., & F reyd, J. (1993). Timing a nd apparent motion path choice with human body photographs. Psychological Science, 4, 379–384. Shiff rar, M., & Lorenceau, J. (1996). Increased motion linking across edges with decreased luminance contrast, edge width and duration. Vision Research, 36, 2061–2067. Shiff rar, M., Lichtey, L ., & H eptulla-Chatterjee, S. (1997). The perception of biological motion across apertures. Perception & Psychophysics, 59, 51–59. Shiff rar, M., & Pinto, J. (2002). The visual analysis of bodily motion. In W. Prinz & B . Hommel (Eds.), Common mechanisms in pe rception and action: A ttention an d pe rformance (Vol. 1 9, pp . 381–399). O xford: Oxford University Press. Stevens, J. A ., F onlupt, P., Sh iff rar, M ., & De cety, J. ( 2000). N ew a spects of motion perception: selective neural encoding of apparent human movements. NeuroReport, 11, 109–115. Thornton, I . M ., P into, J., & Sh iff rar, M . (1998). The v isual perception of human locomotion. Cognitive Neuropsychology, 15, 535–552. 
Virji-Babul, N., C heung, T., Weeks, D., Ker ns, K ., & Sh iffrar, M. (2007). Neural activity involved in the perception of human and meaningful object motion, Neuroreport, 18, 1125–1128. Virji-Babul, N., Kerns, K., Zhou, E., Kapur, A., & Shiff rar, M. (2006). Perceptual-motor d eficits in children with Downs syndrome: Implications f or in tervention. Down S yndrome Re search an d P ractice, 1 0, 74–82. Viviani, P. (2002). Motor competence in the perception of dynamic events: A tutorial. In W Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance (Vol. 19, pp. 406– 442). Oxford: Oxford University Press.
Embodied Motion Perception
143
Viviani, P., & Stucchi, N. (1992). Biological movements look uniform: Evidence of motor-perceptual interactions. Journal of Experimental Psychology: Human Perception & Performance, 18, 603–623. Wallach, H . (1976). O n p erceived i dentity: I . The d irection o f m otion o f straight lines. In H. Wallach (Ed.), On perception (pp. 201–216). New York: Quadrangle/The New York Times. Wertheimer, M. (1912). Experimentelle stuidien uber das Sehen von Beuegung. Zeitschrift fuer Psychologie, 61, 161–265. Wilson, H ., F errera, V . & Y o, C . ( 1992). A ps ychophysically m otivated model for two-dimensional motion perception. Visual Neuroscience, 9, 79–97. Wilson, M . ( 2001). P erceiving i mitatable s timuli: C onsequences o f i somorphism b etween i nput a nd o utput. Psychological Bul letin, 1 27, 543–553. Winston, J. S ., St range, B. A ., O’Doherty, J., & Dol an, R . J. ( 2002). Automatic and intentional brain responses during evaluation of trustworthiness of faces. Nature Neuroscience, 5, 277–283.
5 The Embodied Actor in Multiple Frames of Reference
Roberta L. Klatzky and Bing Wu
Humans and other active organisms operate in space. They orient and locomote to desired objects; if suitably equipped, they grasp and manipulate. The purpose of the Carnegie Symposium reported here was to situate the study of perception, cognition, and action in the full context of an embodied actor, an entity that represents itself within a spatial and social world and behaves accordingly. The present chapter concerns situations in which the actor's representation of his or her physical location and orientation must be related to another formulation of space, which is imposed by task demands. This brings up the problem of aligning frames of reference. The chapter begins with a general formulation of spatial frames of reference, based on an approach originally offered by Klatzky (1998). We will suggest some general features that can be used to distinguish among frames of reference, and we will provide some examples of experiments that allow us to make these distinctions. We will then turn to the situation described by the title of this chapter, where an actor is required to relate his location or body position in the world to other, task-specified frames. That is, the body-relative and task-relative frames must be aligned. Some of these alignment problems turn out to be very difficult.
It's important to note that mobile, sensate organisms have specialized neurological mechanisms for some problems that require aligning reference frames (e.g., Merriam, Genovese, & Colby, 2003). When one's head turns, the continued perception of a stationary world arises because changes in object positions within the field of view are coordinated with expectations derived from the changes in the neck and torso. This is coordination across body-defined frames of reference. When a person moves within the world, sensing of optical flow and proprioceptive cues allows for updating of body position relative to world locations. Whereas these forms of reference-frame coordination occur automatically and with little sign of cognitive effort, why are other forms of coordination so difficult? We suggest that difficulties arise when a task imposes its own intrinsic frame of representation, which must be made congruent with, or aligned with, a reference frame intrinsic to the person performing the task. Complexities of achieving this congruence will be illustrated by a variety of tasks, including everyday situations. Some of the tasks we describe will demonstrate successful alignment of reference frames; others will be mired in conflict or compromise. Our discussion will suggest two general factors that appear, from the literature, to affect the difficulty of the alignment process and will relate them to the component processes of alignment. We call these allocentric layer and obliqueness. Allocentric layer refers to the separation of the task-defined frame from the task performer along a schematic ordering that progresses from the immediate environment, through virtual and imagined environments, to tasks in which the operative frame is specified by objects, and still further extending to more complex object arrangements.
The second factor producing difficulty in alignment, obliqueness, points to situations in which the self-defined frame and the task-defined frame have an angular separation.
What Is a Frame of Reference?

Broadly considered, a frame of reference is a means of representing the spatial position of an entity. It defines a set of parameters that, when bound to specific values, localize and orient an entity in the space. What may come to mind immediately upon hearing the term frame of reference are the commonly used polar and Cartesian coordinate systems. On a plane, the r and θ of the polar system and the x,y of the Cartesian system are parameters.
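The two coordinate systems parameterize the same locations differently, and converting between them is a simple computation. The following is a minimal sketch of the correspondence; it is our own illustration, not part of the chapter, and the function names are ours.

```python
import math

def polar_to_cartesian(r, theta):
    """Map polar parameters (r, theta in radians) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    """Map Cartesian parameters (x, y) back to polar (r, theta in radians)."""
    return (math.hypot(x, y), math.atan2(y, x))

# One location, two equivalent parameterizations:
x, y = polar_to_cartesian(2.0, math.pi / 3)
r, theta = cartesian_to_polar(x, y)
```

Either parameter pair picks out the same point; the frame of reference determines which pair of values is stored and operated on.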
There are several types of spatial parameters. One parameter set is used to define a point location in space, for example, <x,y,z> in Euclidean space. More parameters come into play when we move from a point location to an object, which is a collection of points that can be organized around principal axes. A person in the 3-D world has three intrinsically defined axes: frontal, sagittal, and gravitational. Full localization of an object in space requires parameters that specify the angular values of its axes relative to the axes of the space. In 3-D space, these are sometimes called pitch, roll, and yaw. In an airplane, pitch changes as the nose goes up and down; roll corresponds to rotation around the fuselage; and yaw is rotation around the gravitational axis. For a mobile object on a plane, the yaw parameter corresponds to the angle of turning left and right; it is sometimes referred to as heading. While people generally feel comfortable rotating around their gravitational axes to change heading, they find rotations around the sagittal axis (as in cartwheels) or the frontal axis (somersaults) more unsettling. As we shall see, oblique orientation of a person's intrinsic axes relative to a task-defined space can also produce uneasiness. When there are two point locations in space (which might be the centers of two objects), inter-location relations can be parameterized by distance and bearing. These parameters are computed from the location and orientation parameters by some rule. For example, Euclidean distance can be computed from Cartesian coordinates. The parameters that define location and orientation in the frame of reference constrain the computation and its outcome.
For example, if we consider a city-block space, where a location must be "snapped into" the vertices of blocks, distance might be computed by summing the blocks that separate two locations or by a Euclidean distance computation that uses the block separation as input. The bearing from point A to point B is generally defined as the angular separation between a vector from A to B and a vector from A pointing along a reference direction. Clearly, then, the choice of reference direction will affect the value of the bearing. Our definition of a frame of reference emphasizes the parameters that localize points and objects in space and define interobject relations. It does not specify the processes that establish the parameters or operate on them; for example, to compute distances or orientations. Euclidean distance, for example, may be computed by algebra or by mental scanning. Processing outcomes are not necessarily veridical. For example, perceptually represented space may be anisotropic;
that is, its metric properties may vary with direction (e.g., Wagner, 1985). Although these aspects of mental spatial representation are important and fascinating, they lie apart from the specification of the parameters themselves.
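As a concrete sketch of how the frame's parameterization constrains the distance computation, the two rules mentioned above can be written side by side. This is our own illustration under the stated assumptions, with names of our choosing.

```python
def euclidean_distance(a, b):
    """Straight-line distance computed from Cartesian coordinates."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def city_block_distance(a, b):
    """Distance when locations are snapped to block vertices:
    sum the blocks separating the two locations along each axis."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# The same pair of locations yields different distances under each rule:
p, q = (0, 0), (3, 4)
d_euclid = euclidean_distance(p, q)   # 5.0
d_blocks = city_block_distance(p, q)  # 7
```

The divergence between the two outputs for a fixed pair of points is exactly what the grid-manipulation strategy described later in the chapter exploits to diagnose which metric a subject is using.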
Allocentric and Egocentric Frames

To illustrate the role of parameters in defining a frame of reference, consider a common distinction that is made between two kinds of frames that might be used to locate an actor in the world—and that will be important to the current discussion. The so-called egocentric frame is centered on the actor, whereas the allocentric frame is centered on external features of the world. This distinction is essentially based on the coordinate system to which spatial parameters are referred. In the egocentric frame, the parameter values are filled with reference to a location on the actor's body, akin to a personal center of mass. The location on the body that grounds the egocentric frame of reference is called the "egocenter." The location of the egocenter in the body has been investigated and found to vary with the sensory modality and other task-related factors. The egocenter established by binocular vision, for example, has been found in some tasks to lie between the eyes (e.g., Enright, 1998). The egocenter for touch has been found to vary with posture; for example, the haptic egocenter for points within reach on a table-top plane is shifted away from midline toward the shoulder of the reaching arm (Haggard, Newman, Blundell, & Andrew, 2000). These variations notwithstanding, we will refer to any frame of reference that binds its values with respect to a location on the actor as egocentric. The second frame of reference that can be used to locate the actor in the world, the allocentric frame, is parameterized by an extrinsically defined coordinate system. Maps, for example, refer directions to magnetic North/South and orthogonal axes. When seeing objects in a room, people may be induced to define an allocentric frame by the objects' layout or by the walls of the room (Shelton & McNamara, 2001).
Figure 5.1 (based on Klatzky, 1998) illustrates these differences between the egocentric and allocentric frames of reference. The figure refers to an organism called "Ego" traveling in a flat space of
Figure 5.1 Egocentric (left) and allocentric (right) frames of reference. Adapted from Klatzky, 1998, Fig. 1, with kind permission of Springer Science and Business Media.
point objects. Ego has an intrinsic axis of orientation, which serves to define angles in a polar coordinate system, and an egocenter, which serves as the origin of a polar vector. Ego also exists in a world in which there is a reference direction called North. This direction, along with an extrinsically defined origin, serves as the axis for an allocentric frame of reference that uses polar coordinates. Under these assumptions, we can define a set of measures for each frame of reference, as follows. In the egocentric frame, any point in space is localized relative to Ego's egocenter by its distance and its angle relative to Ego's axis. These are called the egocentric distance and the egocentric bearing of the point. As Ego moves within space, these parameters must continually be updated. If there are two points in the space, say, A and B, the direction from one to the other can be referred to Ego's current axis direction, producing something called the ego-oriented bearing from A to B. It is the angle between Ego's intrinsic axis and a vector from A to B. In the allocentric frame, any point in space is localized relative to the arbitrary origin and the reference direction, producing an allocentric distance and allocentric bearing. These are the conventional polar coordinates. Ego, being a point in the space, can be localized by these same coordinates. If one wants to portray the direction from A to B, an angular difference is constructed between the vector from point A to B and the reference direction, producing an allocentric bearing. Finally, a special term, the allocentric heading, is reserved for the angle of Ego's axis relative to the reference direction. The ego-oriented bearing defined above is the difference between the allocentric bearing and the allocentric heading. The reader is referred to Klatzky's 1998 paper for further discussion of these and derived parameters.
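These parameter definitions can be made concrete in a short sketch. This is our own illustration, not from the chapter: we assume a compass convention (North is the +y axis, angles in degrees clockwise), and the function names are ours.

```python
import math

def allocentric_bearing(a, b):
    """Angle of the vector from a to b, in degrees clockwise from
    the reference direction North (+y)."""
    return math.degrees(math.atan2(b[0] - a[0], b[1] - a[1])) % 360

def egocentric_bearing(ego_pos, allocentric_heading, point):
    """Angle of a point relative to Ego's intrinsic axis."""
    return (allocentric_bearing(ego_pos, point) - allocentric_heading) % 360

def ego_oriented_bearing(allocentric_heading, a, b):
    """Bearing from A to B referred to Ego's current axis direction:
    the difference between the allocentric bearing and the allocentric heading."""
    return (allocentric_bearing(a, b) - allocentric_heading) % 360

# Ego faces East (allocentric heading 90); B lies due North of A,
# so the A-to-B direction is 90 degrees to Ego's left (270 clockwise):
theta = ego_oriented_bearing(90.0, (1.0, 0.0), (1.0, 1.0))
```

Note that the last function is literally the stated identity: ego-oriented bearing equals allocentric bearing minus allocentric heading, wrapped into the 0–360 range.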
By virtue of their parameterization, allocentric and egocentric frames of reference presumably play different behavioral roles. The egocentric frame is useful for directing Ego's actions in the world, as it tells Ego about the current locations of pertinent objects and hence informs the requisite actions needed to reach them. As Ego moves, the continuous updating of these parameters supports corrective responses, either closed-loop or feed-forward. Using the egocentric frame, Ego can make a saccade to a target or turn its body toward it, reach for it or locomote to it. The allocentric frame, on the other hand, allows Ego to represent aspects of space independently of its current position and orientation within it. Ego can compute the directional and distance relations between external objects, consider self-to-object relations from imagined viewpoints, and mentally reorient objects, as we will consider later. A strategy for determining what frame of reference is being used in a particular task emerges from the present approach. Specifically, one conducts an experiment that is informative about how space is parameterized. For example, to examine the distance parameter, one might manipulate the positions of objects on a grid of varying block sizes, so that city-block metrics are invariant but Euclidean distance changes, or vice versa, and see which metric correlates with distance responses. This approach has often proved very useful. However, it is not always successful. In particular, as Klatzky (1998) pointed out, there is an inherent ambiguity in discerning whether people use allocentric or egocentric frames of reference for a task, by using purely behavioral measures. The ambiguity arises because people can take imagined perspectives.
They can convert a task that calls for allocentric judgments to an egocentric judgment, by mentally aligning themselves with an allocentric reference frame. For example, if asked to report the angle ABC, where these are three objects in the environment, they can mentally imagine being at A, facing B, and pointing to C. Conversely, people can convert a seemingly egocentric judgment to an allocentric one by externalizing the space and representing themselves as an oriented object within it. In principle, these conversions may add to time and error, but when these behavioral markers are not present, the data will be inconclusive. Some mental transformations may occur without inducing error and without sensitivity to experimental manipulations.
Despite these limitations, positive findings in a number of tasks have supported insights into how spatial structure is parameterized. Behavioral data have been used to address questions such as: Is one frame of reference operative or two? How is the frame of reference aligned? Is there an origin for reporting distances and directions, and where is it located with respect to the actor's physical location? How does the orientation of the actor during learning affect these aspects of spatial parameterization? We now turn to the methods that have been used to address such questions.

Methods to Identify Parameters of the Operative Reference Frame

The issue of what frame of reference is used for a task has been addressed in a broad array of behavioral studies. We can divide the techniques used by these studies into three general approaches. The first asks what is spontaneously and readily reported when people are asked about spatial parameters. The second asks what people find easiest to report, in terms of low error and fast responses. The third asks about the specific pattern of errors that are made when people report the values of spatial parameters. We will describe examples of each.

What Is Reported?

A study by Klatzky, Loomis, and associates (1998) is illustrated in Figure 5.2. A subject stood at a fixed location and was asked to imagine himself walking along two legs of a triangle. That is, he walked a leg A, made a turn α (in Figure 5.2, 90°), and then walked a leg B. The instructions clearly stated that at the end of the imagined locomotion, the subject was to make the turn that he would have to make in order to face home, had the walk actually been performed (β). It is important to note that the instructions stressed rapid responding, in order to assess the most potent response.
Almost uniformly, subjects made a very striking error: They rotated so that they adopted the bearing toward the origin that would have been adopted by the imagined walker. But, in so doing, they produced an overrotation by an amount equal to the value of
Figure 5.2 Walking task used by Klatzky et al., 1998. Adapted by permission.
the turn between the outbound legs (α). To understand this, consider the simple case where α=90°, and the two legs A and B are equal in length. A walker along this path should rotate β=135° to face the origin. In order to assume the same resultant direction, a person standing at the origin and continuing to face along leg A must turn 135° + 90°. And that is what our subjects did, instead of rotating 135° as the walker would. Another way to state this is to say that subjects adopted the correct bearing from the end of the path to the origin, but in so doing, made the wrong turn. Since the original Klatzky et al. study, we and our friends have performed this test hundreds of times on individuals and groups. It appears that most people (including the late Herb Simon, no stranger to the Carnegie Symposium) make the error. The effect is as great or even stronger when subjects watch the experimenter walk and are asked to take her perspective. Often, when confronted with the error, they are incredulous: "How can you say I should have turned toward the Southeast, when the answer is clearly the Southwest?" This very incredulity illustrates how firmly they represent their orientation as aligned with Leg A, not Leg B. When the matter is explained, subjects may say, "But that's semantics! You didn't explain the task sufficiently!" Our response is, "The semantics lies in your internal representation, not the instructions."
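The geometry of the worked case can be checked with a small sketch. This is our own illustration, not the authors' analysis: we take the start heading (along leg A) as North, count turns in degrees clockwise, and the function names are ours.

```python
import math

def triangle_completion_turn(leg_a, alpha_deg, leg_b):
    """Correct turn (degrees clockwise) for a walker who goes leg_a units
    along the initial heading, turns alpha_deg clockwise, goes leg_b units,
    and must then turn to face the origin."""
    a = math.radians(alpha_deg)
    end_x = leg_b * math.sin(a)
    end_y = leg_a + leg_b * math.cos(a)
    # Compass bearing of the origin, clockwise from the initial heading:
    bearing_home = math.degrees(math.atan2(-end_x, -end_y)) % 360
    # The walker's current heading is alpha_deg, so turn the difference:
    return (bearing_home - alpha_deg) % 360

def ignore_alpha_turn(leg_a, alpha_deg, leg_b):
    """The observed error: the correct bearing home is adopted, but the
    turn is made as if still facing along leg A, over-rotating by alpha."""
    return (triangle_completion_turn(leg_a, alpha_deg, leg_b) + alpha_deg) % 360

beta = triangle_completion_turn(1.0, 90.0, 1.0)     # correct turn: 135
error = ignore_alpha_turn(1.0, 90.0, 1.0)           # observed turn: 135 + 90
one_leg = triangle_completion_turn(1.0, 90.0, 0.0)  # one-leg case: 180 - alpha
```

Setting leg B to zero also reproduces the one-leg logic discussed below: the correct response is 180° minus α, so a responder who ignored α would always have to turn a constant 180°.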
To show that the instructions are not the problem, a later version of this task, conducted by Marios Avraamides, used a single-leg pathway (Avraamides, Klatzky, Loomis, & Golledge, 2004), as shown in Figure 5.3. The subjects watched the experimenter walk forward, stop, and make a turn—conditions that had induced the error in the two-leg pathway. The instructions were identical to those used previously—make the turn the experimenter would have to make, in order to face the origin. The one-leg situation, however, makes it clearly absurd to commit the previously observed error. To understand why, first consider that in the two-leg case, the correct response turn, β, is determined by both α (the turn between A and B) and by the ratio between legs A and B. From trial to trial, different responses can be engendered by varying the A:B ratio, even if people ignore α. Because of this potential for response variability even when the turn α is ignored, subjects don't notice that ignoring it produces an error. But in the single-leg case, if α is ignored, turning toward the origin would invariably require a response of turning 180°. For example, if the experimenter goes forward, turns 90°, and stops, and the subject ignores the turn, he would have to behave as if the experimenter had not turned at all. In order for the experimenter to face the origin, having not turned at all, she would have to turn 180°. Subjects in this one-leg task never consistently turned 180°; indeed, they never made the ignore-α error. They clearly observed the experimenter's turn, subtracted it from 180, and turned the residual amount—thus making the correct response. In other words, when they were blocked from the error by the clarity with which it was revealed, they showed that they understood the instructions
Figure 5.3 Contrast between the one-leg and two-leg task used by Avraamides et al., 2004.
perfectly. But, as we have noted, these were the same instructions as those used in the two-leg paths, where the error of ignoring the turn between legs was made. What does this experiment tell us about the frame of reference used by the subjects? It tells us, first, what parameter value subjects reported: the bearing from the end of the pathway to the origin, relative to their original facing direction (along leg A). That is what Figure 5.1 refers to as the ego-oriented bearing, where Ego's orientation is along Leg A. Ego's representation of its intrinsic axis apparently remains aligned with Ego's physical facing direction along leg A, whereas under the instructions to imagine actually walking the pathway, it should switch alignment to Leg B. Is the subject's frame of reference allocentric or egocentric? That we cannot tell from the experiment; we can only say that the operative reference direction is Ego's initial physical orientation in space along Leg A, not the imagined direction along Leg B.
What Parameter Values Are Easiest to Report?

A second approach to identifying reference-frame parameters is to use the relative accuracy or response time exhibited when subjects are asked to report values of different parameters. This question may seem the same as the previous one, but it is not. In the first approach, we assess subjects' initial, fluent report of a parameter value and determine what that parameter is. In this second approach, we ask subjects to report a particular parameter value and see how well they do. For an elegant example of this approach, we turn to a study by Shelton and McNamara (2001), using what we will call "triad" judgments: Imagine being at a location A in space, facing B, and then point to C. Triad judgments are generally taken to interrogate an allocentric frame of reference, because the A to B axis, which must be used as a reference, is arbitrarily defined relative to Ego's intrinsic axis. (Note, however, that even this task can be construed as egocentric, if we assume that Ego mentally adopts an orientation along the A-B axis and then computes C's direction relative to its egocenter.) Shelton and McNamara had subjects learn a layout of objects in a rectangular frame within a rectangular room, as schematized in Figure 5.4. They learned the objects either from a perspective aligned
Figure 5.4 Triad judgments of Shelton and McNamara, 2001. Adapted with permission from Elsevier.
with one of the room's principal axes, or from a corner at an angle oblique to the walls (135°). Subjects were then moved to a new room to make the triad judgments. In this task, responses were much faster when the reference axis in the test question was aligned with the walls of the room (imagined heading = 0°). Importantly, this was the outcome not only for subjects who learned in a room-aligned orientation, but also for those who learned in an orientation oblique to the room. It seems that regardless of the subjects' orientation at learning, the operative reference axis for the task was aligned with the salient environmental cues, such as the walls of a room or a square on the floor. This study exemplifies how the ease of report indicates what spatial parameters can directly be read out from the underlying representation.

Are Errors Spatially Biased by a Frame?

We next turn to the spatial pattern of errors as indicative of the parameterization of the underlying representation. An example is the tendency of people to report geographical boundaries as more aligned with cardinal directions than they usually are (Tversky, 1981). Even Californians recall the Western border of the state more
vertically than they should, suggesting it is represented within a geographical frame with a major axis running North/South. In a study that is more relevant to our theme of an actor placing himself in physical surroundings, Flanders and Soechting (1995) asked people to hold a cylinder at some angle relative to the environment, while the orientation of their arm varied from trial to trial. They found that subjects could readily do this when the required angle was environmental horizontal or vertical. However, when asked to orient the cylinder at a 45° angle between the horizontal and vertical directions, they tended to take the orientation of the forearm into account, producing responses that were a compromise between the arm axis and gravitational axis, being pulled about 25% toward the arm. Conversely, if asked to orient the cylinder along the arm axis, while it was held oblique to the room, subjects were pulled about 20% toward the gravitational axes. Notably, subjects did not increase in variability in the 45° task, suggesting they had ready access to the compromise response. One may be tempted to conclude from these results that subjects adopted a new frame of reference, in which the reference axis for orientation was a compromise between environmental and body-based frames—the relative weighting of the two axes depending on the orientation specified for the response. Flanders and Soechting suggested, alternatively, that the two frames of reference continually coexist and that behaviors such as reaching and grasping are codetermined by them at the time of execution.
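One way to describe (not to model) such a compromise is as a weighted blend of the task-specified angle and the arm's axis. The following is purely our descriptive sketch, not the authors' formulation: the roughly 25% pull is entered as an assumed weight, and the linear blend is only adequate for modest angular separations.

```python
def compromise_orientation(task_angle_deg, arm_angle_deg, pull=0.25):
    """Produced orientation as a linear blend of the task-specified angle
    (environment frame) and the arm's axis. `pull` is the assumed fraction
    of displacement toward the arm (about 0.25 in the 45-degree condition);
    a linear blend is reasonable only when the two angles are close."""
    return (1.0 - pull) * task_angle_deg + pull * arm_angle_deg

# A 45-degree target with the arm held at 90 degrees is pulled toward the arm:
response = compromise_orientation(45.0, 90.0)  # 56.25
```

On this description, setting `pull` to zero recovers purely environment-referenced responding, which is what subjects showed for horizontal and vertical targets.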
Implications of Studies on Reference-Frame Parameterization

The three examples that we have described are quite different in implication: The first situation, imagined locomotion, indicated that people who watched or imagined another person walking reported a parameter (orientation relative to the origin of travel) erroneously, according to task demands. The second situation, allocentric judgments, revealed that subjects chose a reference frame defined by locations outside of their own body at the time of learning and response. The third situation, orienting an object oblique to the room, indicated that subjects failed to coordinate the axis of their arm with the axis of the room.
What did these situations have in common? In each case, the task required accessing a frame of reference discrepant from that of the actor's own body. The discrepancy was imposed by imagined transformation in case 1; by using object-defined coordinates in case 2; and by pitting the orientation of the body part (arm) against environmentally cued coordinates in case 3. All three cases exemplify the general situation posed in this chapter, where parameters defined by actor-relative and task-relative frames must somehow be brought into congruence, or aligned. We now consider the process of alignment more generally.

Aligning Reference Frames

The process of alignment allows parameter values within one frame of reference to be mapped into values within another. Alignment comprises two processes: (1) parameter remapping and (2) coordinate transformation. These processes operate as follows.

1. For the parameter values of one frame of reference to be aligned with the parameter values in another, the frames must share the same definition of space. That is, they must be analogously parameterized, which may require remapping. A space defined egocentrically, in terms of joint angles, would use fundamentally different parameters than one defined allocentrically by the principal axes of a room, although both parameter sets could be used to designate the endpoint of a reaching action. Alignment may initially require, then, that the parameters of one reference frame be brought into correspondence with the values of another. In a behavioral task, alignment is followed by response computation, which may further require remapping of parameters, as explained further below.

2. When two frames have the same parametric definition of space, and hence have analogous parameters, those parameters may differ in value according to transformations of position, orientation, and scale.
Given corresponding parameters, a process of coordinate transformation may need to be performed, in order to map the values from one framework into the other. This computation, as performed by the brain rather than by pure mathematics, presumably involves mental rotation, translation, and rescaling.
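The mathematical counterpart of that transformation step can be sketched as a 2-D similarity transform combining the three operations. This is our own illustration of the mathematics, not a model of the mental process; rotation here is counterclockwise in degrees, and the names are ours.

```python
import math

def transform_point(p, rotation_deg, translation=(0.0, 0.0), scale=1.0):
    """Map a point's coordinates from one planar frame into another that
    differs by rotation, translation, and rescaling."""
    t = math.radians(rotation_deg)
    # Rotate and rescale about the origin, then shift into the new frame:
    x = scale * (p[0] * math.cos(t) - p[1] * math.sin(t)) + translation[0]
    y = scale * (p[0] * math.sin(t) + p[1] * math.cos(t)) + translation[1]
    return (x, y)

# Rotation alone, then a translated-and-rescaled mapping:
rotated = transform_point((1.0, 0.0), 90.0)
mapped = transform_point((1.0, 2.0), 0.0, (3.0, 4.0), 2.0)
```

The same point thus receives different parameter values in the two frames even though the frames agree on how space is parameterized, which is exactly the situation the coordinate-transformation process resolves.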
The point of aligning reference frames by parameter mapping and transforming coordinates is to allow a response to be computed. Response computation takes place in a reference frame preferred for that purpose. But response computation is not necessarily the same as responding. For example, a computed response may be an internal analog representation of an angle, such as the angle that an imagined walker must turn to face the origin. If this is to be verbally reported, the read-out of the angle is converted to a verbal description (this is not parameter remapping, but rather, verbal encoding). But if an action response is required for a task, such as turning the body the amount of the response angle, the required output must be defined with respect to the frame of reference that governs action. If the frame of reference used for the computation is not the action frame, the output must be remapped again into a specification within that frame. We now turn to how the process of alignment is required as an embodied actor encounters tasks in which his or her intrinsic frame of reference conflicts with a reference frame defined by the task itself. Three exemplar situations will be described in detail and used to motivate a conception of two general principles governing the difficulty of alignment.

The Embodied Actor in Multiple Frames of Reference

The embodied actor confronts multiple frames of reference essentially continuously while perceiving and acting. As was noted earlier, people automatically and effortlessly perform behaviors like maintaining a stationary world while moving the head, updating while physically navigating, or monitoring the movements of an object in space. While these behaviors occur quickly, automatically, with little error, and with little evidence of cognitive load, other forms of coordination lead people to perform slowly, with error, and even with something that seems like mental pain.
We offer three quite different examples, from the domains of locomotion, spatial thinking, and acting in near space under combined perceptual and cognitive control.

Changes in Egocentric Parameters Under Imagined Locomotion

The embodied actor, call it Ego, moves within its spatial environment. As it does so, its relation to objects and others in the environment shifts, requiring updating of the egocentric parameters. (For ease of exposition, we will consider Ego to be representing its physical position in an egocentric frame of reference, although it is possible to formulate these points in allocentric terms.) When Ego physically moves, only a single frame of reference is operative, and this moves along with Ego. That is, Ego's intrinsic axis may be moving in physical space, but it is seamlessly used as the basis for the current computation of egocentric parameters. Updating of this sort is potentially based on two types of cues: Current positional cues can come from the objects in the environment; for example, from perspective or stereo in vision or reverberant sound in audition. Ongoing movement cues can come from Ego itself, through proprioception and motor commands, and from the environmental flow, in the form of optical, acoustic, or tactile or olfactory gradients. A body of research on perceptually directed action demonstrates that people can perform self-to-object updating even in the absence of continuous visual feedback as they physically move in space (Klatzky, Lippa, Loomis, & Golledge, 2003; Loomis, Da Silva, Fujita, & Fukusima, 1992; Rieser, 1989; Thomson, 1983). The foregoing refers to Ego's updating relative to the world when it physically moves. In contrast, when Ego mentally shifts, as in taking an imagined perspective, all the available physical cues indicate that Ego is not moving and that its relationship to the environment remains unchanged. Hence there are two frames of reference invoked: one in which Ego's axis is defined by its physical orientation, and one defined by its imagined orientation. Ego's change in viewpoint under imagined locomotion must take place within the second frame of reference, while suppressing the first.
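The self-to-object updating that accompanies physical movement can be sketched computationally. In this illustrative 2-D reduction (our own, with hypothetical parameter names), an object's egocentric coordinates are revised as Ego advances along its heading and then turns:

```python
import math

def update_egocentric(obj, advance, turn):
    """Sketch of self-to-object updating during physical movement.

    `obj` is an object's position in Ego's current egocentric frame
    (x = rightward, y = forward). Ego advances `advance` units along
    its heading and then turns left by `turn` radians; the function
    returns the object's coordinates in Ego's new egocentric frame.
    """
    x, y = obj
    # Ego moves forward, so the object slides backward in the frame.
    y -= advance
    # Ego turns left by `turn`, so the world rotates by -turn
    # around Ego in egocentric coordinates.
    c, s = math.cos(-turn), math.sin(-turn)
    return (c * x - s * y, s * x + c * y)
```

For instance, an object two units straight ahead, after Ego advances one unit and turns 90 degrees left, ends up one unit to Ego's right.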
Conceivably, Ego could entirely forget its physical frame of reference and essentially hallucinate the imagined shift as real, but it seems more likely that the physical parameters are maintained while the imagined ones are updated. To describe imagined changes in position or orientation, it is insufficient to say that the egocentric parameters are updated. Two kinds of representation are conceivable, roughly corresponding to Ego's mentally looking out its eyes or looking at itself from a bird's-eye view. (1) Ego could maintain a frame of reference aligned with its intrinsic axis as it mentally moves, just as it shifts with Ego's physical movements. In this case, Ego must maintain a record of where it is relative to its physical location. It could, for example, represent its physical self as just another point in the imaginally defined
egocentric space. (2) The egocentric frame could remain aligned with Ego's physical position. In this case, with each movement, Ego must use some kind of moment-to-moment computation to determine what the relationships of the objects in the world are, relative to its imagination-determined position and orientation. Ego could, for example, imagine itself as an oriented point changing position in the reference frame defined by its static physical location. In other words, does Ego's reference frame follow its imagined movement or its physical position? To address this question, let's go back to the task described in Figure 5.2, where the subject imagines walking along two legs of a triangle, then is asked to make the turn toward the origin. As we have seen, subjects tend to adopt the required response bearing rather than making the required response turn; but in doing so, they overturn by the angle between the two outbound legs, acting as if they never perceived the turn between them. There are two interpretations of this error, corresponding to the two versions of imagined updating presented above. They are shown in Figure 5.5. (1) One possibility is that the subject retained a frame of reference grounded in her static physical location and orientation, and she represented the movement of the walker as if an external object was shifting and changing orientation in space. At the point of the response, the two representations would coexist, as proposed by Flanders and Soechting. If the physically defined orientation dominated the response, subjects would adopt a response bearing toward the origin as if facing forward along leg A—exactly as they tend to do. (2) The alternative is that the subject mentally moved along the path and brought her egocentric reference frame with her. If so, she
Figure 5.5 Two interpretations of the error in imagined updating (cf. Figure 5.2). Left: maintaining a physical and imagined frame; right: failure to update turn.
Figure 5.6 Verbal response vs. turning response data of Avraamides et al., 2004, supporting the two-frame hypothesis. Adapted by permission.
must have failed to update during the turn, perhaps because it was not accompanied by vestibular signals. That was, in fact, the explanation offered originally by Klatzky et al. (1998). Which interpretation is correct? A study with Marios Avraamides (Avraamides et al., 2004) clearly supported the first one, dominance of the physical frame. In a critical condition of that study, the original experiment was replicated but with a verbal response: The subject was to report an angle in degrees corresponding to the turn that the imagined walker must make, in order to face the origin. The results, shown in Figure 5.6, were unambiguous. Subjects responding with body turns replicated the original findings. Subjects responding verbally made no error. We reasoned that the physical turn response—but not the verbal one—was bound to the physically defined frame of reference. The verbal response then revealed the existence of an imaginal frame, in which the walker was reoriented by the turn between the legs of the path.

Spatial Thinking Through Action

Imagined perspective-taking, we have seen, produces reference-frame conflict, which profoundly affects performance. This arises not only when we consider moving our whole body through space, but when we imagine limb movements. Imagined limb movements are invoked in a pedagogical context, when physics students learn the right-hand rule. This is a ubiquitously taught mnemonic to help students determine the direction of a cross product vector. Cross
products arise at multiple points in the theory of magnetism. For example, there is a force on a charged particle moving in a magnetic field. The force is given by the magnitude of the charge, multiplied by the cross product of two vectors, one representing the velocity of the particle and the second representing the magnetic field. Students need to learn the direction of the force; that is, the cross product. It is always perpendicular to both vectors—but in which direction? To determine the direction of the cross product, physicists invoke the right-hand rule. Consider the following description of the rule from a text (italics, underlining, capitalization, and bold are used here to indicate different frames of reference; they are not in the original):

Take your right hand and point the fingers to the right (SOUTH for this problem)...rotate your hand until the palm points down. Now curl your fingers from the velocity direction to the magnetic field direction...[Your thumb] should point into the page. But the electron has a negative charge, and that reverses the direction, so the magnetic force is out of the page, which is WESTWARD. (Barrett, 2006, p. 178)
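The arithmetic the rule stands in for can be checked directly. In this sketch the coordinate assignments (east-north-up axes, charge in units of e) are our own illustrative choices, not from the quoted text; the result nevertheless matches the quoted solution's "westward" force:

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Assumed axes for illustration: x = east, y = north, z = up.
v = (0, -1, 0)   # electron velocity: southward
B = (0, 0, -1)   # magnetic field: downward

q = -1           # electron charge, in units of e
F = tuple(q * c for c in cross(v, B))
# v x B points east; the negative charge reverses it,
# so the force points west -- as in the quoted solution.
```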
Figure 5.7 Hand gesture needed for the right-hand rule.

There are multiple frames of reference involved in the right-hand rule. First, there is the frame of reference defined by the three vectors of interest: velocity, magnetic field, and force (indicated in underline above). This forms an object-centered frame in three dimensions, entirely cognitively defined. Second, there is the frame of reference defined by the page of the text (bold above), which is used to ground the object-centered frame in the physical world. The author of the above quotation has chosen to define this frame by cardinal directions (North, South, indicated by capitals), which do not refer to their magnetic equivalents but rather to up-down and left-right in page space. Finally, there is the frame of reference defined by the student's hand in relation to the body (thumb in, thumb out, italics above). The student must somehow align these frames in order to use the rule. As researchers attending this symposium attest, there is a growing body of research which indicates that invoking the body's frame of reference during spatial thinking leads to activation of cortical motor representations (Petit, Pegna, Mayer, & Hauert, 2003; Sirigu & Duhamel, 2001). Moreover, the difficulty of the spatial-cognitive process reflects the underlying biomechanical complexity. The right-hand rule can require extremely complex and awkward postures, which, by inference, create corresponding complexities in spatial thinking. Even in the relatively straightforward case where the cross product of two component vectors points into the plane of the page (see Figure 5.7), the student must mentally perform the following movement: Abduct the arm at the shoulder while supinating the shoulder and wrist joint; flex the wrist; flex the fingers, and extend the thumb. At the same time, one text advises the reader that if the hand posture doesn't feel possible, it must be that the rule is not being used correctly! Other factors contribute to students' difficulty with the right-hand rule as well. If they don't remember that the component vectors form a plane, they will be stymied at figuring out how to align with one and curl into another. Although the canonical figure in texts shows two vectors attached at their tails, problems often separate them arbitrarily in space without regard to their directions. In this case, if students don't understand that the cross product is invariant over translation of the component vectors in their plane, they may try to configure the imagined hand across the spatial divide. Students also need to be reminded that the right-hand rule attempts only to make a binary distinction—is the cross product pointing into or out of the plane defined by its components? They may try to modify their hand position in order to represent the vector's magnitude as well as its
Figure 5.8 Instructions that improve understanding of the right-hand rule.
direction, and they may confuse the orientation of the cross product relative to the vector-defined plane with its orientation relative to their frontal-parallel plane. With Robert Swendsen of the Carnegie Mellon Physics Department, the first author of this chapter constructed a brief series of instructions aimed at helping students with the reference-frame coordination required to use the right-hand rule. The instructions (Figure 5.8), shown as slides, stated in essence: The two component vectors define a plane. Imagine looking at them on the plane. If their tails are not tied together, translate the vectors so they are. Rotate the tied vectors together on the plane so that one of them points up. Now invoke the right-hand rule in one of two postures, pointing into or out of the plane, as required. Swendsen and Klatzky assessed the efficacy of these instructions by comparing performance on a set of in-class quiz questions that tested knowledge of the direction of the cross product. Three identical questions were asked over two successive cohorts taking electricity and magnetism, one receiving a conventional explanation of the right-hand rule and the second presented with the new instructions. Students improved from 76% correct to 96% (relative to chance of ~20%), thus achieving close to perfect performance with the new materials. In physics pedagogy, this is a startling result. But is the right-hand rule necessary at all? Although it may provide a useful mnemonic, particularly early in learning, alternatives are possible. Students could be told that once the vectors are tied at their tails, if the shorter distance from A to B is clockwise, the cross product points into the plane; if the reverse, it points out of the plane. This would avoid involving the body at all and would eliminate the demand to imagine or perform movements that are unavoidably awkward.
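The body-free clockwise rule just described amounts to checking the sign of a single number. A minimal sketch (our own coordinate convention: the plane viewed with x rightward and y upward, so "out of the plane" means toward the viewer):

```python
def cross_direction(a, b):
    """Body-free alternative to the right-hand rule (2-D sketch).

    `a` and `b` are the component vectors, tails tied at the origin,
    viewed on their common plane. The sign of the z-component of
    a x b encodes the binary distinction the rule is meant to make.
    """
    z = a[0] * b[1] - a[1] * b[0]    # z-component of a x b
    if z < 0:
        return "into the plane"       # shorter rotation a -> b is clockwise
    if z > 0:
        return "out of the plane"     # shorter rotation is counterclockwise
    return "undefined (parallel vectors)"
```

For example, rotating from the +x axis to the +y axis is counterclockwise, so the cross product points out of the plane; swapping the operands reverses it.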
Cognitively Mediated Action vs. Perceptually Guided Action

The final example we will provide of the embodied actor in multiple frames of reference deals with actions in near space. It comes from the domain of ultrasound-guided surgical intervention. Ultrasound is increasingly used for an array of procedures like guiding the insertion of a catheter into a vein or performing a biopsy on breast, thyroid, or liver tissue. The ultrasound image is used to help the surgeon track the catheter or biopsy needle in relation to the targeted tissue. Any cognitive scientist who has ever had an ultrasound examination must have been struck by the anomaly of the operator's looking away from the patient during the procedure. One hand holds the ultrasound wand and explores the patient's anatomy, while the eyes are directed for the most part at a remote screen. In ultrasound-guided surgery, the surgeon's free hand penetrates the patient's body while the surgeon looks away. Consider a practice often performed by a radiologist—guiding a needle to an ultrasound target while viewing its progress within the slice illuminated by the ultrasound. This is called in-plane needle guidance. This form of ultrasound-guided surgical intervention requires coordination across at least four reference frames: one is the frame of reference provided by the patient's body, another is the operator's egocentric frame of action, a third is defined by the ultrasound scan, and a fourth by the screen displaying the results of the scan. The first frame, defined by the patient's body, can be perceptually registered by vision, although the target of surgery is not directly visible. The point of entry of a needle may lie at a certain distance above the patient's wrist and at the midpoint of the arm width, for example. The second frame, egocentric action space, locates the fingers holding the needle relative to the surgeon's body.
Egocentric parameters are cued by the proprioceptive system, particularly by kinesthetic sensors in muscles, tendons, and joints, and by efference copy (monitoring of the motor command). The fusion of these first two frames—perceptually defined space and action-defined space—is the essence of visually guided action. The addition of the ultrasound further requires that the perception/action composite frame be aligned with two more frames of reference. One is provided by the ultrasound transducer, a hand-held device pressed against the patient's body. Transducers vary in shape
but generally have a T form: a cylindrical section that is held by the user and a larger cross-section that holds the sensors that image the sound waves. The imaged area is a roughly triangular-shaped plane or "slice" emanating from the tip of the transducer and oriented in the plane of the sensor array. The target of surgery, for example, a lesion to be biopsied, lies within this plane. The location of the target is thus defined by a third frame of reference, formed by the transducer plus the slice. While its position is constant in the frame of reference defined by the patient's body, the target position in the transducer-plus-slice frame depends on how the surgeon holds the transducer and where it is placed on the surface of the body. Tilting the transducer or sliding it on the skin will shift the target location in the image. In order to understand the data being imaged, the surgeon needs to represent the transducer position and orientation relative to the patient's body, usually without maintaining continuous sight of the transducer. Additional complexity arises because one more frame of reference is employed for displaying the ultrasound image: The slice is shown on a monitor that, with conventional ultrasound, is arbitrarily located in the workspace. The rectangle formed by the monitor thus provides a fourth frame of reference. The target's coordinates within the x/y axes of the monitor depend on how the transducer was held, because the x/y axes in the monitor display correspond to axes aligned with the transducer's principal axes. Moreover, although the axes map invariantly, the scales do not, as the image can be zoomed or reduced by the user as desired.
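The chain of mappings the operator must perform mentally—monitor coordinates, through the transducer's pose, into the patient's frame—can be sketched as a composition of transformations. This 2-D reduction and all of its parameter names are our own illustrative assumptions:

```python
import math

def monitor_to_patient(p_monitor, zoom, probe_pos, probe_angle):
    """Compose the frame mappings a conventional-ultrasound user
    must carry out cognitively (illustrative 2-D sketch).

    `p_monitor` is the target in screen coordinates, `zoom` is the
    arbitrary display scale, and `probe_pos` / `probe_angle` give the
    transducer's assumed pose in a patient-centered frame.
    """
    # Monitor -> slice: undo the arbitrary display scaling.
    x, y = p_monitor[0] / zoom, p_monitor[1] / zoom
    # Slice/transducer -> patient: rotate by the probe's tilt,
    # then translate by its position on the body surface.
    c, s = math.cos(probe_angle), math.sin(probe_angle)
    return (c * x - s * y + probe_pos[0],
            s * x + c * y + probe_pos[1])
```

Note that every argument after `p_monitor` is something the operator must estimate rather than read off the display, which is one way to see why the task is cognitively loaded.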
In short, for conventional ultrasound, the problem is this: In the user's egocentric space of action, there is a target within the patient's body, the position of which is represented in a monitor coordinate frame that depends on the orientation of a transducer-defined frame (and arbitrary scaling). The ultrasound operator must integrate information across the display, the transducer, and the body so as to ultimately register the image with the patient. Only then is the resulting representation subject to interventional action by the hand holding the surgical tool. Clearly, in addition to perceptual processing, the cognitive load of this task is high. With George Stetten and Damion Shelton, we have conducted a series of studies that attempts to pick apart the cognitive and perceptual components of this complex task and to identify sources of error. An important tool that we use is a form of ultrasound display invented by Stetten (2003). The device, called the Sonic Flashlight (SF), functions by superimposing an ultrasound image exactly onto the anatomy being scanned. In this way it creates an illusion of "X-ray vision" for doctors: a cross-sectional ultrasound view of the inner tissue is seen directly as if the patient's body is translucent. This novel visualization is a form of augmented reality. Unlike conventional ultrasound, it shows the ultrasound slice with reference to the patient's body, rather than the probe, and thus removes the necessity of aligning disparate frames of reference. Moreover, the patient, the ultrasound image, the instrument, and the operator's hands are merged into one visual environment. All can be perceived directly by vision and registered using a single, unified frame of reference. Figure 5.9 shows the task and results (previously unpublished) from a study of in-plane insertions by novice subjects with the SF or conventional ultrasound (CUS). Subjects were asked to first localize a target bead placed in opaque fluid in a plastic tank by using
Figure 5.9 Left is conventional ultrasound and right is the Sonic Flashlight. (a) Mental representations invoked by use of imaging tool, from Wu et al., 2005, copyright IEEE. (b) Representative spatial distributions in the target plane of terminal positions of a needle that subjects attempted to guide to an imaged target (grey sphere) (unpublished data from experiment reported in Wu et al., 2005).
ultrasound, then judge where the ultrasound slice was, align the needle with it, and finally direct the needle toward the target with visual guidance from ultrasound (in-plane insertion). Figure 5.9(b) shows the distributions of terminal needle locations in the target plane. It is apparent that the success rate (i.e., the percentage of ending points within the target area) was much higher when using the SF rather than CUS (85.0% vs. 58.3%). Also, most of the variability is in the x-axis of the figure, while errors in the y-axis are minimal. The latter represents the elevation errors, which could be easily detected and corrected by the subjects if the needle and target could be seen simultaneously in the ultrasound image. The x-axis variability represents users' accuracy at staying within the width of the ultrasound slice. This variability is notably lower with the SF than CUS. We attribute the difference to different processes that are involved in locating the ultrasound slice. With the SF, the ultrasound slice is directly visible; both needle and slice can be visually aligned. In contrast, CUS users must cognitively visualize the slice rather than perceive it directly. To "see" the slice, they need to observe how they are holding the ultrasound wand, infer the location of the slice, and then mentally map the image onto that location. The alignment of the needle and slice, essentially, is more cognitive than perceptual, as a result of the necessity of aligning multiple separated frames of reference. In another study which examined subjects' ability to locate targets using the two ultrasound devices, we also saw clear evidence of the costs of cognitively aligning the multiple frames (Wu, Klatzky, Shelton, & Stetten, 2005). Subjects were asked to localize a target bead placed in opaque fluid in a plastic tank.
They did so by first imaging the target with the wand, then pointing to its location from three sites distributed around the edge of the tank. We used a 3-D triangulation algorithm to determine the perceived location of the bead. The results showed that subjects using conventional ultrasound systematically underestimated target depth, whereas those using the augmented-reality device were as accurate as when they had direct sight of the target in an empty tank. When the subjects were subsequently asked to guide a needle to ultrasound-imaged targets, their errors were completely consistent with the mislocalizations observed earlier. Subjects using conventional ultrasound first aimed the needle too high, and then, when they realized their error, corrected by pushing it downward through the fluid. Of course, this form of error
correction would not be possible in a clinical context where it would produce tissue damage. The advantage of the Sonic Flashlight clearly lies in its affordance of perceptually guided localization, as opposed to cognitively guided action. Conventional ultrasound, with its demands for reference-frame coordination, introduces both systematic and variable error.
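The pointing-based localization method in the study above relied on triangulating the subject's pointing responses. The study's actual 3-D algorithm is not detailed in the text; a minimal 2-D analog finds the point closest, in the least-squares sense, to the lines defined by each pointing site and direction:

```python
def triangulate(sites, dirs):
    """Least-squares triangulation of pointing responses (2-D sketch;
    an illustrative stand-in, not the study's own 3-D algorithm).

    Finds the point p minimizing summed squared distance to the lines
    through each site s_i along direction d_i, by solving
        sum_i (I - d_i d_i^T) p = sum_i (I - d_i d_i^T) s_i.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (sx, sy), (dx, dy) in zip(sites, dirs):
        n = (dx * dx + dy * dy) ** 0.5
        dx, dy = dx / n, dy / n                  # unit direction
        # Projector onto the line's normal space: M = I - d d^T
        m11, m12, m22 = 1 - dx * dx, -dx * dy, 1 - dy * dy
        a11 += m11; a12 += m12; a22 += m22
        b1 += m11 * sx + m12 * sy
        b2 += m12 * sx + m22 * sy
    det = a11 * a22 - a12 * a12                  # solve the 2x2 system
    return ((a22 * b1 - a12 * b2) / det,
            (a11 * b2 - a12 * b1) / det)
```

For instance, rays from (0, 0) along (1, 1) and from (2, 0) along (-1, 1) intersect at (1, 1), which the solver recovers exactly; with noisy pointing data it returns the best-fitting compromise point instead.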
The Common Process: Alignment

The previous section of the chapter provided three examples in which people must coordinate body-defined and task-defined frames of reference. The tasks were imagined changes of position and orientation, understanding vector cross products, and constructing a representation of target location from disjoint visual input. Common to these tasks is that they required bridging from the actor's representation of self in the physical world to other coordinate systems. The person who watches another person walk and is asked to take his perspective must place herself in the walker's frame, which moves despite her own physical constancy. The student who studies physics must form a vector-defined reference frame and then represent her own limbs relative to it. The interventional radiologist viewing a blood vessel on a remote screen must translate the frame of the screen into the body- and environment-defined frames of action. How are these transformations achieved? The research of people at this symposium, along with that of many others, has led to increased understanding of biological mechanisms for coordinating multiple physical frames of reference, interpreting the limb postures of other animate beings as if they were our own, and mentally placing ourselves in remote environments with coordinate systems differing from our egocentric ones. We know something about where these functions are located in the brain. We have ideas about how they might be mediated by the activity levels of single neurons and population-level statistics. What remains elusive are intermediate levels of description that specify the mechanisms by which reference frames snap into congruence—or fail to do so.
The problem of coordinating across spatial frameworks may be quite different when cognition comes into play, in comparison to what happens when different frames within the body are coordinated, such as a headcentric frame and a retinocentric
frame. The examples reviewed in this chapter are characterized by cognitively based coordination. More generally, the situations we have been describing require what is commonly called spatial thinking. Actors need to think spatially when they relate their physical frame of reference (or multiple physical frames) to task-induced frames of reference (or multiple frames). A critical aspect of the situations that we have described is the complex process of alignment, which was described at the beginning of this chapter. Recall that alignment allows parameter values within one frame of reference to be mapped into values within another. In so doing, it invokes component processes of parameter remapping and coordinate transformation. Parameter remapping is invoked when a task-defined frame of reference and the actor's intrinsic frame of reference use different spatial descriptions. A frame of reference imposed on an object like an ultrasound wand parameterizes space in wand-relative coordinates, whereas the space on the screen that displays the ultrasound is parameterized in screen-relative coordinates. These parameters must be brought into correspondence before the computation of a response can be performed; further reparameterization may then be needed to execute the response. Coordinate transformation is invoked when two frames of reference have analogous parameters but differ in position, orientation, or scale, requiring translation, rotation, or rescaling. Errors and failures can arise as a result of failures of any of these processes. Consider how errors might have occurred in the three tasks that were described here.

Imaginal Walking

The task is to imagine walking the path specified by a verbal description or by watching another person walk, then to make the final turn necessary to face the origin.
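The correct response in this task is fully determined by the path geometry, and can be sketched computationally. This is our own 2-D illustration of the Figure 5.2 task, with assumed conventions (start facing "north", left turns positive):

```python
import math

def required_turn(legs):
    """Turn needed to face the origin after imagined walking.

    `legs` alternates segment lengths and turns in radians,
    e.g. [len_A, turn, len_B]; left turns are positive. Returns
    the final turn, in (-pi, pi], that orients the walker toward
    the origin (a 2-D sketch with assumed sign conventions).
    """
    x = y = 0.0
    heading = math.pi / 2              # start at the origin, facing "north"
    walk = True
    for v in legs:
        if walk:
            x += v * math.cos(heading)  # dead-reckon the position...
            y += v * math.sin(heading)
        else:
            heading += v                # ...and the heading
        walk = not walk
    bearing_home = math.atan2(-y, -x)   # direction from walker to origin
    return (bearing_home - heading + math.pi) % (2 * math.pi) - math.pi
```

For a 3-unit leg, a 90-degree right turn, and a 4-unit leg, the walker ends at the far corner of a 3-4-5 triangle and must turn about 143 degrees to the right; subjects who fail to update through the middle turn instead over-turn by that turn's magnitude.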
The performer of this task must (1) transform the coordinate specification of his physical location and orientation into an imagined person; (2) update that specification according to the instructions; (3) compute the required trajectory to face home according to the updated values; and (4) map the trajectory information into the frame that controls the turn that is required. The failure in performing this task appears to arise because, as long as the response is a body turn, the initial frames are never truly aligned. The walker's location is updated in a walker-defined
frame, defining a homeward bearing, but orientation remains specified as in the person's body-defined frame. Thus the person turns as defined by his body orientation.

Right-Hand Rule

The task is to indicate the cross product of two vectors, U × V. The performer must (1) create a reference frame in which U and V lie on a common plane at an arbitrary location in self-defined space; (2) treat the hand as an external object within the vector-defined frame and map hand-centered coordinates into that frame so that the wrist lies at the vector intersection, the hand's cross-section is aligned with U, and V is closer to the palm than the back of the hand; (3) observe the direction of the thumb in space and align it with the third dimension in the vector-defined coordinate system. Difficulties arise here both in creating the vector-defined frame of reference and, more importantly, in mapping the hand into that frame in the required posture.
Ultrasound

The task is to probe a surface with an ultrasound wand, read the depth of a target on a remote screen, and then guide a needle to the target. The performer must (1) form a frame of reference defined by the vertical and horizontal axes (i.e., object-centered axes) of the ultrasound transducer and projected into the attached slice (which is not directly seen); (2) read the depth from the screen-defined frame of reference and map it into the transducer/slice frame; and (3) align the target location parameters from the transducer frame into a frame that affords action. Problems can arise here in mapping the target into the transducer, mapping the screen depth into the transducer-defined frame, and projecting the transducer frame into the action space.
A Taxonomy of Alignment Load

The present examples of the challenges facing the embodied actor are part of a substantial body of research that collectively suggests
Embodiment, Ego-Space, and Action
ways to order the difficulty of alignment. That is, this research points to the relative load imposed by problems that the actor faces when attempting to align herself with task-defined frames of reference. In doing so, this literature suggests two general factors that can be considered. We will term these allocentric layer and obliqueness. Allocentric layer conceptualizes the separation between the body frame and the task-defined frame in terms of a series of layers centered on the body. We suggest here a general framework for organizing these layers, and we use hallmark studies to illustrate and suggest the ordering.

The nearest layer in which to define a task, according to our framework, is the salient geometry that surrounds the body. The Shelton and McNamara study described earlier indicates that people form a strong reference frame from the geometric cues in their immediate vicinity, such as the walls of a room or a square on the floor. The frame provided by the geometry surrounding the body presumably contributes to the conflict between real and imagined perspectives. As we have described earlier, people can update self-position relative to the local environment when physically navigating, even in the absence of vision. The studies of Klatzky, Loomis, and associates with imagined locomotion (Klatzky et al., 1998; Avraamides et al., 2004) indicate that there is a conflict between the unchanging parameter values in the body-in-environment frame and the changing values of the imagined frame, which adds to the difficulty of mental updating. Moreover, the degree of conflict appears to reflect the salience of a person's environmental frame of reference. Salience is promoted, for example, by sight of the environmental surroundings.
May (1996) found that when subjects were to mentally update by rotating within a room they had previously seen, disorienting them by arbitrary rotations in advance of the mental rotation improved performance. He has developed a model that presents the conflict as arising between spatial parameterization at sensory-motor and cognitive levels, resulting in interference during response selection (May, 2001, 2004).

Virtual reality (VR) appears sometimes to exemplify the environmental layer, but sometimes to be more removed, suggesting further layering. VR studies have produced an interesting mix of the ease of physical updating and the difficulty of mental updating. Riecke and associates (2005) found that people could make rapid mental jumps to familiar locations that were displayed in wide-screen VR. Others have found that navigating in impoverished, small-screen VR-defined worlds is highly error prone, especially with regard to rotations (Klatzky et al., 1998).

Objects within the environmental geometry around the body may themselves define frames of reference for a task, producing a still more remote allocentric layer. Simply locating objects relative to the body appears to be subject to an accessibility ordering, such that left/right is more difficult to judge than top/bottom or front/back (Franklin & Tversky, 1990). Complexity increases when the axes of the object are considered as the basis for an object-centered frame of reference, particularly when rotational transforms are required. Wraga, Creem, and Proffitt (2000) demonstrated that it was easier to update a viewer after imagined rotation than it was to update a rotated object, an effect they attributed to difficulty in mentally maintaining a coherent object-defined frame.

We turn now to the second factor in our scheme of alignment difficulty, namely, obliqueness. The preceding paragraphs have indicated that in many tasks, angular disparity between the body-defined frame and the task-defined frame impairs performance. When the actor's body is to be located and oriented relative to principal axes of the environment, such as gravity or the walls of a room, alignment appears to be facilitated by a priori congruence between the intrinsic axes of the body (gravitational, frontal, and sagittal) and the environmental axes. Angular misalignment, on the other hand, impairs performance. Rieser (1989) was one of the earliest to show the difficulty of updating during imagined movement when people had to imagine rotating the body relative to an array of learned objects in the physical environment.
The implication is that accommodating to angular disparities between self and task frames requires a process like mental rotation, with high cognitive load. The Shelton and McNamara study described earlier indicates that learning from a viewpoint where the frontal and sagittal planes of the body are oblique to the boundaries of a layout results in a fragile frame of reference that quickly defaults to an environmentally defined frame when one becomes available. A number of studies involving tilts of the body relative to the gravitational axis show deficits due to the oblique relation (Gaunet & Berthoz, 2000; Mast, Ganis, Christie, & Kosslyn, 2003). Maintaining an object-defined frame appears to become even more difficult when the object's intrinsic axes are oriented at an
oblique angle relative to the viewer and/or the environment. Pani and associates (1993; Pani & Dupree, 1994; Pani, Jeffres, Shippey, & Schwartz, 1996) studied the mental rotation of arbitrarily oriented objects around arbitrary axes. Operative frames of reference were defined by the object's salient axes, the axis of rotation, and the person. They found that task difficulty was directly related to the number of frames of reference that were misaligned. At an extreme, it was essentially impossible for people to predict the effect of a rotation when all three frames had different orientations, as when a planar form was tilted with respect to a rotation axis, which was itself oriented oblique to the viewer. One can only imagine the additional load of tilting the person within the world! Flanders and Soechting (1995), in the study described earlier, similarly found that people could easily orient the principal axes of an object relative to the conventional body-in-world frame, but not to an oblique frame.

In short, a considerable body of research, only a small fraction of which is described here, has documented difficulties with alignment. Figure 5.10 schematizes the two major factors suggested by this work, allocentric layer and obliqueness. Congruence of body-defined and task-defined frames is proposed to be an anchor point on the allocentric-layer dimension (although one can conceive of situations in which it would be difficult to determine where one's limbs were). Complexity increases when the body must be related to environmental coordinates, which in turn is easier than body relative
Figure 5.10 Schema for factors affecting alignment, with difficulty of representative experimental tasks shown corresponding to height in the plane in the two-factor space.
to object-in-world. Alignment involving multiple objects or object parts is proposed to be more demanding still. The second dimension in the figure represents the proposal that angular mismatches with the body further impede alignment with any task-defined layer.

The two factors we have identified from the literature as contributing to alignment difficulty (allocentric layer and obliqueness) can be related to the two components of alignment described above (remapping and coordinate transformation). We propose that allocentric layering affects remapping, and obliqueness affects coordinate transformation. That is, the process of remapping between the body and task-defined systems presumably becomes more difficult the more removed the task-defined layer is from the body. The process of coordinate transformation presumably is affected by angular disparities, because they call into play the rotational transformation between coordinate systems, which is implemented by mental rotation. Translation and rescaling also call for mental coordinate transformation, but the literature is equivocal on their impact relative to rotation (e.g., Rieser, 1989; but see Larsen & Bundesen, 1978; May, 2004).

The schema in Figure 5.10 is speculative, to be sure. However, to the extent that it identifies progressive constraints on the embodied actor's ability to deal with multiple frames of reference, it suggests paradigms that might be employed for further study. Hence it may contribute to the goal of understanding alignment more fully at a process level, particularly when an actor must align a body-centric frame of reference with those defined by other tasks.

References

Avraamides, M., Klatzky, R. L., Loomis, J. M., & Golledge, R. G. (2004). Use of cognitive vs. perceptual heading during imagined locomotion depends on the response mode. Psychological Science, 15, 403–408.
Barrett, T. E. (2006). Introductory physics with calculus as a second language. Hoboken, NJ: Wiley.
Enright, J. T. (1998). On the "cyclopean eye": Saccadic asymmetry and the reliability of perceived straight-ahead. Vision Research, 38(3), 459–469.
Flanders, M., & Soechting, J. F. (1995). Frames of reference for hand orientation. Journal of Cognitive Neuroscience, 7, 182–195.
Franklin, N., & Tversky, B. (1990). Searching imagined environments. Journal of Experimental Psychology: General, 119(1), 63–76.
Gaunet, F., & Berthoz, A. (2000). Mental rotation for spatial environment recognition. Cognitive Brain Research, 9, 91–102.
Haggard, P., Newman, C., Blundell, J., & Andrew, H. (2000). The perceived position of the hand in space. Perception & Psychophysics, 68(2), 363–377.
Klatzky, R. L. (1998). Allocentric and egocentric spatial representations: Definitions, distinctions, and interconnections. In C. Freksa, C. Habel, & K. F. Wender (Eds.), Spatial cognition: An interdisciplinary approach to representation and processing of spatial knowledge (Lecture Notes in Artificial Intelligence 1404) (pp. 1–17). Berlin: Springer-Verlag.
Klatzky, R. L., Lippa, Y., Loomis, J. M., & Golledge, R. G. (2003). Encoding, learning and spatial updating of multiple object locations specified by 3-D sound, spatial language, and vision. Experimental Brain Research, 149, 48–61.
Klatzky, R. L., Loomis, J. M., Beall, A. C., Chance, S. S., & Golledge, R. G. (1998). Spatial updating of self-position and orientation during real, imagined, and virtual locomotion. Psychological Science, 9, 293–298.
Larsen, A., & Bundesen, C. (1978). Size scaling in visual pattern recognition. Journal of Experimental Psychology: Human Perception and Performance, 4, 1–20.
Loomis, J. M., Da Silva, J. A., Fujita, N., & Fukusima, S. S. (1992). Visual space perception and visually directed action. Journal of Experimental Psychology: Human Perception and Performance, 18, 906–921.
Mast, F. W., Ganis, G., Christie, S., & Kosslyn, S. M. (2003). Four types of visual mental imagery processing in upright and tilted observers. Cognitive Brain Research, 17, 238–247.
May, M. (1996). Cognitive and embodied modes of spatial imagery. Psychologische Beiträge, 38, 418–434.
May, M. (2001). Mechanismen räumlicher Perspektivenwechsel [Mechanisms of spatial perspective switches]. In R. K. Silbereisen & M. Reitzle (Eds.), Psychologie 2000 (pp. 627–634). Lengerich: Pabst.
May, M. (2004). Imaginal perspective switches in remembered environments: Transformation versus interference accounts. Cognitive Psychology, 48, 163–206.
Merriam, E. P., Genovese, C. R., & Colby, C. L. (2003). Spatial updating in human parietal cortex. Neuron, 39, 361–373.
Pani, J. R. (1993). Limits on the comprehension of rotational motion: Mental imagery of rotations with oblique components. Perception, 22(7), 785–808.
Pani, J. R., & Dupree, D. (1994). Spatial reference systems in the comprehension of rotational motion. Perception, 23(8), 929–946.
Pani, J. R., Jeffres, J. A., Shippey, G. T., & Schwartz, K. (1996). Imagining projective transformations: Aligned orientations in spatial organization. Cognitive Psychology, 31(2), 125–167.
Petit, L. S., Pegna, A. J., Mayer, E., & Hauert, C.-A. (2003). Representation of anatomical constraints in motor imagery: Mental rotation of a body segment. Brain and Cognition, 51(1), 95–101.
Riecke, B. E., Von Der Heyde, M., & Bülthoff, H. H. (2005). Visual cues can be sufficient for triggering automatic, reflexlike spatial updating. ACM Transactions on Applied Perception, 2, 183–215.
Rieser, J. J. (1989). Access to knowledge of spatial structure at novel points of observation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1157–1165.
Shelton, A. L., & McNamara, T. P. (2001). Systems of spatial reference in human memory. Cognitive Psychology, 43, 274–310.
Sirigu, A., & Duhamel, J. R. (2001). Motor and visual imagery as two complementary but neurally dissociable mental processes. Journal of Cognitive Neuroscience, 13(7), 910–919.
Stetten, G. (2003). U.S. Patent No. 6,599,247. Washington, DC: U.S. Patent and Trademark Office.
Thomson, J. A. (1983). Is continuous visual monitoring necessary in visually guided locomotion? Journal of Experimental Psychology: Human Perception and Performance, 9, 427–443.
Tversky, B. (1981). Distortions in memory for maps. Cognitive Psychology, 13(3), 407–433.
Wagner, M. (1985). The metric of visual space. Perception & Psychophysics, 38(6), 483–495.
Wraga, M. J., Creem, S. H., & Proffitt, D. R. (2000). Updating displays after imagined object and viewer rotations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 151–168.
Wu, B., Klatzky, R. L., Shelton, D., & Stetten, G. (2005). Psychophysical evaluation of in-situ ultrasound visualization. IEEE Transactions on Visualization and Computer Graphics, 11 (Special Issue: Haptics, Virtual and Augmented Reality), 684–699.
6 An Action-Specific Approach to Spatial Perception
Dennis R. Proffitt
People have conjectured about spatial perception for millennia, and have studied it in earnest for well over a hundred years. As can be seen in current perception textbooks, spatial perception is typically viewed as a general-purpose representation of the environment's layout. The perception of surface layout is generally thought to be unaffected by people's bodies, what they might be doing, or their internal physiological states. Textbooks divide spatial perception into topics defined by distal environmental properties such as the perception of distance, size, and shape. In this view, spatial perception is specific to the environmental properties that are perceived.

This chapter provides a different approach. Extending Gibson's (1979) theoretical approach, spatial perception is here viewed as a biological adaptation that supports our species' ways of life. From this perspective, spatial perception is action-specific, and the subdivision of the field is made relative to the actions that spatial perception supports, such as reaching, grasping, walking, and throwing. Perception relates spatial layout to one's abilities to perform intended actions and also to the inherent costs associated with their performance. In essence, it is proposed that people see the world as "reachers," "graspers," "walkers," and so forth. By this account, perception relates, and is influenced by, three factors: the visually specified environment, the body, and purpose.
The Visual Specification of the Environment

For a moving observer in a natural setting, the environment's spatial layout is well specified by optical and ocular-motor variables (Proffitt & Caudek, 2002; Sedgwick, 1986). Viewed in isolation, a great deal is known about human sensitivity to each of the myriad visual variables that specify environmental properties. On the other hand, understanding how these variables are combined when information is redundant (the problem of cue integration) has proven to be a tough problem to solve. There are almost no studies that attempt to model the combination of more than two variables. This is because, as the number of specifying variables increases, the number of possible combinations of these variables that would need to be investigated becomes prohibitively large (Cutting & Vishton, 1995). The problem of cue integration has relevance for the current argument because of the possibility that the combination of visual information is action-specific.

Most current models of cue integration rely upon some variant of weighted averaging in which each specifying variable is used to derive an estimate of the relevant environmental property, each estimate is then weighted by its prior reliability, and then a weighted average is taken (cf. Landy, Maloney, Johnston, & Young, 1995). An especially intriguing alternative has been proposed by Domini, Caudek, and Tassinari (2006). In their model, information is combined directly without first deriving the environmental property to which it relates. In contrast to weighted averaging models, in Domini et al.'s model, environmental properties are derived only after the information has been combined.

The intractability of cue integration in natural environments bears upon a fundamental issue. Currently, it is not known whether cue integration is influenced by what the perceiver is doing.
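As a concrete illustration of the weighted-averaging scheme attributed above to Landy et al. (1995), here is a minimal sketch; the particular cues, numbers, and function name are invented for illustration, not taken from any of the cited models:

```python
# Reliability-weighted cue fusion: each cue yields its own estimate of an
# environmental property (here, egocentric distance in meters); each estimate
# is weighted by its reliability (inverse variance), and the weights are
# normalized before averaging.

def fuse_cues(estimates, variances):
    """Combine per-cue distance estimates by reliability-weighted averaging."""
    weights = [1.0 / v for v in variances]  # reliability = inverse variance
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total

# Hypothetical example: binocular disparity says 2.0 m (low variance, trusted),
# relative size says 2.6 m (high variance, discounted).
d = fuse_cues([2.0, 2.6], [0.04, 0.36])
print(round(d, 2))  # -> 2.06
```

The action-specific question raised in the text amounts to asking whether the variances (and hence the weights) in such a scheme are themselves fixed, or change with what the perceiver intends to do.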
It is possible that the processes that weight or combine specifying variables do so differently, depending upon what the perceiver is trying to do. This is one of many possible mechanisms by which purpose and action may influence perception. Another possible mechanism derives from the fact that specifying variables are sampled differently depending upon the perceiver's goals. It has long been known that eye movements are strongly influenced by purpose (Yarbus, 1967). In his review of eye movements and the control of actions, Land (2006) wrote,
One of the main conclusions from this review is that eye movement strategies are very task-specific. They principally involve the acquisition of information needed for the execution of motor actions in the second or so before they are performed and in the checking of the execution of each action. (p. 322)
Land reminds us that where people look depends upon what they are attempting to do, and thus, the sampling of visual information is action-specific.

Visual attention is also action-specific. Employing a visual search paradigm, Bekkering and Neggers (2002) presented participants with physical arrays of blocks that varied in both orientation and color. A block's orientation influenced the hand posture that would be required to grasp it. On each trial, participants attended to a fixation dot and were instructed to make a saccade to a block having a specified orientation and color. Following the saccade they were instructed to either point to or grasp the block. It was found that there were fewer erroneous saccades to blocks having the wrong orientation when participants were intending to grasp the block as opposed to when they were intending to point to it. The number of saccades to blocks of the wrong color was unaffected by the intended-action manipulation. This study showed that intentions to perform an action such as grasping, which must accommodate to an object's orientation, can influence the visual processing of object orientation.
The Body

The body has an exterior and an interior. The exterior consists of the body's form, which enables a behavioral potential as determined primarily by the skeleton and skeletal muscles. The exterior body performs actions in the external environment. The body's interior consists of the plethora of organs, glands, and physiological systems that sustain life. A principal function of the brain is to control the body so as to achieve desired states in both the external environment and the body's internal environment.

Studies in behavioral ecology show that the behavior of organisms is primarily governed by energetic and reproductive imperatives (Krebs & Davies, 1993). With respect to energy, organisms have been shaped by evolution to follow behavioral strategies that optimize obtaining energy (food), conserving energy, delivering energy to
their young, and avoiding becoming energy for predators. To meet these ends, species have evolved behavioral strategies for achieving desired outcomes in the external physical environment while concurrently maintaining desired states in the internal environment of the body.

The current account suggests that spatial perception promotes effective and efficient behavior by directly relating the visually specified environment to the possibilities and costs of intended actions. To achieve this, perception must be action-specific. For example, when people intend to walk, they see the world as "walkers." Their perception will reveal where walking is possible in relation to their walking-relative physiological potential as well as the energetic costs associated with walking. If instead people view the same scene as "throwers," then perceiving the possibilities and costs of locomotion becomes irrelevant. People can throw a ball over a gorge that does not afford walking.

A concrete example from behavioral ecology is provided here to illustrate a potential advantage of perceiving the environment relative to one's action potential. An animal's assessment of the risk of an approaching predator can be determined by measuring how close the predator can come before the animal initiates flight (Stankowich & Blumstein, 2005; Ydenberg & Dill, 1986). An iguana with a cool body temperature will flee from a predator at a greater distance than will one with a warmer body (Rocha & Bergalo, 1990). Because the reptile is more metabolically efficient at warmer body temperatures, its maximum escape speed and body temperature are positively correlated. Thus, the risk of being caught by a predator is a function of both the distance of the predator and the iguana's body temperature. It is, of course, not known what iguanas perceive, but two possibilities come to mind.
It could be that the iguana sees the distance to the approaching predator as being the same regardless of body temperature. The iguana might be supposed to have a general-purpose distance perception system that is unaffected by intended action or physiological state. In this case, when deciding whether to flee, the reptile would have to relate the predator's apparent distance to its body temperature and the implications of this metabolic state for its running speed. Another possibility, in line with the current approach, is that iguanas see predators as being closer when their bodies are cool as compared to when they are warm. In this case, the iguana flees whenever it sees the predator's proximity as falling
within an invariant flight-specifying distance. The iguana does not have to relate perception to its potential for action because this has already been achieved in perception.

The mechanisms by which physiological state might influence visual processing are many. The brain resides in a chemical milieu in which many aspects of the body's interior state are directly manifested in hormones, neurotransmitters, and various dimensions of the blood's composition. Given that the neural correlates of visual awareness are associated with very late processing in the temporal lobe (Koch, 2004), there are also numerous opportunities for neural influences from both visual and nonvisual areas.
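The contrast between the two iguana accounts above can be made concrete in a toy sketch. All numbers, thresholds, and function names here are hypothetical, invented purely to illustrate the logical difference; the two models predict the same flight behavior but let body state enter at different stages:

```python
# Toy contrast between the two accounts of iguana flight. In the
# "general-purpose" account, perceived distance is veridical and the
# decision stage must factor in body temperature; in the "action-specific"
# account, a cool body makes the predator look closer, so a single fixed
# threshold on perceived distance suffices.

FLIGHT_THRESHOLD_M = 5.0  # assumed invariant flight-specifying distance

def flees_general_purpose(true_distance_m, body_temp_c):
    # Perception is state-independent; the decision compensates:
    # a cool (slow) iguana needs a larger safety margin.
    safety_margin = 1.5 if body_temp_c < 30 else 1.0
    return true_distance_m < FLIGHT_THRESHOLD_M * safety_margin

def flees_action_specific(true_distance_m, body_temp_c):
    # Perception itself is state-dependent: a cool body sees the
    # predator as closer; the decision rule is a fixed threshold.
    perceived = true_distance_m / (1.5 if body_temp_c < 30 else 1.0)
    return perceived < FLIGHT_THRESHOLD_M

# Same behavior from both accounts: flight at 6 m when cool, none when warm.
print(flees_general_purpose(6.0, 25), flees_action_specific(6.0, 25))  # True True
print(flees_general_purpose(6.0, 35), flees_action_specific(6.0, 35))  # False False
```

Behaviorally the accounts are indistinguishable here; they differ only in whether body state is related to distance at the decision stage or already folded into perception, which is exactly the distinction the chapter draws.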
Purpose

The argument that perception is action-specific demands a fundamental role for purpose; perception is specific to the action that is intended. People in the same situation will see the world differently depending upon what they are intending to do. People see the world as "walkers" only if they intend to walk, or as "throwers" only if they intend to throw. As will be discussed later, a manipulation that influences the effort required to walk, but not to throw, will influence people's perception of distance if they intend to walk but not if they intend to throw (Witt, Proffitt, & Epstein, 2004).

The remainder of this chapter will describe spatial perception from an action-specific perspective. It will be shown, for example, that objects within reach appear closer than those that are out of reach, and that since reachability is extended by holding a tool, apparent distances are influenced accordingly: Objects that are within reach when holding a tool, but out of reach when the tool is not held, appear closer when the tool is held and the "reacher" intends to use it (Witt, Proffitt, & Epstein, 2005). Other studies show that egocentric extents are expanded when walking is made more effortful due to the wearing of a heavy backpack (Proffitt, Stefanucci, Banton, & Epstein, 2003). Such effects cannot be accommodated by approaches that conceptualize distance perception as a general-purpose representation of the environment. Hand tools influence "reaching distance," whereas backpacks influence "walking distance." Perceptions are here viewed as being action-specific, as opposed to being specific to distal environmental properties such as distance.
Reaching

Near space is defined by the extent of a person's reach or slightly beyond, and thus it is an instance of a dimension of spatial layout that has an action-specific definition. Others have referred to this region as personal space (Cutting & Vishton, 1995) or peripersonal space (Làdavas, 2002). Near space can be expanded by providing people with a hand tool that extends their reach. When this is done, previously out-of-reach objects will fall within near space, and as a consequence, these objects will appear closer than they did before the tool was held (Witt, Proffitt, & Epstein, 2005). Being reachable has consequences for an object's visually perceived distance.

In the Witt et al. studies, participants sat at a table upon which targets were projected by a digital projector in the ceiling. On each trial, a target was projected and participants judged its egocentric distance using a visual-matching task. After making this distance judgment, participants reached out and touched the target if it was within reach and pointed to its location if it was not. The experimental manipulation was defined by whether or not the participants held a conductor's baton that extended their reach. It was found that targets that were out of reach without the baton, but within reach when it was held, were perceived to be closer when the baton was held, as if judgments of proximity incorporated reachability.

In another study, Witt et al. showed that the influence of holding the baton is entirely dependent upon whether participants intended to reach with it. The previously described experimental design was repeated except that, after making the distance judgments, participants never reached out to touch the targets. In this study, holding the baton had no effect on the apparent distance to the targets. Two conclusions can be drawn from the Witt et al. studies.
First, the apparent distance to objects is influenced by whether or not they can be touched. Extending one's reach with a tool diminishes the apparent distance to objects that become touchable only through its use. Second, reachable space is not rescaled if a tool is held with no intent to use it. Perception is influenced by the behavioral potential to perform intended actions.

Interesting parallels to the Witt et al. findings can be found in the electrophysiology and cognitive neuroscience literatures. Iriki, Tanaka, and Iwamura (1996) found that the macaque monkey possesses visual neurons in the intraparietal sulcus that fire when a raisin is in its near space. These cells fire when a visible raisin could be grasped and eaten but not when it was seen to be out of arm's reach. Iriki et al. then trained monkeys to use a rake to acquire raisins that were beyond their grasp without it. Neurons that had previously not fired to raisins beyond arm's reach now fired to raisins within rake's reach. This study indicates that macaque monkeys, and most likely people, possess visual neurons that code for the reachability of objects and that these cells rescale the spatial range of reachability when a tool is held and used.

Research with neglect patients has also shown that near space can become rescaled through tool use. Neglect patients ignore much of what is present in the left side of their visual field. A common diagnostic assessment for neglect is to ask patients to bisect a line presented in the frontal plane before them. People with neglect will indicate a position on the line that falls far to the right of actual center, thereby indicating that they have neglected all or most of the left side of the line. With respect to the symptoms of neglect, a double dissociation between near and far space has been found. Some patients show neglect only for lines in near space (Halligan & Marshall, 1991), whereas others show neglect only for far lines (Cowey, Small, & Ellis, 1994). Patients who show neglect only in near space will respond accurately on the bisection task if they use a laser pointer to indicate the center of a line that is beyond reach. However, if a stick is used that allows them to indicate the line's center by touching it, then neglect will again be exhibited (Berti & Frassinetti, 2000; Pegna et al., 2001). These latter findings show that far space can become remapped into near space through tool use, and that this remapping has an influence on the perceptual processing of these patients.
Specifically, physical contact, even if indirect, seems to invoke the mechanisms underlying neglect, whereas distal localization alone does not.

Together, the studies reviewed in this section indicate that reachability has visual consequences. Behavioral studies show that objects in near space appear closer than those that are not. With tool use, more distant objects become reachable, and consequently, they are perceived to be closer. The electrophysiological studies with macaque monkeys show that visual neurons exist which code for reachable objects, and that these cells will rescale reachable space as a result of learning to use a tool. Finally, studies of patients who experience neglect only in near space indicate that the neural mechanisms
responsible for their neglect are specific to reachability and not to absolute distance.
Grasping

Grasping objects requires that people reach to an object's location and achieve an appropriate arm and hand posture to grasp and manipulate the object. If the to-be-grasped object is a hand tool with a handle, then the orientation of the handle relative to the grasper can make the tool more or less easy to pick up. Consider a hammer. If its handle is pointed to the grasper's right, then the hammer can be easily grasped with the right hand, but not with the left. It would be worthwhile, at this point, for the reader to place on a table an elongated object (a pen will do) and, pretending that it is a hammer, notice how easy it is to pick up with the right hand when the handle points to the right as opposed to pointing to the left. When doing this demonstration, be sure to pick up the pretend hammer in a way that is appropriate for its use; that is, the grasping posture must be one that affords hammering: the hammer's head must be above the hand, not below.

The ease with which a hand tool can be grasped affects its apparent distance, but surprisingly, only for right-handed people (Linkenauger, Witt, Stefanucci, & Proffitt, 2006). In these studies, participants sat at a table. Directly in front of the participants was a small dot on the edge of the table, which served as the near endpoint when making distance judgments. On each trial, an experimenter placed a hand tool at varying distances in front of the participant, with the handle pointing either to the left or to the right. A dot was affixed to the center of gravity of each tool. Participants were told to imagine picking up the tool with their right hand in a manner appropriate for its use, after which they indicated its apparent distance (the distance between the dot before them on the table and the dot on the tool) using a visual matching task. Finally, they picked up the tool and gave it to the experimenter.
It was found that, for right-handed participants, tools appeared nearer when the handle was pointed to the right as opposed to the left, indicating that the tools appeared closer when they were easier to grasp and pick up. Another experiment replicated the above design except that the right-handed participants were instructed to use their nondominant
An Action-Specific Approach to Spatial Perception
187
left hand. The results for handle orientation reversed. The tools were now seen to be nearer when their handles pointed to the left rather than to the right, a finding that is again consistent with the notion that apparent grasping distance is influenced by ease of grasp.

A totally unanticipated finding in the Linkenauger et al. studies was that none of the results with right-handers generalized to left-handed participants. Left-handers saw the tools as being equally far away regardless of the pointing direction of the tool's handle or which hand was used to pick it up. Left-handers are known to be more ambidextrous (Gonzalez, Ganel, & Goodale, 2006), and this may be a reason why they were unaffected by the orientation of a tool's handle. In everyday circumstances, if left-handers see a tool with its handle pointed away from their dominant hand, then they are more likely than a right-hander to pick it up with their nondominant hand. In addition, left-handers have had a lifetime of experience coping with such tools as scissors, can openers, and writing desks, which have been designed for right-handed people.

In summary, right-handers see tools as appearing closer when their handles are oriented in a direction that makes the tool easy to pick up with the intended dominant or nondominant hand. Left-handers see the world differently; the orientation of the tool's handle does not influence their grasping-distance perception, perhaps because they are more ambidextrous than right-handers.

There exists an extensive literature on the neurophysiology of grasping (see Castiello, 2005, for a review). The literature indicates that the visual guidance of grasping engages the dorsal stream of visual processing, including the anterior intraparietal sulcus and networks of other nearby parietal areas.
A human fMRI study by Valyear, Culham, Sharif, Westwood, and Goodale (2006) showed that a region in the posterior portions of the intraparietal sulcus showed strong activations associated with changes in the orientation of hand tools, but not with changes in the tool's identity. Identity changes of tools evoked strong activations in the temporal lobe's fusiform gyrus but not in parietal regions. These results are consistent with Milner and Goodale's (1995) proposal that the ventral visual processing stream is responsible for shape perception and object recognition, whereas the dorsal stream controls visually guided actions. Grasping a tool must conform to the orientation of its handle, and this orientation sensitivity was seen in parietal but not temporal activations. Identifying a tool does not require viewpoint-specific encoding, and
thus, temporal regions showed sensitivity to changes in an object's identity but not to changes in its orientation.

To anticipate the last section of this chapter, "Putting What, Where, and How Together," for conscious spatial perception to be action-specific, aspects of both dorsal and ventral processing must be combined. Ventral processing is required to identify a hammer as being a hammer, and dorsal processing is required to take its orientation into account when picking it up. Interestingly, patients with ventral stream damage may not be able to identify an object as being a hammer and, although they can pick it up, they may do so in a manner that is inappropriate for its use; they may grasp the handle with the hammer's head below the hand (Carey, Harvey, & Milner, 1996). Creem and Proffitt (2001b) showed that people have a strong tendency to pick up tools by their handles in a manner that is appropriate for their use, even if the handles are pointing away from them and appropriate grasping is difficult. However, if participants are required to do another task that puts a heavy load on semantic processing, which interferes with concurrent object recognition processing, then they behave like the patients with ventral damage. They pick up tools with the easiest grasp, even if this results in a posture that is inappropriate for the tools' use (Creem & Proffitt, 2001b). These studies indicate that grasping a tool appropriately requires both ventral and dorsal processing. The initial processing of what, where, and how may be functionally and anatomically distinct (Creem & Proffitt, 2001a), but the action-specific nature of spatial perception manifests contributions from all three functions.

Walking

Perceiving the surface layout of the ground is of primary importance for walking.
Visual perception provides information about where walking is possible as well as the difficulty associated with any chosen path. With respect to the geometry of its spatial layout, the ground plane has two walker-relative parameters, egocentric distance and slant. Both of these parameters are influenced by the energetic costs associated with walking. A recent review provides an in-depth summary of studies showing energetic influences on perceiving the ground's layout (Proffitt, 2006). The basic findings are that hills appear steeper and egocentric distances farther, following
manipulations of the anticipated metabolic energy costs associated with walking an extent.

With respect to geographical slant perception, people grossly overestimate the slant of hills in all circumstances. Five-degree hills are typically judged to be about 20°, and 10° hills appear to be 30° (Proffitt, Bhalla, Gossweiler, & Midgett, 1995). A large increase in this overestimation occurs following manipulations of the metabolic energy required to ascend hills (Bhalla & Proffitt, 1999). In these studies, the energetic costs associated with walking were experimentally manipulated by having people wear a heavy backpack or become physically tired by taking an hour-long exhausting run. Other studies selected people based upon their physical fitness, age, or health. Creem-Regehr, Gooch, Sahm, and Thompson (2004) used a harness to manipulate walking effort in a virtual environment and found that increased effort was associated with an increase in perceived geographical slant. Overall, it was found that hills appear steeper when people are encumbered by a backpack or harness, tired, of low fitness, elderly, and in declining health.

The overestimation of slant—both normative and experimentally induced—occurs in conscious awareness. These overestimations are obvious to anyone looking at a hill with knowledge of its actual slant. Participants are surprised and incredulous when, following an experiment, they are told that the hill they judged to be 20° is, in fact, only 5°. The conscious perception of slant was assessed with both verbal reports and a visual match task. Another assessment used a visually guided action measure that is dissociated from conscious awareness. Participants placed their hand flat on a rotating palmboard and, while looking at the hill but not their hand, attempted to make the board parallel with the slant of the hills.
These adjustments are quite accurate and unaffected by any of the energetic manipulations (Bhalla & Proffitt, 1999). The dissociation between the measures of explicit awareness and the visually guided action measure may reflect the two streams of visual processing that Milner and Goodale (1995) proposed. The explicit measures may entail ventral processing, whereas the visually guided action may be controlled by the dorsal stream. At present, no direct evidence exists for this account of slant perception.

The overestimation of slant in conscious awareness is thought to promote effective long-term planning of locomotion, whereas accuracy in visually guided actions promotes effective behaviors in the
immediate proximal environment. It is obvious why actions directed at the immediate environment should be accurate. People would be clumsy creatures indeed if, whenever they encountered a 5° hill, they lifted their foot to accommodate a 20° incline. The functional utility of overestimation in conscious awareness takes a bit of explaining. The normative bias to overestimate geographical slant is an instance of psychophysical response compression that is found in many magnitude estimation tasks—see Proffitt (2006) for an expanded discussion of why response compression leads to overestimation in slant perception. Another example of response compression occurs in the human sensitivity to light intensity. When asked to indicate when a change in brightness has occurred, dark-adapted people in a completely dark environment can detect the presence of a few photons of light. On the other hand, in a well-illuminated environment, it takes an increase of orders of magnitude more light before people can notice the change. This is an instance of psychophysical response compression. Its virtue in the case of luminance detection is that people have a higher sensitivity to changes in light intensity when ambient light is low compared to when it is high. Similarly for geographical slant perception, normative overestimation allows people to be more sensitive to changes in small slants—for example, noticing the difference between 5° and 6° compared to detecting a difference between 75° and 76°. Seeing differences in the former has real consequences for planning locomotion, whereas there are no behavioral consequences that depend upon detecting the difference between the latter two inclines.

There are two advantages associated with the increase in slant overestimation that occurs when the effort required to ascend hills is increased. First, increased overestimation implies an increase in sensitivity to small geographical slants.
This means that as the metabolic costs of ascending hills increase, people become more sensitive to hill slants. Second, when choosing walking speed, people need not relate their current ability to expend walking energy to the apparent slant of the hill. Instead, this relationship is immediately apparent in perception. Recall the discussion of how an iguana's body temperature influences its flight distance for an approaching predator. If the iguana sees an invariant flight distance in which its body temperature and resulting flight speed influence the apparent distance to the predator, then the iguana does not have to relate these variables when it is deciding when to flee. Similarly for people
viewing hills, by seeing their potential to expend energy in the perceived slant of inclines, people can decide how fast to walk based upon how steep the hill appears. They do not have to relate slant to their current physiological state because this has already been done in perception.

Regarding energetic influences on spatial perception, the research findings for egocentric distance perception are much the same as for slant. Distances to targets appear greater when people are encumbered by a backpack or have just gotten off a treadmill—an experience that causes an adaptation in which the visual/motor system learns that it takes forward walking effort to go nowhere and, consequently, that it takes more effort to walk a prescribed distance (Proffitt, Stefanucci et al., 2003). Across these studies, a variety of dependent measures were used, including verbal reports, visual match tasks, and blind walking, in which participants view a target, don a blindfold, and attempt to walk to the target's location without sight.

It has also been found that apparent distances on steep hills are expanded (Stefanucci, Proffitt, Banton, & Epstein, 2005). Steep hills require more energy to ascend; the hills that were assessed in these studies could not be ascended without considerable difficulty. The finding of increased perceived distance on hills presents a geometrical paradox. Given that the slant of hills is overestimated, when people look up a hill, the apparent distance to a target should be underestimated. Given that the angular elevation to the target does not change, the steeper the apparent hill, the shorter must be the egocentric extent along the ground to its location. This is what geometry requires. Such findings of geometrical inconsistencies in perception have been found in earlier studies on perceiving spatial layout (Epstein, 1977; Epstein, Park, & Casey, 1961; Sedgwick, 1986).
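The geometric constraint behind this paradox can be made concrete with a small trigonometric sketch (the eye height and distances below are illustrative assumptions, not values from the cited studies). For a viewer whose eye sits a fixed height above the base of a hill, the angular elevation of a target fixes the along-slope distance once a slant is assumed, and the steeper the assumed slant, the shorter that distance must be:

```python
import math

def slope_distance(eye_height, slant_deg, elevation_deg):
    """Along-slope distance to a target seen at a given angular elevation,
    for an eye at eye_height above the base of a hill with the given slant.
    Solves d*sin(slant) - eye_height = d*cos(slant)*tan(elevation) for d."""
    s = math.radians(slant_deg)
    e = math.radians(elevation_deg)
    return eye_height / (math.sin(s) - math.cos(s) * math.tan(e))

# A target 20 m up a real 5-degree hill, viewed from a 1.6 m eye height:
d, s, eye = 20.0, math.radians(5.0), 1.6
elevation = math.degrees(math.atan2(d * math.sin(s) - eye, d * math.cos(s)))

actual = slope_distance(eye, 5.0, elevation)    # recovers ~20 m by construction
implied = slope_distance(eye, 20.0, elevation)  # ~4.8 m if the hill looked 20 degrees
```

Geometry thus says that a hill seen as four times steeper should place the target at roughly a quarter of the distance; the empirical finding that distances on steep hills look longer, not shorter, is what makes the result a paradox.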
Manipulations that influence the effort required to walk may not affect the effort required to perform other distance-relative behaviors. For example, walking on a treadmill without experiencing optic flow causes an adaptation in which more effort is associated with walking to a target and, consequently, its apparent distance increases. However, treadmill walking does not affect the effort required to throw a beanbag to a target location. In accord with an action-specific approach to spatial perception, when people view a target—following a period of treadmill walking—with the intention of throwing a beanbag to its location, then the treadmill adaptation has no effect on their distance judgments (Witt, Proffitt, & Epstein,
2004). Perceiving distances in these cases is specific to what a person is intending to do next. Walking adaptation influences perception if a person is a "walker" but not a "thrower."

That perceived extent has been found to be action-specific is in accord with prior studies on perceptual-motor adaptation conducted by Rieser, Pick, Ashmead, and Garing (1995). In their experiments, Rieser et al. had participants walk on a treadmill that was placed on a trailer being pulled across a field by a tractor. Through this means, the rate of optic flow was decoupled from the rate at which participants were walking. Following this adaptation, participants were shown targets, and after being blindfolded, they attempted to walk to the target locations. Participants whose treadmill-walking rate was greater than the tractor's speed walked too far, and conversely, those who walked at a slower speed than that of the tractor walked too short a distance. Of particular relevance to the action-specificity argument, other participants who attempted to throw balls to the locations of targets were unaffected by the treadmill-walking adaptation.

As with slant perception, the advantage of perceiving distances in terms of walking energy is that long-term motor plans can be based upon perception as opposed to requiring that perception and physiological state be combined during the planning process. Recall that the body has both an exterior and an interior. Most of the behaviors performed in the environment with the body's exterior have as a goal the maintenance of a desired state in the body's interior, a good example being maintaining a desired rate of energy expenditure. Both aspects of the body are related in walking-specific perception: In the apparent surface layout of the ground, people see both the possibilities and the associated energetic costs for walking.

Consider another example from behavioral ecology.
A recent study used GPS to track the movement of elephants in northern Kenya (Wall, Douglas-Hamilton, & Vollrath, 2006). It was found that elephants almost never ascended steep hills, even when there was rich vegetation to be had. Wall et al. proposed that a principal reason for this reluctance to ascend hills was that, because of their body weight, elephants incur an enormous energetic cost when climbing. It would cost the elephants more calories to climb the hills than they would obtain by consuming the vegetation that could be obtained there. Wall et al. stated, "We conclude that megafauna probably take a rather different view of their surroundings than more lightweight animals. This is especially true if the heavyweights, like elephants,
are herbivores for which energy replenishment is so much more time consuming than it is for carnivores" (p. R528). We, of course, do not know what elephants perceive, but from the current perspective it seems likely that their apparent topography would be highly exaggerated so as to enhance their sensitivity to geographical slant and to relate the possibilities and associated energetic costs for obtaining food.
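Wall et al.'s argument is, at bottom, an energy-budget comparison. A back-of-envelope sketch (the masses, the 100 m rise, and the 25% muscle-efficiency figure are illustrative assumptions, not values from their paper) shows how steeply the cost of a climb scales with body mass:

```python
G = 9.81  # gravitational acceleration, m/s^2

def climb_kcal(mass_kg, vertical_rise_m, muscle_efficiency=0.25):
    """Rough metabolic cost (kcal) of lifting a body through a vertical rise,
    assuming mechanical work = m*g*h and a fixed muscle efficiency."""
    joules = mass_kg * G * vertical_rise_m / muscle_efficiency
    return joules / 4184.0  # joules per kilocalorie

elephant = climb_kcal(4000.0, 100.0)  # ~3,750 kcal for a 100 m climb
gazelle = climb_kcal(20.0, 100.0)     # ~19 kcal for the same climb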
Throwing

When throwing balls to targets, people perceive the distance to the targets relative to the effort associated with throwing (Witt, Proffitt, & Epstein, 2004). In these studies, participants viewed sports cones in a large open field and threw either light or heavy balls to their locations. After throwing the ball, participants reported the apparent distance to the cone and then threw the ball again. Depending upon the experiment, either verbal reports or a visual-match task served as the dependent measure. In both cases, distances were judged to be greater by those participants throwing heavy balls as opposed to light ones.

In another experiment, Witt et al. also showed that the influence of throwing was contingent upon viewing the target cone with the intention of throwing. The experiment had two groups, and both threw the heavy ball at targets. After throwing the ball, each group made a distance judgment. The groups differed in what they did next; one group attempted to throw the ball to the target while blindfolded, whereas the other group attempted to blind walk to the target's location. Thus, when viewing the target, one group anticipated throwing and the other anticipated walking to the target location. The "throwers" judged the target cones to be farther away than did the "walkers." Throwing heavy balls makes targets appear farther away, but only if one is about to throw again. Recall that Witt et al. also showed the converse of these findings; treadmill-walking adaptation influenced apparent distance only when participants anticipated walking to a target and not when they anticipated throwing a beanbag to its location instead.

A final set of studies was conducted to assess whether these findings were due to changes in perception itself, or to some action-specific postperceptual process (Witt, Proffitt, & Epstein, 2006). Two
groups of participants were adapted to walking on a treadmill. Both groups then viewed a target, but with different expectations. One group anticipated that after donning a blindfold, they would attempt to walk to the target's location. The other group expected to throw a beanbag to the target location while blindfolded. Thus, one group viewed the target as "walkers," whereas the other group viewed it as "throwers." Both groups put on the blindfolds, and those in the walking group attempted to blind walk to the target location, as expected. After donning their blindfolds, the "throwers" were told that a mistake had been made in the instructions and that, in fact, they were to attempt to walk to the target location while blindfolded. Those participants in the walking condition were influenced by the treadmill-walking adaptation and walked farther than those participants in the throwing condition. The "walkers" viewed the target relative to the energy required to walk to it, and thus, they were influenced by the treadmill adaptation. The "throwers" had experienced the same treadmill adaptation; however, because this experience influenced the effort associated with walking but not throwing, the former being an action they had not anticipated, they were unaffected by the adaptation. In other words, it seems to be anticipated effort that induces recalibration. In a control experiment, this experiment's design was repeated except that the treadmill-walking adaptation was eliminated and, in this case, the groups did not differ in their blind walking. Note that, in the initial experiment, both groups did exactly the same thing; both were adapted to treadmill walking and both attempted to blind walk to a target. The only difference between the groups was their behavioral intention when they viewed the target.
The results indicate that each group's spatial perceptions, as calibrated by required energy, were specific to these behavioral intentions.
Falling

Falling is an inherent danger associated with human locomotion. An adult could be injured by a slip and fall; body size matters:

A 2 m tall man, when tripping, will have a kinetic energy upon hitting the ground 20-100 times greater than a small child who learns to
walk; whereas adults occasionally break a bone when tripping, children never do. (Went, 1968, p. 407)
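Went's 20-100 figure follows from simple scaling: the kinetic energy at impact roughly equals the potential energy released by the falling center of mass, m·g·h, and since mass grows with roughly the cube of stature while center-of-mass height grows linearly, fall energy scales with about the fourth power of height. A sketch with assumed (hypothetical) body dimensions:

```python
G = 9.81  # gravitational acceleration, m/s^2

def fall_energy_joules(mass_kg, com_height_m):
    """Approximate kinetic energy (J) at impact after a trip-and-fall,
    taking the drop of the center of mass as the fall height."""
    return mass_kg * G * com_height_m

# Hypothetical figures: an 80 kg, 2.0 m adult with center of mass ~1.1 m,
# versus a 10 kg, 0.8 m toddler with center of mass ~0.4 m.
adult = fall_energy_joules(80.0, 1.1)    # ~863 J
toddler = fall_energy_joules(10.0, 0.4)  # ~39 J
ratio = adult / toddler                  # ~22x, inside Went's 20-100x range
```

With these assumed dimensions the adult hits the ground with roughly twenty times the toddler's energy; taller or heavier adults push the ratio toward the upper end of Went's range.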
The cost of injury increases with locomotion speed and, in the case of falling from a height above the ground, with altitude. As locomotion speed and altitude increase, fear of falling becomes palpable. All of these factors—speed, altitude, and fear of falling—have been found to influence spatial perception.

Stefanucci, Proffitt, and Clore (2005) investigated how the risk of falling on a steep hill at a high speed might affect geographical slant perception. Participants viewed a steep sidewalk from the top, either standing on a skateboard or on a box of equivalent height. Given the steepness and extent of the sidewalk, descending on the skateboard would be very fast and risky. As in prior studies, explicit awareness of geographical slant was assessed with verbal reports and a visual matching task. The visually guided palmboard was also used. In addition, participants provided rating-scale judgments about how fearful they were of descending the sidewalk. It was found that the sidewalk appeared steeper—as assessed by the explicit awareness measures—for those participants who were standing on the skateboard and reported feeling frightened compared to those who stood on the box and reported little or no fear. The visually guided palmboard adjustments were unaffected by both the skateboard manipulation and reported levels of fear. Although participants knew that they would not actually have to ride the skateboard down the hill, fear of an action seems to elicit the same kind of processing as anticipating its performance.

Being at the edge of a high drop-off, such as a cliff or balcony, makes people uneasy, as the penalty for falling could entail severe injury or worse. Jackson and Cormack (in press) found that people overestimate vertical distances and, most importantly, that their overestimation is much greater when the height is viewed from above than from below.
Jackson and Cormack concluded that the greater overestimation exhibited when heights are viewed from above is a consequence of an evolved adaptation which, through perceptual exaggeration, motivates people to avoid falling off heights. Stefanucci and Proffitt (2006) similarly found that high vertical extents are overestimated much more from the top than from the bottom. Participants used a visual matching task to judge vertical extent. They either stood atop a 26-foot balcony looking down or
they stood at the bottom looking up. The height of the balcony was overestimated by about 60% from the top and by slightly less than 30% from the bottom. In addition, participants provided rating-scale judgments of their fear of falling. It was found that the assessed anxiety related to falling was positively correlated with distance estimations. These studies suggest that an emotion, in this case fear, influences spatial perception. Other emotional influences on spatial perception have also been demonstrated (Riener, Stefanucci, Proffitt, & Clore, 2003).

Hitting and Putting

People who play sports often report that the spatial dimensions of balls, goals, hurdles, swimming pools, and so forth appear to be influenced by how well they are performing. Baseball, which has a rich journalistic tradition, provides many examples of apparent ball size being influenced by hitting performance. When describing a massive home run, Mickey Mantle said, "I never really could explain it. I just saw the ball as big as a grapefruit" (Ultimate New York Yankees, n.d.). George Scott of the Boston Red Sox said, "When you're hitting the ball [well], it comes at you looking like a grapefruit. When you're not, it looks like a blackeyed pea" (Baseball Almanac, n.d.). During a slump, Joe "Ducky" Medwick of the St. Louis Cardinals said he felt like he was "swinging at aspirins" (ESPNMAG.com, n.d.).

Witt and Proffitt (2005) found that batters' hitting performance does, in fact, influence their recollection of the ball's size. Softball players were approached after completing a game and were asked to indicate the size of a softball by selecting one of many differently sized circles, which were displayed on a poster board. Afterwards, their batting average for the completed game was obtained. The recalled size of a softball was found to be positively correlated with players' batting averages.
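The softball analysis is a simple correlation between each player's game batting average and the circle size he or she selected. A minimal sketch of that computation, using made-up numbers rather than Witt and Proffitt's data:

```python
# Hypothetical batting averages and chosen circle diameters (cm);
# illustrative only, not data from Witt and Proffitt (2005).
averages = [0.150, 0.250, 0.333, 0.400, 0.500, 0.600]
diameters = [9.0, 9.4, 9.6, 10.1, 10.0, 10.6]

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

r = pearson_r(averages, diameters)  # positive r: better hitting, bigger recalled ball
```

A positive r of this form is what the study reports; the computation itself is standard and carries no claim about the actual effect size.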
Golf is another sport in which reports abound of apparent spatial distortions. Golfers will claim that when they are putting well, the hole looks as big as a basket, and when their putting is off, the hole can look as small as a dime. Witt, Linkenauger, Bakdash, and Proffitt (2006) tested golfers after they had completed an 18-hole round. Similar to the study with softball players, the golfers were asked to choose, from among many circles, the one that was the size of a golf
hole. Following this, other information about their round and golfing ability was obtained. It was found that apparent hole size was negatively correlated with the golfers' scores for the 18-hole round. Because, in golf, low scores are good, this implies that the recollected hole size was positively related to performance. Of particular interest, it was found that apparent hole size was not related to how good a player was as assessed by his or her handicap. This implies that good players do not always see the hole as bigger, but rather that anyone will see the hole as being bigger on days when he or she is playing better. Finally, it was found that apparent hole size was correlated with putting performance on the last hole but not with overall score on the last hole, suggesting that these effects are specific to the relevant task, which was putting.

Both the softball and putting studies obtained size judgments from memory. Participants had completed play and were not looking at the ball or putting hole. Thus, the results of these studies could be due to performance influences on perception, memory, or both. There are, however, other results that suggest that perception could have been affected. Wesp, Cichello, Gracia, and Davis (2004) conducted a study on dart throwing and perceived target size. They found that participants who were more successful in hitting the target viewed it to be bigger than did participants who performed less well. In their study, the target was visibly present when the size judgment was made.

Putting What, Where, and How Together

Current views on the two cortical visual systems make claims, not only about the anatomical localization of visual functions, but also about consciousness (Goodale & Milner, 2004; Milner & Goodale, 1995). With respect to function, the ventral stream has been defined as the "what" system and is responsible for object identification.
The dorsal stream is responsible for processing "where" and "how," terms that refer to spatial localization and the visual guidance of actions (Creem & Proffitt, 2001a). The two cortical pathways have also been implicated in accounts of the neural correlates of consciousness. Milner and Goodale (1995) proposed that conscious awareness was associated with visual processing in the ventral but not the dorsal stream. They suggested that the term perception should be applied
to conscious visual awareness, whereas the visual guidance of action should be considered to be a distinct, unconscious visuomotor process.

The evidence for Milner and Goodale's proposal is well known (Goodale & Milner, 2004; Milner & Goodale, 1995). Patients with brain damage in the temporal lobe may have very limited shape awareness, and yet they can accommodate their grasp when picking up objects as well as fully sighted persons. Conversely, patients with parietal damage have unaffected shape perception abilities but have difficulty grasping objects effectively. Some have gone so far as to describe the dorsal stream as a "zombie," an instance of a system that is capable of guiding behavior, but without any attendant consciousness or will (Koch, 2004).

It is difficult to imagine, however, how spatial perception could be action-specific without dorsal processes contributing to conscious experience. Consider the influences of reachability and graspability on distance perception. Objects appear closer when they can be touched with a tool compared to when no tool is held and the objects are out of hand's reach. Visually guided reaching is a function of the dorsal stream. The cells in macaque monkeys that responded to reachable raisins are found in the dorsal stream (Iriki et al., 1996). With respect to grasping, tools appear closer to right-handed people when the handles are oriented toward the right as opposed to the left hand and, consequently, are easier to grasp (Linkenauger et al., 2006). The human brain areas that show sensitivity to the orientation of tools in fMRI studies are in the dorsal stream (Valyear et al., 2006).

This chapter has argued that spatial perception relates to and is influenced by the visually specified environment, the body, and purpose.
It has been suggested that people see spatial layout in terms of the actions that they intend to perform and the bodily opportunities and costs of these actions. Anticipated actions are formative in how spatial relationships are perceived. Recall the study in which people walked on a treadmill and thereby acquired a visual/motor adaptation in which the effort associated with walking an extent was increased. Following this adaptation, targets appear farther away if participants anticipate walking to their locations but not if they are about to throw a beanbag to them (Witt, Proffitt, & Epstein, 2004). The adaptation acquired through treadmill walking affects "walkers" but not "throwers." The world is seen relative to the behavior that is about to be performed.
The what, where, and how of the two visual streams are coordinated and combined in perception in accordance with the choice of purposive behaviors. By choosing to reach, grasp, walk, or throw, a person sees the world in terms of their body's abilities to perform these intended actions and also in relation to the inherent costs associated with their performance.
References

Baseball Almanac (n.d.). Retrieved May 18, 2004, from http://www.baseball-almanac.com/players/player.php?p=scottge02
Bekkering, H., & Neggers, S. F. W. (2002). Visual search is modulated by action intentions. Psychological Science, 13, 370–374.
Berti, A., & Frassinetti, F. (2000). When far becomes near: Remapping of space by tool use. Journal of Cognitive Neuroscience, 12, 415–420.
Bhalla, M., & Proffitt, D. R. (1999). Visual-motor recalibration in geographical slant perception. Journal of Experimental Psychology: Human Perception and Performance, 25, 1076–1096.
Carey, D. P., Harvey, M., & Milner, A. D. (1996). Visuomotor sensitivity for shape and orientation in a patient with visual form agnosia. Neuropsychologia, 34, 329–337.
Castiello, U. (2005). The neuroscience of grasping. Nature Reviews Neuroscience, 6, 726–736.
Cowey, A., Small, M., & Ellis, S. (1994). Left visuo-spatial neglect can be worse in far than in near space. Neuropsychologia, 32, 1059–1066.
Creem, S. H., & Proffitt, D. R. (2001a). Defining the cortical visual systems: "What," "where," and "how." Acta Psychologica, 107, 43–68.
Creem, S. H., & Proffitt, D. R. (2001b). Grasping objects by their handles: A necessary interaction between cognition and action. Journal of Experimental Psychology: Human Perception and Performance, 27, 218–228.
Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), Perception of space and motion (pp. 69–117). San Diego, CA: Academic Press.
Domini, F., Caudek, C., & Tassinari, H. (2006). Stereo and motion information are not independently processed by the visual system. Vision Research, 46, 1707–1723.
Epstein, W. (1977). Stability and constancy in visual perception: Mechanisms and processes. New York: Wiley.
Embodiment, Ego-Space, and Action
Epstein, W., Park, J., & Casey, A. (1961). The current status of the size-distance hypotheses. Psychological Bulletin, 58, 491–514.
ESPNMAG.com (n.d.). Retrieved May 18, 2004, from http://espn.go.com/magazine/vol5no11ichiro.html
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Gonzalez, C. L., Ganel, T., & Goodale, M. A. (2006). Hemispheric specialization for the visual control of action is independent of handedness. Journal of Neurophysiology, 95, 3496–3501.
Goodale, M. A., & Milner, D. (2004). Sight unseen: An exploration of conscious and unconscious vision. Oxford: Oxford University Press.
Halligan, P. W., & Marshall, J. C. (1991). Left neglect for near but not far space in man. Nature, 350, 498–500.
Iriki, A., Tanaka, M., & Iwamura, Y. (1996). Coding of modified body schema during tool use by macaque postcentral neurons. NeuroReport, 7, 2325–2330.
Jackson, R. E., & Cormack, L. K. (in press). Evolved navigation theory and the descent illusion. Perception & Psychophysics.
Koch, C. (2004). The quest for consciousness: A neurobiological approach. Englewood, CO: Roberts.
Krebs, J. R., & Davies, N. B. (1993). An introduction to behavioural ecology (3rd ed.). Malden, MA: Blackwell.
Land, M. F. (2006). Eye movements and the control of actions in everyday life. Progress in Retinal and Eye Research, 25, 296–324.
Landy, M. S., Maloney, L. T., Johnson, E. B., & Young, M. J. (1995). Measurement and modeling of depth cue combination: In defense of weak fusion. Vision Research, 35, 389–412.
Lavadas, E. (2002). Functional and dynamic properties of visual peripersonal space. Trends in Cognitive Sciences, 6, 17–22.
Linkenauger, S. A., Witt, J., Stefanucci, J., & Proffitt, D. R. (2006). Ease to grasp an object affects perceived distance. Journal of Vision, 6(6), 724a.
Milner, D. A., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Pegna, A. J., Petit, L., Caldara-Schnetzer, A.-S., Khateb, A., Annoni, J.-M., Sztajzel, R., et al. (2001). So near yet so far: Neglect in far or near space depends on tool use. Annals of Neurology, 50, 820–822.
Proffitt, D. R. (2006). Embodied perception and the economy of action. Perspectives on Psychological Science, 1, 110–122.
Proffitt, D. R., & Caudek, C. (2002). Depth perception and perception of events. In A. F. Healy & R. W. Proctor (Vol. Eds.), I. B. Weiner (Editor-in-Chief), Handbook of psychology: Vol. 4. Experimental psychology (pp. 213–236). New York: Wiley.
Proffitt, D. R., Bhalla, M., Gossweiler, R., & Midgett, J. (1995). Perceiving geographical slant. Psychonomic Bulletin & Review, 2, 409–428.
Proffitt, D. R., Stefanucci, J., Banton, T., & Epstein, W. (2003). The role of effort in perceiving distance. Psychological Science, 14, 106–112.
Riener, C. R., Stefanucci, J. K., Proffitt, D. R., & Clore, G. (2003). An effect of mood on perceiving spatial layout [Abstract]. Journal of Vision, 3(9), 227a.
Rieser, J. J., Pick, H. L., Ashmead, D. H., & Garing, A. E. (1995). Calibration of human locomotion and models of perceptual-motor organization. Journal of Experimental Psychology: Human Perception and Performance, 21, 480–497.
Rocha, C. F. D., & Bergalo, H. G. (1990). Thermal biology and flight distance of Tropidurus oreadicus (Sauria Iguanidae) in an area of Amazonian Brazil. Ethology, Ecology, and Evolution, 2, 263–268.
Sedgwick, H. (1986). Space perception. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance (Vol. 1, pp. 1–57). New York: Wiley.
Stankowich, T., & Blumstein, D. T. (2005). Fear in animals: A meta-analysis and review of risk assessment. Proceedings of the Royal Society, 272, 2627–2634.
Stefanucci, J. K., & Proffitt, D. R. (2006). Looking down from high places: The roles of altitude and fear in the perception of height. Journal of Vision, 6(6), 723a.
Stefanucci, J. K., Proffitt, D. R., Banton, T., & Epstein, W. (2005). Distances appear different on hills. Perception & Psychophysics, 67, 1052–1060.
Stefanucci, J. K., Proffitt, D. R., & Clore, G. (2005). Skating down a steeper slope: The effect of fear on geographical slant perception. Journal of Vision, 6(6), 723a.
Ultimate New York Yankees (n.d.). Retrieved May 18, 2004, from http://www.ultimateyankees.com/MickeyMantle.htm
Valyear, K. F., Culham, J. C., Sharif, N., Westwood, D., & Goodale, M. A. (2006). A double dissociation between sensitivity to changes in object identity and object orientation in the ventral and dorsal visual streams: A human fMRI study. Neuropsychologia, 44, 218–228.
Wall, J., Douglas-Hamilton, I., & Vollrath, F. (2006). Elephants avoid costly mountaineering. Current Biology, 16, R527–R529.
Went, F. W. (1968). The size of man. American Scientist, 56, 400–413.
Wesp, R., Cichello, P., Gracia, E. B., & Davis, K. (2004). Observing and engaging in purposeful actions with objects influences estimates of their size. Perception & Psychophysics, 66, 1261–1267.
Witt, J. K., Linkenauger, S. A., Bakdash, J. Z., & Proffitt, D. R. (2006). Golf performance can make the hole look as big as a bucket or as small as a dime. Unpublished manuscript, University of Virginia, Charlottesville.
Witt, J. K., & Proffitt, D. R. (2005). See the ball, hit the ball: Apparent ball size is correlated with batting average. Psychological Science, 16, 937–938.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2004). Perceiving distance: A role of effort and intent. Perception, 33, 577–590.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2005). Tool use affects perceived distance, but only when you intend to use it. Journal of Experimental Psychology: Human Perception and Performance, 31, 880–888.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2006). Effects of effort and intention on perception: The locus of the effect. Journal of Vision, 6(6), 721a.
Yarbus, A. (1967). Eye movements and vision. New York: Plenum Press.
Ydenberg, R. C., & Dill, L. M. (1986). The economics of fleeing from predators. Advances in the Study of Behavior, 16, 229–249.
7 The Affordance Competition Hypothesis: A Framework for Embodied Behavior
Paul Cisek
Introduction

In recent years, the term embodiment has appeared with increasing frequency in psychological discussions. Like many new terms, its meaning is not generally agreed upon, and what is implied by embodiment in one context does not always apply to another. The 34th Carnegie Symposium on Cognition defined embodiment in terms of a "representation of the self in the world" and of the ability of animals to use a representation of their own bodies to aid in perceiving, understanding, and acting (this volume). These approaches suggest that many of the internal representations we employ to understand the world are expressed with reference to the body, and that the body itself may sometimes serve as a medium of representation. In this sense, embodiment is seen as an important aspect of "cognition," of the representational schemes with which we know our world.

Some views on embodiment, however, see it as much more. The brain has always existed within a body, which has always defined the activities in which the brain might engage. From a developmental and evolutionary perspective, it has been argued that bodily interactions are the foundation within which cognition itself emerges (Clark, 1997; Hendriks-Jansen, 1996; Thelen, Schöner, Scheier, & Smith, 2001), and that our ability to perceive the world and to understand its meaning is grounded within the context of sensorimotor interaction (Piaget, 1963). In other words, embodiment is not an aspect of cognition; cognition is an aspect of embodiment. Indeed, the term can be seen as being so fundamental that its use becomes redundant. Obviously, all behavior is embodied.

What then is the use of the term embodiment? Why do we state that which is obvious? Perhaps part of the reason is that the field of psychology, in its desire to understand the mind, has for a long time neglected the body, and in recent years this neglect has been recognized as a fault that needs to be remedied. In this chapter, I will review proposals which suggest not only that the body needs to be brought back into psychological discussions, but that psychological theories should be reconstructed upon a foundation of bodily interaction.

Probably the major reason why psychology has neglected the body is a historical one. There has for a long time existed a wide conceptual gap between the study of psychological and biological phenomena, a gap maintained by the difference in subject matter as well as by the specialization of research programs and university faculty. Here, I take a brief look at the history of psychology and what I perceive as some of the reasons behind its neglect of embodiment (Cisek, 1999).

The study of psychology is rooted in philosophy, which had for a long time been inextricably entwined with religion and the nearly unquestioned distinction between the physical body and a spiritual soul.
This foundation had forced philosophers into the position now known as "dualism" (or, more precisely, "substance dualism"), which suggests that the mind and the brain are different entities belonging to different realities. To dualists, the interaction between the material world and the immaterial mind is achieved through two kinds of interfaces: Perception, which communicates the status of the world to the mind, and Action, which plays out the mind's wishes onto the world (Fig. 7.1a). It was against this backdrop of dualism that psychology, the study of the "psyche," was first defined as a science in 1879 by Wilhelm Wundt (1832–1920). The first major paradigm in psychology was Edward Titchener's (1867–1927) "structuralism," a strict introspective methodology for discovering the building blocks
of the mind. This was an attempt at a scientific and reductionist explanation of the mind, undaunted by its nonmaterial nature.

While philosophers and psychologists were grappling with understanding the mind, biologists were making progress in understanding the body. From their perspective the brain was an organ like any other, and could be seen as a mechanism operating by known physical principles. After physiologists such as Edward Thorndike (1874–1949) and Ivan Pavlov (1849–1936) formalized laws of learning, the idea that even complex behavior might be explained in purely
Figure 7.1 A comparison of the schematic functional architectures used in psychological theories. (a) The scheme of dualism, in which perception and action function as interfaces between the physical world and the non-physical mind. (b) The behaviorist scheme, which discards the concept of a non-physical mind. (c) The cognitive scheme, in which the mind is replaced by a purely physical process called cognition. (d) The functional architecture of an "information processing system" proposed by Newell & Simon (1972) (reprinted with permission).
physical terms became increasingly persuasive. These views led to the development of behaviorism, credited to Watson (1913). Behaviorism insisted that psychology must be a purely objective branch of biology, and that the study of subjective phenomena which cannot be observed has no place in a science of behavior. The subject matter of psychology was no longer the discovery of the constituent elements of consciousness, but rather the study of the direct linkage between Perception and Action and of the learning laws which establish that linkage (Fig. 7.1b). This attitude dominated psychology for the first half of the 20th century, and a great deal of progress was made on discovering the laws of learning.

However, as psychologists investigated more and more complex behavior, they came up more and more often against phenomena which resisted a behaviorist explanation. A classic example is the behavior of a rat running through a maze. Tolman (1948) showed that when a familiar path through a maze was blocked, rats were able to navigate through another path which also led to the reward, even when that path had never been previously explored and never reinforced. This result suggested that the rats built abstract "cognitive maps" of their surroundings, and that they were able to use such maps to solve the maze-running problem. To explain the rat's behavior, an internal state had to be proposed.

More significant shortcomings of behaviorism were revealed when it was applied to human language. B. F. Skinner's (1957) book Verbal Behavior proposed to explain a child's acquisition of language using behaviorist principles such as operant conditioning and perceptual discrimination. His account, however, met with powerful criticisms, especially that of Noam Chomsky (1959), who argued persuasively that Skinner's theory could not possibly account for the great flexibility of language.
Chomsky concluded that mental operations must play a role in the performance of linguistic utterances. Mental states had to exist, but what form were they to take?

In the early part of the 20th century, several new concepts were emerging in fields seemingly unrelated to psychology. First, Alan Turing's pioneering work in machine theory resulted in a formal definition of "computation": according to Turing (1936), all computation is formally equivalent to the manipulation of symbols in a temporary buffer. Second, research aimed at the development of more efficient telephone communication resulted in a formal definition of "information": according to Shannon and Weaver (1949), the informational content of a signal is inversely related to the probability of
that signal arising from randomness. Although these advances were made in engineering, they had a profound impact on psychology. In particular, they led to a very influential metaphor: that the brain is like a computer, a device which processes information (Block, 1990; Johnson-Laird, 1988; Pylyshyn, 1984).

The computer metaphor was valuable to psychologists because with it, answers to several existing dilemmas all fell into place. First, it suggested a language capable of describing internal states and internal processes in purely physical terms. The m-configurations of Turing's computing machine (Turing, 1936) were analogous to memories, and programs were analogous to internal processes which could link perception (input) with action (output). Second, these concepts could be used to generate precise hypotheses that could be tested in the laboratory. For example, information theory could be used to quantify the capacity of human sensorimotor processing (Fitts, 1954) and working memory (G. A. Miller, 1956). Third, the computer metaphor proposed a functional architecture for behavior (Fig. 7.1c), based on the concept that perception is like input processing, action is like output, and cognition is like computation (Pylyshyn, 1984). The architecture of a general problem solver (Newell & Simon, 1972; see Fig. 7.1d) has been inherited as the conceptual foundation of many psychology textbooks (e.g., Best, 1986; Dodd & White, 1980).

Finally, the computer metaphor addressed the biology-psychology gap by explaining why biological phenomena should be expected to be different from psychological phenomena. The analogy was simple: biology studies the hardware, and psychology studies the software (Block, 1995). It was shown that the information processing functions of computers are independent of their hardware implementations, a proposal known as the Church-Turing thesis (Hofstadter, 1979).
Through the analogy between computers and brains, this motivated the attitude that the computational and algorithmic aspects of behavior can be studied independently of the study of their biological substrates (Marr, 1982). To explain a mental phenomenon, it is sufficient to express it as a computation. This doctrine essentially gave psychologists the mandate to study mentality without worrying about its implementation in the brain. Founded on the assumption that cognition is like computation, cognitive psychology developed for a long time in deliberate isolation from biological data (Best, 1986; Block, 1995; Marr, 1982; Pinker, 1997). The gap between biology and psychology was allowed to widen again.
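The formal definitions mentioned above can be made concrete with a short sketch. Shannon's measure assigns a signal of probability p an information content of -log2(p) bits, and Fitts (1954) applied the same logic to aimed movement, rating a movement of distance D to a target of width W with an index of difficulty log2(2D/W) (the original 1954 formulation; later variants differ). The function names below are mine, for illustration only:

```python
import math

def surprisal_bits(p: float) -> float:
    """Shannon's measure: a signal of probability p carries -log2(p) bits,
    so rarer signals carry more information."""
    return -math.log2(p)

def fitts_index_of_difficulty(distance: float, width: float) -> float:
    """Fitts's (1954) index of difficulty, ID = log2(2D / W): aimed
    movement treated as transmission through a noisy motor channel."""
    return math.log2(2 * distance / width)

# A coin flip carries 1 bit; a 1-in-8 outcome carries 3 bits.
print(surprisal_bits(0.5))    # 1.0
print(surprisal_bits(0.125))  # 3.0

# Doubling target distance (or halving target width) adds one bit.
print(fitts_index_of_difficulty(16.0, 2.0))  # 4.0
print(fitts_index_of_difficulty(32.0, 2.0))  # 5.0
```

This is the sense in which information theory let psychologists quantify sensorimotor capacity: performance could be expressed in bits per second rather than in task-specific units.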
In recent years, the trend is being reversed. Advances in neuroscience have made it possible to study neural activity in behaving animals and to noninvasively detect activity in the human brain. Biological phenomena are increasingly interpreted in terms of psychological concepts such as attention, working memory, and perceptual categorization. This project of bringing biology and psychology together again has come to be known as "cognitive neuroscience," an effort to "map elementary cognitive functions onto specific neuronal systems" (Albright, Kandel, & Posner, 2000, p. 613). Of course, as the name reveals, cognitive neuroscience is heavily influenced by the currently dominant view in psychology: cognitive psychology. In particular, it has inherited cognitive psychology's emphasis on representations of knowledge which are abstract and disembodied.

Here, I would raise a concern. Should we expect that a conceptual framework which developed for many years in deliberate isolation from biology would provide the most promising foundation for interpreting neural data? Most cognitive neuroscience research assumes the answer to be yes (Albright et al., 2000; Gazzaniga, 2000). Consequently, many neurophysiological experiments are designed to explicitly test specific concepts from cognitive psychology. However, many of those concepts were originally developed to explain a very limited set of behavioral capacities, such as human problem solving, and were not necessarily intended to be applied to all of behavior (Newell & Simon, 1972; Pylyshyn, 1984). Because advanced cognitive abilities evolved quite recently, they could not have influenced the evolution of the fundamental neuroanatomical organization of the brain (which is remarkably conserved among all mammals). Indeed, recent neurophysiological work often seems quite at odds with these concepts.
For example, studies of the cerebral cortex have encountered difficulties in interpreting neural activity in terms of distinct perceptual, cognitive, or motor systems. Visual processing diverges in the cortex into separate systems sensitive to object identity and spatial location (Ungerleider & Mishkin, 1982), with no single representation of the world (Stein, 1992), leading to the question of how these disparate systems are bound together to form a unified percept (Reynolds & Desimone, 1999; Singer, 2001; von der Malsburg, 1996). Individual neurons in the posterior parietal cortex appear to reflect a mixture of sensory (Andersen, 1995; Colby & Goldberg, 1999), motor (Snyder, Batista, & Andersen, 1997), and cognitive information (Platt &
Glimcher, 1999), leading to persistent debates on their functional role. A recent review of data on the parietal cortex has suggested that "current hypotheses concerning parietal function may not be the actual dimensions along which the parietal lobes are functionally organized; on this view, what we are lacking is a conceptual advance that leads us to test better hypotheses" (Culham & Kanwisher, 2001, pp. 159–160). In other words, perhaps the concepts of separate perceptual, cognitive, and motor systems, which theoretical neuroscience inherits from cognitive psychology, are not appropriate for bridging neural data with behavior.

In the late 1980s, Patricia Churchland (1987) suggested that traditional philosophical and psychological approaches face a major challenge from the growing knowledge of neurophysiology, and that some of the central concepts of current theories may require serious reevaluation. Importantly, the brain is not a general problem solver, as envisaged by Newell and Simon (1972), but a system that evolved to meet the particular needs of interactive behavior (Hardcastle, 1995). Sterelny (1989) and Hendriks-Jansen (1996) have suggested that the basic tenets of cognitive psychology are not compatible with the brain's evolutionary heritage, and thus are not biologically plausible. Many similar critiques continue to emerge from diverse directions, and are periodically compiled into books and special issues of journals (e.g., Núñez & Freeman, 2000; Still & Costall, 1991). Many of these critiques emphasize a central recurring theme: that the function of the brain must be viewed in the context of how it controls behavior. In other words, psychological theories can and should be based upon a foundation that emphasizes interactive, embodied behavior instead of one that emphasizes passive acquisition of disembodied knowledge.
This central theme has led to several recent viewpoints, known at different times as "embodied cognition" (Clark, 1997; Thelen et al., 2001; Thompson & Varela, 2001), the "dynamical theory" (Adams & Mele, 1989; Beer, 2000), and "situated robotics" (Brooks, 1991; Harvey, Husbands, & Cliff, 1993; Hendriks-Jansen, 1996). All of these are really modern incarnations of several lines of thought that are much older (Ashby, 1965; Gibson, 1979; Maturana & Varela, 1980; Mead, 1938; Merleau-Ponty, 1945; Powers, 1973), in some cases by over a hundred years (Bergson, 1896; Dewey, 1896; Jackson, 1884). Most of these viewpoints emphasize the pragmatic aspects of behavior (Gibson, 1979; Millikan, 1989; Piaget, 1963), a theme that underlies
several proposals regarding representation (Dretske, 1981; Gallese, 2000; Hommel, Müsseler, Aschersleben, & Prinz, 2001), memory (Ballard, Hayhoe, & Pelz, 1995; Glenberg, 1997), and visual consciousness (O'Regan & Noë, 2001).

Below, I outline a theoretical framework which is based on many of these proposals, and discuss several examples of neural data which I believe to be consistent with concepts of embodiment. The main purpose is to demonstrate that these concepts may in fact be far more conducive to the functional interpretation of neural data than the much more widely accepted tenets of cognitive psychology. This is certainly not meant to deny that abstract knowledge acquisition occurs within the brain, but simply to suggest that a large portion of neural organization may be better understood as serving interactive behavior. There is certainly a place for abstract representations in any theory of behavior, including those which, like the present one, emphasize pragmatic issues of interactive control.
An Embodied Framework for Behavior

The theoretical framework described below is based on two central concepts. The first is a definition of "representation" which emphasizes pragmatic demands, as opposed to cognitive demands. The second is a functional decomposition of behavior which is an alternative to the traditional decomposition into perceptual, cognitive, and motor systems. In this section, these concepts are introduced in an informal way, and in the following section, they are used to interpret diverse neural data.
Descriptive and Pragmatic Representations

An important distinction which will be used here is one between "descriptive" and "pragmatic" representations. Descriptive representations are those usually discussed in cognitive psychology and neuroscience. They are patterns of activity which convey information about the world or the organism, preserving some kind of relation between the representation and whatever it encodes (the "referent"). This applies to internal imagelike representations which preserve topological or configurational relations in the world, to hierarchical symbolic models of shapes, and to more abstract representations
The Affordance Competition Hypothesis
211
which stand for logical propositions such as "it is daytime." The emphasis here is on descriptive accuracy, or the extent to which the representation makes explicit some information about its referent.

In contrast, "pragmatic representations" are not primarily concerned with descriptive accuracy at all. Instead, they are most concerned with the efficiency of the behavior to which they contribute. For example, intermediate internal states involved in the guidance of a voluntary reaching movement need not explicitly correlate with any single feature of the movements involved (joint angles, target location) as long as the behavior they help to guide is effective in accomplishing its goal; that is, as long as the pattern conveyed to downstream systems leads those systems to perform the movement correctly (Fetz, 1992). Although this will require some implicit correlation with external features, it does not preclude mixtures of such features. For example, variables correlating with the direction of a reaching movement can be mixed with variables correlating with the reward associated with the reach target (see below). As long as the mixture of features leads to adaptive behavior, the confounding of variables within a single neural population is perfectly acceptable and will be supported by natural selection.

Clearly, the distinction between "descriptive" and "pragmatic" representations is not absolute, and various intermediate kinds of internal states are conceivable. In fact, because the bottom line of natural selection is survival, which ultimately depends solely on overt behavior, every descriptive representation must ultimately serve some pragmatic role in the system. Likewise, a pragmatic representation does have to at least implicitly correlate with external variables (possibly several at once).
However, making the distinction allows one to vary the emphasis of one's interpretations. Below, I discuss examples of neural activities which have been interpreted as descriptive representations, but which may be better understood in terms of pragmatic representations.

Of course, any concept of a representation can only be meaningfully discussed within the context of the functional architecture within which it is used. The next section addresses this issue.
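The idea that a confounded code can still guide behavior correctly can be illustrated with a toy sketch (my own construction, not a model from this chapter): each of 16 hypothetical units has cosine tuning for reach direction, multiplied by a gain that depends on expected reward. Every individual unit confounds direction with reward, yet a downstream population-vector read-out recovers the same movement direction at any reward level; only the vector's length changes.

```python
import math

N = 16  # hypothetical units with evenly spaced preferred directions
PREFERRED = [2 * math.pi * i / N for i in range(N)]

def population_activity(target_angle, reward):
    """Each unit's firing confounds movement direction (cosine tuning)
    with expected reward (a multiplicative gain): a 'pragmatic' code."""
    return [reward * (1 + math.cos(target_angle - pd)) / 2 for pd in PREFERRED]

def decoded_direction(activity):
    """A downstream reader recovers movement direction with a population
    vector; the reward gain scales only the vector's length, not its angle."""
    x = sum(a * math.cos(pd) for a, pd in zip(activity, PREFERRED))
    y = sum(a * math.sin(pd) for a, pd in zip(activity, PREFERRED))
    return math.atan2(y, x)

target = math.radians(60)
low = decoded_direction(population_activity(target, reward=0.2))
high = decoded_direction(population_activity(target, reward=1.0))
print(round(math.degrees(low)), round(math.degrees(high)))  # 60 60
```

The point is not that the brain uses exactly this scheme, but that a mixture of behavioral variables in a single population is harmless so long as downstream systems act correctly on it.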
The Affordance Competition Hypothesis

Behavior is commonly broken down into three general kinds of processes: perceptual processes which build an internal descriptive
representation of the world and the agent in the world; cognitive processes which use perceptual representations and stored memories to build knowledge, make judgments, and decide upon a course of action; and motor processes which implement the motor plan (Figure 7.1c, d). The functional decomposition proposed here is different, and rests on a distinction between two simple but fundamental pragmatic questions faced by every behaving creature at every moment: "what to do," and "how to do it." We can refer to these, respectively, as the problems of action selection and action specification (Cisek, 2001, 2007; Cisek & Kalaska, 2001b; Cisek & Turgeon, 1999; Kalaska, Sergio, & Cisek, 1998).

At every moment, the natural environment presents an animal with many opportunities and demands for action. The presence of food presents an opportunity to satiate hunger, while the presence of a predator demands caution or evasion. Usually, an animal cannot perform all of these behaviors at the same time because they often share the same effectors. You only have two hands, and you can only transport yourself in one direction at a time. Thus, one fundamental issue faced by every behaving creature is the question of action selection. That question must be resolved, in part, by using external sensory information about objects in the world, and in part, by using internal information about current behavioral needs.

Furthermore, the animal must tailor the actions it performs to the environment in which it is situated. Grasping a fruit requires accurate guidance of the hand to the location of the fruit, while evading a predator requires one to run in an unobstructed direction that leads away from the threat. The specification of the parameters of actions is a second fundamental issue faced by behaving creatures. This specification also must use sensory information from the environment.
In particular, it requires information about the spatial relationships among objects and surfaces in the world, represented in a coordinate frame relative to the orientation and configuration of the animal's body.

Traditional cognitive theories propose that these two questions are resolved in a serial manner: that we decide what to do before planning how to do it. In contrast, the claim made here is that the processes of action selection and specification occur simultaneously, and continue even during overt performance of movements. That is, sensory information arriving from the world is continuously used to specify several currently available potential actions, or what Gibson (1979) called "affordances." Meanwhile, other kinds of information
The Affordance Competition Hypothesis
are collected to select from among these the one that will be released into overt execution (Cisek, 2006, 2007; Cisek & Kalaska, 2001b, 2005; Glimcher, 2001; Gold & Shadlen, 2001; Kalaska et al., 1998; Kim & Shadlen, 1999; Platt, 2002). From this perspective, behavior is viewed as a constant competition between the conflicting demands and opportunities with which an animal is presented. Hence, the framework described here is called the “affordance competition” hypothesis (Cisek, 2007).

Before developing these proposals further, I present a simplified example of how the processes of specification and selection might operate in the context of visually-guided reaching movements (Fig. 7.2). Suppose that several graspable objects are present within reach (Fig. 7.2a). The spatial properties of these objects define regions of activity in a retinotopic neural map (Fig. 7.2b), roughly corresponding to their spatial layout. In such a map, each neuron is tuned to a specific combination of parameters (in this case, retinal location) and its activity reflects the likelihood that something of interest occupies that location in retinal space. Attentional processes enhance or suppress certain parts of the map, favoring the further sensorimotor processing of information from the attended locations. The surviving information is transformed onto another neural map, for example,
Figure 7.2 Schematic diagram of progressive specification and selection of parameter regimes for a reaching movement. Information about the spatial layout of reachable objects (a) is mapped onto a neural population representing potential reach target locations in retinotopic coordinates (b). Subregions of this activity map are enhanced by attention and transformed onto another map of potential reach directions in egocentric coordinates (c). A competition between distinct reaching actions is influenced by various biasing factors, and the winning parameter region is transformed further onto a joint-based population representation of the initial direction of movement (d). Once released into execution, this parameter region is updated and fine-tuned by feedback and predictive feedforward information.
one representing potential reach parameters in extrinsic coordinates related to particular body parts (Fig. 7.2c). In such maps, contiguous regions of activity above some threshold define individual potential reaching actions. Because averaging across nonoverlapping regions is undesirable, selection is needed as soon as separate islands appear. This selection may involve reciprocal inhibition and descending biasing factors influenced by information gathered about the objects occupying the corresponding parts of space. When the competition is resolved such that only a single contiguous region of activity survives in the neural map, that activity is transformed further onto another map; for example, an intrinsic joint-space map (Fig. 7.2d). The peak of activity in this map specifies the parameters of the initial direction of the reaching movement. Once this movement is released into execution, the trajectory is generated online through internal feedback within the fronto-parietal network (Bullock, Cisek, & Grossberg, 1998; Burnod et al., 1999), through predictive “forward models” providing compensation for movement dynamics (Wolpert, Ghahramani, & Jordan, 1995), and through overt visual and proprioceptive feedback automatically adjusting the movement as it unfolds (Desmurget et al., 1999; Pisella et al., 2000).

Figure 7.3 provides a rough sketch of how the affordance competition hypothesis may map onto the primate cerebral cortex. In particular, it is suggested that the process of transforming spatial information into representations of potential actions occurs within a distributed network of areas in the posterior parietal and caudal frontal cortex (dark lines). In all of these areas, simultaneous representations of potential actions compete against each other for further processing, and this competition is biased by a number of influences (double-line arrows).
These influences arrive from subcortical structures such as the basal ganglia and cortical regions such as the prefrontal cortex, which receive information pertinent to action selection from regions including the temporal lobe. While the diagram in Figure 7.3 is a very simplified sketch, it may nevertheless be useful for interpreting neural activity in a diverse set of brain areas, as will be discussed below.

To summarize, the “affordance competition” hypothesis suggests that interaction with the environment involves a continuous process of transforming spatial sensory information to specify and update the parameters of possible and ongoing actions. This is analogous to
Figure 7.3 Hypothetical neural substrates of specification and selection in the context of visually-guided reaching movements. Information involved in specifying potential actions is shown as solid arrows, and selection influences biasing competition among potential actions are shown as double-line arrows. In emphasizing visually-guided reaching, this basic sketch makes several important omissions regarding the hypothesis presented in the text. For example, it does not illustrate how the visual information in the dorsal stream diverges into regions specialized for different kinds of actions, and does not show the integration of proprioceptive information in the parietal cortex. (Reprinted from Cisek (2007) with permission.)
the proposal that part of perception is the interpretation of sensory information in terms of potential actions made possible by the environment and the animal’s place within it (Fadiga, Fogassi, Gallese, & Rizzolatti, 2000; Gibson, 1979; Kalaska et al., 1998). Multiple potential actions available at a given time are specified simultaneously, in various degrees of abstraction, and continuously compete for further processing and for overt execution. Action selection in this broad sense encompasses phenomena such as spatial attention, contextual modulation, and decision-making processes. It is important to note that from this perspective, many cognitive phenomena may be seen as evolutionary elaborations of action selection. A major accomplishment of primate evolution may have been the expansion of the brain regions responsible for collecting information for making smarter and more sophisticated decisions about actions. We will return to this proposal at the end of this chapter.
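The biased competition summarized above can be made concrete with a small numerical sketch. The following Python fragment is purely illustrative and not part of the original chapter: it treats a one-dimensional population map holding two simultaneously specified “potential actions,” a weak biasing input favoring one of them, and global inhibition mediating their competition. The map size, gains, and inhibition strength are all arbitrary assumptions.

```python
import math

# Illustrative toy (not the chapter's model): a 1-D population map with two
# candidate action peaks, a weak bias toward one, and global inhibition.
N = 40

def bump(center, width=3.0, gain=1.0):
    """Gaussian population input representing one candidate target."""
    return [gain * math.exp(-((i - center) ** 2) / (2 * width ** 2))
            for i in range(N)]

# Two potential reach targets (hypothetical locations: units 10 and 30),
# plus a small biasing input favoring unit 30 (e.g., attention or reward).
drive = [a + b for a, b in zip(bump(10), bump(30))]
bias = bump(30, gain=0.15)

act = [0.0] * N
dt, inhibition = 0.1, 0.05
for _ in range(400):                      # let the competition settle
    total = sum(act)
    for i in range(N):
        # leaky integration of input + bias, minus inhibition from other units
        da = -act[i] + drive[i] + bias[i] - inhibition * (total - act[i])
        act[i] = max(0.0, act[i] + dt * da)

winner = max(range(N), key=lambda i: act[i])
print(winner)      # the biased peak (unit 30) ends up strongest
```

Under these arbitrary settings both peaks remain active throughout, with the biased peak ending up strongest, which is the qualitative pattern the hypothesis predicts: simultaneous specification, graded competition, and biased selection.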
Interpreting Neural Data from a Pragmatic Perspective

In this section, I review diverse neural data within the context of the hypothesis outlined above. Most of the experiments I will discuss have been designed to test ideas developed within the framework of modern cognitive neuroscience. Consequently, their results have been interpreted from that perspective, emphasizing descriptive representations and acquisition of knowledge about the world, and have been based on a functional decomposition into perception, cognition, and action. Here, I instead interpret these results from the pragmatic perspective outlined above.

Visual Processing

Philosophical and psychological theories often argue that our perception of the world is the result of a computational process which uses sensory inputs to construct an internal representation of the external world (Marr, 1982). Often, it is assumed that in order for this internal representation to be useful for building knowledge and making decisions, it must be both unified (linking diverse information into a centrally available form) and stable (reflecting the stable nature of the physical world). To date, however, neural data do not support the existence of such an internal representation. Indeed, the most studied sensory system, the visual system, appears to be neither unified nor stable.

Ungerleider & Mishkin (1982) reviewed anatomical and physiological data indicating that visual information in the cerebral cortex diverges into two partially distinct streams of processing: an occipito-temporal “ventral stream” in which cells are sensitive to information pertaining to the identity of objects; and an occipito-parietal “dorsal stream” in which cells are sensitive to spatial information. Within each of these, information diverges further.
There are separate visual streams for processing color, shape, and motion (Felleman & Van Essen, 1991) as well as separate representations of space (Colby & Goldberg, 1999; Stein, 1992). From the traditional cognitive perspective, the ventral stream builds a representation of “what” is in the environment, while the dorsal stream builds a representation of “where” things are, and these systems and all of their substreams must somehow be bound together to form a unified representation of the world. However, how this binding occurs remains unresolved, despite a vibrant research effort (Engel, Fries, & Singer, 2001; Singer, 2001).

Furthermore, activity in all of these visual systems appears strongly influenced by attentional modulation (Boynton, 2005; Moran & Desimone, 1985; Treue, 2001). This is usually exhibited as an enhancement of neural activity from the regions of space to which attention is directed, and a suppression of activity from unattended regions. Such attentional modulation is found in both the ventral and dorsal streams and increases as one ascends the visual hierarchy (Treue, 2001). Consequently, the neural representation of the visual world “is dominated by the behavioral relevance of the information, rather than designed to provide an accurate and complete description of it” (Treue, 2001). Because the direction of attention is frequently shifting from one place to another, the activity in visual regions is constantly changing, even if one is looking at a completely motionless scene.

To summarize, cognitive psychology’s assumption of a unified and stable internal representation does not appear to be well supported by the divergence of the visual system and the widespread influence of attentional modulation. In contrast, both of these properties are naturally compatible with the framework of the affordance competition hypothesis. For example, the divergence of the visual streams into ventral and dorsal systems parallels the distinction made above between processes for action specification and action selection (Kalaska et al., 1998; Passingham & Toni, 2001). Goodale & Milner (1992; Milner & Goodale, 1995) proposed that while the ventral stream subserves what is usually thought of as conscious visual perception, the dorsal stream provides information for guiding movement.
Clearly, the visual front-end of a system for action specification must retain current information on spatial locations, size, and motion, expressed in a reference frame that is egocentric—in short, it must behave like the dorsal visual system. As discussed below, the further divergence of the dorsal stream into separate fronto-parietal subsystems suggests that it is strongly involved in processing information for specific kinds of movements (Calton, Dickinson, & Snyder, 2002; Colby & Duhamel, 1996; Matelli & Luppino, 2001). The visual front-end of a system for action selection should collect clues useful for selecting actions, such as the identity of objects—in short, it should behave like the ventral visual stream. In primitive animals,
the ventral stream may have simply collected information useful for cueing the release of particular actions, what ethologists call “sign stimuli” (Hinde, 1966). In more advanced creatures, this may have evolved into full-fledged object recognition mechanisms, even ones which capture abstract knowledge. Thus, a pragmatic perspective may be used to interpret the processing in both visual streams (Cisek, 2001, 2007; Kalaska et al., 1998; Passingham et al., 2001), and even to address the question of how their operation is unified toward common goals (Cisek & Turgeon, 1999).

Likewise, the affordance competition hypothesis suggests an explanation of why there is such a strong modulatory effect of attention throughout visual cortex (Boynton, 2005; Moran & Desimone, 1985; Treue, 2001). Although selective attention has been traditionally seen as a neural mechanism which addresses the problem of a computational bottleneck (Broadbent, 1958), it can also be viewed from a more pragmatic perspective. As described above, because the actions that an animal can take at any given moment are limited, some mechanisms must exist to eliminate irrelevant actions, and it is beneficial to begin that elimination as early in sensorimotor processing as possible. Spatial attention has been proposed as an early mechanism for action selection (Allport, 1987; Castiello, 1999; Neumann, 1990), a mechanism which enhances sensory information from particular spatial regions of interest.

In summary, both the dorsal and ventral visual streams may be interpreted from a pragmatic perspective. In particular, the dorsal stream specifies the spatial parameters of potential actions currently available, while the ventral stream helps to collect information for selecting between these options.
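The idea that attention begins eliminating options early can be caricatured in a few lines. The sketch below is my own illustration, not anything from the chapter: a multiplicative gain field centered on an attended location enhances nearby candidate locations and suppresses distant ones, pruning the option set before any competition over actions. The locations, gain profile, and cutoff are arbitrary assumptions.

```python
import math

# Illustrative toy: spatial attention as an early action-selection filter.
locations = [5, 12, 20, 33]            # hypothetical salient object locations
salience = {loc: 1.0 for loc in locations}

def attended_gain(loc, focus=12, width=6.0, floor=0.2):
    """Multiplicative gain: enhanced near the attended focus, suppressed far away."""
    return floor + (1.0 - floor) * math.exp(-((loc - focus) ** 2) / (2 * width ** 2))

modulated = {loc: s * attended_gain(loc) for loc, s in salience.items()}
survivors = [loc for loc, s in modulated.items() if s > 0.55]   # arbitrary cutoff
print(survivors)   # only candidates near the attended focus remain
```

With these numbers the four candidates are pruned to two, which would then enter the downstream competition: attention does not decide, it narrows.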
In the next section, I suggest that the affordance competition hypothesis may be a useful framework for interpreting how neural activity in parietal and frontal cortex contributes to the specification of potential actions and how a competition between those actions plays out.

Specification of Potential Actions in the Fronto-Parietal Network

The traditional framework of cognitive psychology (Figure 7.1c) motivates one to interpret neural data as contributing to either a perceptual, cognitive, or motor function. From this perspective, neural activity in posterior parietal cortex (PPC) has been notoriously problematic. PPC has been interpreted as representing spatial sensory information on the location of objects in the environment (Andersen, 1995; Stein, 1992), strongly modulated by attention and behavioral context (Burbaud, Doegle, Gross, & Bioulac, 1991; Kalaska, 1996; Mountcastle, Lynch, Georgopoulos, Sakata, & Acuna, 1975). This has led to the hypothesis that parietal cortex is involved in constructing a “salience map” (Constantinidis & Steinmetz, 2001; Gottlieb, 2002; Kusunoki, Gottlieb, & Goldberg, 2000), which presumably forms part of the perceptual representation which serves as input to the cognitive system. However, there is also strong evidence that parietal cortical activity contains representations of actions (Andersen, 1995; Kalaska, Scott, Cisek, & Sergio, 1997; Mazzoni, Bracewell, Barash, & Andersen, 1996; Platt & Glimcher, 1997; Snyder et al., 1997, 2000b), including activity specifying the direction of intended saccades (Andersen, 1995; Snyder et al., 1997) and arm reaching movements (Buneo, Jarvis, Batista, & Andersen, 2002; Ferraina & Bianchi, 1994; Kalaska & Crammond, 1995). Because action representations are supposed to be the output of the cognitive system, it seems difficult to reconcile these findings with the sensory properties of PPC, leading to persistent debates about its role. Furthermore, recent experiments have shown that neural activity in PPC is also modulated by a range of variables associated with the process of decision making, such as “expected utility” (Platt & Glimcher, 1999), “local income” (Sugrue, Corrado, & Newsome, 2004), “hazard rate” (Janssen & Shadlen, 2005), and “relative subjective desirability” (Dorris & Glimcher, 2004). In short, the posterior parietal cortex does not appear to neatly fit into any of the categories of perception, cognition, or action.
Indeed, it is difficult to see how neural activity in this region may be interpreted using the concepts of cognitive psychology (Culham & Kanwisher, 2001).

In contrast, the affordance competition hypothesis proposes a natural way of interpreting a great deal of parietal activity. If the PPC is involved in specifying the spatial parameters of potential actions, then its neural activity must be related to both sensory and motor information. For example, if the lateral intraparietal area (LIP) is involved in the specification of potential saccades, then its activity must correlate with the location of possible saccade targets (Mazzoni et al., 1996; Snyder et al., 1997), even when multiple potential saccades are processed simultaneously (Platt & Glimcher, 1997; Powell & Goldberg, 2000). At the same time, however, ongoing selection of
potential actions will modulate the strength of activities in LIP. Such modulation has been shown to be influenced by target salience (Colby et al., 1999; Kusunoki et al., 2000), reward size and selection probability (Platt & Glimcher, 1999; Shadlen & Newsome, 2001) as well as other decision variables (Dorris et al., 2004; Janssen et al., 2005; Sugrue et al., 2004), and prior information on the type of action to be performed (Calton et al., 2002). Progressive elimination of potential saccade targets along the dorsal stream also explains why the representation of space in LIP is so sparse (Gottlieb, Kusunoki, & Goldberg, 1998): only the most promising targets make it to LIP.

Similar interpretations can be applied to other regions of the posterior parietal cortex. As mentioned above, the dorsal visual system is not unified, but progressively diverges into parallel subsystems each specialized toward the demands of different kinds of tasks (Andersen, Snyder, Bradley, & Xing, 1997; Caminiti, Ferraina, & Battaglia-Mayer, 1998; Colby & Duhamel, 1996; Colby & Goldberg, 1999; Kalaska, Cisek, & Gosselin-Kessiby, 2003; Rizzolatti & Luppino, 2001; Stein, 1992; Wise, Boussaoud, Johnson, & Caminiti, 1997). Area LIP is concerned with control of gaze (Snyder, Batista, & Andersen, 2000a), represents space in a body-centered reference frame (Snyder, Batista, & Andersen, 1998), and is interconnected with other parts of the gaze control system including the frontal eye fields (FEF) and the superior colliculus (Paré & Wurtz, 2001).
The medial intraparietal area (MIP) is involved in the control of arm reaching movements (Ferraina & Bianchi, 1994; Kalaska & Crammond, 1995; Snyder et al., 1997), represents target locations with respect to the current hand location (Buneo et al., 2002; Graziano, Cooke, & Taylor, 2000), and is interconnected with frontal regions involved in reaching, such as dorsal premotor cortex (PMd) (Johnson, Ferraina, Bianchi, & Caminiti, 1996; Marconi et al., 2001). The anterior intraparietal area (AIP) is involved in grasping, is sensitive to object size and orientation, and is interconnected with the grasp-related ventral premotor cortex (PMv) (Nakamura et al., 2001; Rizzolatti & Luppino, 2001). To summarize, the dorsal stream diverges into parallel subsystems, each of which specifies the spatial parameters of different kinds of potential actions. Each of these subsystems contains maps of parameter space defined by the kinds of variables relevant to each kind of action (e.g., for reaching, medial parietal regions represent targets retinotopically with respect to an origin defined by the location of the hand; Buneo et al., 2002). Each of these
maps encodes multiple potential actions which compete against each other, and sometimes against potential actions in other maps as well. The parietal specification of actions occurs before the onset of movement, and also during movement, fine-tuning the parameters of an ongoing action (Desmurget, Epstein et al., 1999; Desmurget, Grea et al., 2001; Kalaska et al., 2003).
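One way to picture how graded decision variables such as expected utility might bias such a competition is a minimal mutual-inhibition race between two candidate actions. This is my own illustrative sketch under arbitrary assumptions, not the model described in the text:

```python
# Illustrative toy: two potential actions race to threshold under mutual
# inhibition, with each channel's drive scaled by a decision variable such
# as relative expected reward. All constants are arbitrary assumptions.

def compete(value_a, value_b, threshold=1.0, dt=0.01, inhibition=0.5):
    """Return (winner, steps) for a deterministic mutual-inhibition race."""
    a = b = 0.0
    for step in range(10_000):
        na = max(0.0, a + dt * (value_a - inhibition * b))
        nb = max(0.0, b + dt * (value_b - inhibition * a))
        a, b = na, nb
        if a >= threshold:
            return "A", step
        if b >= threshold:
            return "B", step
    return "none", step

choice, t_fast = compete(value_a=0.8, value_b=0.4)   # strong bias toward A
_, t_slow = compete(value_a=0.6, value_b=0.5)        # weak bias: slower decision
print(choice, t_fast < t_slow)   # the stronger option wins; close values take longer
```

The channel with the larger value wins, and the decision takes longer when the values are close, echoing the graded modulation of parietal activity by relative desirability described above.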
Simultaneous Processing of Potential Actions

According to the serial architecture of Figure 7.1c, the cognitive system decides upon a course of action prior to motor preparation, and then the motor system plans and executes the desired movement. This leads to the assumption that only a single course of action is planned at any given moment, and that a “desired trajectory” is prepared ahead of movement initiation (Keele, 1968; Miller, Galanter, & Pribram, 1960). However, neural data do not strongly support the distinction between decision making and movement planning, or between planning and execution. First, there is growing evidence that decisions about actions are made within the very same neural regions responsible for executing those actions. For example, decisions about eye movements involve parietal area LIP (Dorris & Glimcher, 2004; Janssen & Shadlen, 2005; Platt & Glimcher, 1999), the frontal eye fields (Coe, Tomihara, Matsuzawa, & Hikosaka, 2002; Schall & Bichot, 1998), and the superior colliculus (Basso & Wurtz, 1998; Carello & Krauzlis, 2004; Horwitz, Batista, & Newsome, 2004), all of which are parts of the saccade system. Likewise, decisions about arm actions involve regions long implicated in arm control (Cisek & Kalaska, 2005; Romo, Hernandez, & Zainos, 2004; Romo, Hernandez, Zainos, Lemus, & Brody, 2002). In many of these regions, the very same individual neurons appear to first represent sensory information about relevant targets and then reflect a decision to move to one of them (Cisek & Kalaska, 2005; Schall & Bichot, 1998). Second, many of the same regions which appear to be involved in movement preparation are also active during movement execution (Cisek, 2005; Crammond & Kalaska, 2000; Kalaska et al., 1998).
Neural correlates of both planning and execution processes can be found even in the activity of individual cells, whose association with motor output changes in time from more abstract aspects of the task to more limb movement-related parameters (Crammond & Kalaska, 2000; Shen &
Alexander, 1997a, 1997b). Such functional heterogeneity is difficult to reconcile with the framework of Figure 7.1c. Why should individual neurons belong at different times to the sensory, cognitive, and motor systems?

The affordance competition hypothesis suggests a simple explanation. As described above, neurons throughout the fronto-parietal cortex and interconnected subcortical regions are at all times transforming information about potential targets into information about potential actions, and these representations are continuously competing against each other under the influence of various biasing factors. This may be the reason why neural populations in these regions appear to first represent all of the potential targets, then begin to modulate their activity by a variety of decision variables, and finally reflect a selected motor plan. In fact, the hypothesis makes a very strong prediction: that while a decision is being made between different potential actions, neural correlates of these options should be simultaneously evident in the regions responsible for their performance.

This prediction has been confirmed throughout the oculomotor system. Preparation of multiple sequential saccades can overlap in time, as shown by behavioral (McPeek & Keller, 2002; McPeek, Skavenski, & Nakayama, 2000) and neurophysiological evidence (McPeek & Keller, 2002; McPeek et al., 2000). When two potential saccade targets are presented simultaneously, neural correlates of both are observed in LIP (Platt & Glimcher, 1997). Correlates of multiple potential saccade targets have also been found in the superior colliculus, where they are modulated by selection probability (Basso & Wurtz, 1998). During visual search tasks, cells in FEF initially respond to many potential saccade targets, but later reflect only the final selected one (Bichot, Rao, & Schall, 2002; Schall & Bichot, 1998).
FEF activity related to the decision process appears to affect the preparation of the saccade, as demonstrated with stimulation during the process of decision (Gold & Shadlen, 2000). Thus, several regions central to the oculomotor system specify multiple potential saccades simultaneously and then select between these potential actions.

Similar results have been reported for reaching movements. For example, the presence of a distractor has been shown to influence the reach trajectory to a target (Tipper, Howard, & Houghton, 1998, 2000; Welsh, Elliott, & Weeks, 1999) and hand grasp aperture (Castiello, 1999). Patients with frontal lobe damage often cannot suppress actions associated with distractors even while they are planning
actions directed elsewhere (Humphreys & Riddoch, 2000). It has been proposed that such effects are the result of competition among parallel simultaneous representations of potential actions (see Castiello, 1999, for review). These proposals, along with the present framework, predict that when an animal is faced with multiple potential graspable objects, neural correlates of the potential reach directions should coexist as distinct directional signals in reach-related regions (Cisek, 2006; Tipper et al., 2000).

Partial information on possible upcoming movements has been shown to engage the activity of cells in reach-related regions (Bastian, Riehle, Erlhagen, & Schöner, 1998; Kurata, 1993; Riehle & Requin, 1989). In particular, while a reach direction is initially specified ambiguously, a plateau of directional signals is observed in motor cortex, and when a specific direction is selected, the population activity narrows down to reflect this choice (Bastian et al., 1998). Neural correlates of multiple potential reaching actions have been reported in premotor cortex even when the choices are clearly distinct and mutually exclusive (Cisek & Kalaska, 2005): when a monkey was presented with two opposite potential reaching actions, only one of which would later be selected (by a nonspatial cue) as the correct choice, neural activity in premotor cortex specified both directions simultaneously. When information for selecting one action over the other became available, the unwanted direction was suppressed. The monkey used a strategy of preparing both movements simultaneously and suppressing the unwanted one, despite the fact that the task design permitted the use of an alternative strategy (more consistent with traditional models of processing) in which target locations are stored in memory and converted to a motor plan only after the decision is made.
Furthermore, while both actions were still under consideration, their corresponding neural signals reflected the biasing of a competition between them. In particular, the strength of each of the signals was subtly modulated by the monkey’s strategy of predicting which was more likely to be selected on the basis of prior trial history (Cisek & Kalaska, 2001a).

In summary, simultaneous processing of multiple response options has been observed in both the oculomotor and arm reaching systems. This is consistent with the hypothesis that part of perception of the world involves the specification of the potential interactions with the world that are currently available (Gibson, 1979). As these potential actions are being specified, they compete for further
processing and for release into overt execution, and this competition is biased by a variety of influences, ranging from selective attention to more complex decision-making processes.

Decision Making Through a Distributed Consensus

Cognitive theories usually place decision making within the cognitive system (Newell & Simon, 1972). This suggests the existence of a “central executive” which issues orders to the rest of the brain, and predicts that decision-related activity will be localized within specific parts of the nervous system. Studies of brain damage have for a long time suggested that such a central executive should be found in the prefrontal lobes. However, neurophysiological data obtained by directly recording activity from the cerebral cortex has shown that the picture is not nearly so simple. As described above, neural recording results have repeatedly found the influence of various decision variables throughout parietal, premotor, and prefrontal cortex, as well as subcortical regions. In short, the decision-making system appears to be distributed throughout the brain, involving many regions that have long been considered to be purely concerned with motor control.

Based on such results, the affordance competition hypothesis suggests that perhaps no central executive exists within the brain. Instead, it suggests that decisions are reached through a “distributed consensus” which occurs as representations of potential actions compete throughout the fronto-parietal system. Consider the circuit of Figure 7.3. As information about salient objects of interest is transformed into representations of possible actions, parallel representations are present throughout the dorsal stream as simultaneous peaks of activity within tuned neural populations (depicted as rectangles).
While each peak of activity competes with its neighbors within a given local region, it also cooperates through positive feedback with a corresponding peak in a different neural region. For example, two peaks of activity in parietal cortex compete against each other, but each of them also cooperates with a corresponding peak in premotor cortex. At the same time, various regions involved in biasing action selection project into this distributed arena, modulating the activity of specific potential action representations therein. If the biasing in some particular neural region gets strong enough, the competition is resolved there, and the decision thus formed propagates outward
to other parts of the fronto-parietal system. Thus, in some situations, the decision will appear first in parietal cortex and propagate to frontal regions, but in other cases it will appear first in frontal cortex and propagate back to parietal regions.

It is useful here to make a distinction between the brain regions which implement a competition between potential actions and those which compute the biases for influencing it. Figure 7.3 suggests that the competition itself plays out across a large distributed cortical network which includes the dorsal visual system and the fronto-parietal regions involved in the planning and execution of specific actions. In contrast, the temporal lobe, prefrontal cortex, and basal ganglia are concerned with collecting the appropriate sensory information, interpreting it within the current behavioral context, and making predictions about the consequences of various choices, so as to provide the right biases to influence the competition. Cisek (2006) describes a computational model which simulates how this may occur during simple kinds of decision tasks.

Many sources of information can cast their votes into the distributed competition occurring in the fronto-parietal system. Because action selection is a fundamental problem faced by even the most primitive vertebrates, it likely involves structures which developed early and which have been conserved in evolution. The basal ganglia are a promising candidate (Kalivas & Nakamura, 1999; Mink, 1996; Redgrave, Prescott, & Gurney, 1999): a system of forebrain nuclei which is strongly conserved among vertebrates.
The basic hypothesis (Berns & Sejnowski, 1998; Brown, Bullock, & Grossberg, 2004; Redgrave et al., 1999) is that the basal ganglia form a central locus in which excitation arriving from different motor systems competes, and a winning behavior is selected and others inhibited through projections back to the motor systems. Afferents to the input nuclei of the basal ganglia (the striatum and subthalamic nucleus) arrive from nearly the entire cerebral cortex and from the limbic system, converge onto the output nuclei (substantia nigra and globus pallidus), and project through the thalamus back to the cerebral cortex. This cortico-striatal-pallido-thalamo-cortical loop is organized into multiple parallel channels, running through specific motor regions as well as through regions implicated in higher cognitive functions (Alexander & Crutcher, 1990a; Middleton & Strick, 2000). In accordance with the hypothesis of basal ganglia selection, cell activity in the input nuclei is related to movement parameters (Alexander & Crutcher, 1990b, 1990c) but is influenced by expectation of
reward (Schultz, Tremblay, & Hollerman, 2000; Takikawa, Kawagoe, & Hikosaka, 2002) and is related to the release of actions in a learned sequence (Aldridge et al., 1993; Kermadi, Jurquet, Arzi, & Joseph, 1993). Stimulation and inactivation (Horak & Anderson, 1984a, 1984b) of cells in the output nuclei disrupts movement speed in a manner consistent with the proposal that what is disrupted is the inhibition of competing motor programs (Wenger, Musch, & Mink, 1999). This proposal is also consistent with symptoms of Parkinson's disease in humans (Chong, Horak, & Woollacott, 2000; Mink, 1996) and with neural activity patterns in the monkey model of the disease (Boraud, Bezard, Bioulac, & Gross, 2000). Furthermore, the finding that basal ganglia connect with prefrontal regions, in a manner similar to their connections with premotor cortex, suggests that basal ganglia innervation of prefrontal regions also mediates selection, but on a more abstract level. This also is consistent with cognitive deficits of basal ganglia diseases (Sawamoto, Honda, Hanakawa, Fukuyama, & Shibasaki, 2002).

The recent evolution of primates is distinguished by advances in the ability to select actions based on increasingly abstract and arbitrary criteria. This kind of selection may have been made possible by the dramatic elaboration of the prefrontal cortex (Hauser, 1999). The prefrontal cortex is strongly implicated in decision making (Bechara, Damasio, Tranel, & Anderson, 1998; Fuster, Bodner, & Kroger, 2000; Kim & Shadlen, 1999; Miller, 2000; Rowe, Toni, Josephs, Frackowiak, & Passingham, 2000; Tanji & Hoshi, 2001), which can be viewed as an aspect of action selection.
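The basal ganglia selection hypothesis described above can be caricatured in a few lines. This is a schematic sketch of the selection-by-disinhibition idea (Redgrave et al., 1999), not a model of real circuitry; the channel names and all constants are invented for illustration.

```python
def select_by_disinhibition(salience, tonic=1.0):
    """Toy basal-ganglia-style selection (after Redgrave et al., 1999).

    Each candidate action has a channel whose output nucleus holds its
    motor target under tonic inhibition. Striatal input proportional to
    the action's salience suppresses that inhibition, and only the most
    strongly disinhibited channel's target is released for execution.
    """
    # Output-nucleus activity: tonic inhibition minus striatal suppression.
    inhibition = {act: max(0.0, tonic - s) for act, s in salience.items()}
    # The channel whose target is least inhibited wins...
    winner = min(inhibition, key=inhibition.get)
    # ...but only if its inhibition has actually been lifted at all.
    return winner if inhibition[winner] < tonic else None

# Cortical and limbic afferents cast their "votes" as salience signals.
votes = {"reach": 0.9, "saccade": 0.4, "withdraw": 0.2}
print(select_by_disinhibition(votes))  # prints "reach"
```

The key design point, as in the text, is that selection is implemented by releasing one behavior from inhibition rather than by directly exciting it.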
Neurons in the dorsolateral prefrontal cortex (DLPFC) are sensitive to various combinations of stimulus features, and this sensitivity is always related to the particular demands of the task at hand (di Pellegrino & Wise, 1991; Hoshi, Shima, & Tanji, 1998; Kim & Shadlen, 1999; Quintana & Fuster, 1999; Rainer, Asaad, & Miller, 1998). Prefrontal decisions appear to evolve through the collection of “votes” for categorically selecting one action over others, as demonstrated in studies of saccade target and reach target selection. For example, during an experiment in which monkeys reported perceptual discriminations using saccades (Kim & Shadlen, 1999), it was shown that DLPFC activity initially reflected the gradual accumulation of evidence in favor of a given target. Later, once the evidence became strong enough, the cell activity changed to simply reflect the monkey’s choice. Stimulation in the frontal eye fields (FEF) at different times during this process
revealed that the gradual formation of the decision was accompanied by a gradual modification of the saccade plan (Gold & Shadlen, 2001). An experiment on reach target selection (Hoshi, Shima, & Tanji, 2000) found that when stimuli were presented, activity in PFC was sensitive to potentially relevant stimulus features, such as shape and location. After presentation of a signal indicating the correct selection rule (shape-match or location-match), rule-sensitive neurons briefly became active, selecting out the relevant memorized stimulus features needed to make the response choice. After this decision process was complete, the remaining PFC activity reflected the intended movement choice.

To summarize, from the perspective of the affordance competition hypothesis, decisions are made through a distributed consensus which emerges within an interconnected network of action-specific fronto-parietal regions, under the influence of biases arriving from a variety of other brain regions, including the basal ganglia and prefrontal cortex. This architecture supports the kinds of pragmatic decision making required for real-time embodied activity: the kinds of decision making which presumably dominated behavior for millions of years, long before abstract cognitive abilities appeared. However, an evolutionary perspective motivates one to ask whether the more recent cognitive abilities such as rational thought and abstract planning could have evolved as specializations of this ancestral architecture.

Cognitive Abilities

As mentioned in the introduction, theories of embodied behavior have long suggested that cognition evolved within the context of sensorimotor interaction with the world (Cisek, 1999; Clark, 1997; Hendriks-Jansen, 1996; Piaget, 1963; Thelen et al., 2001). The affordance competition hypothesis suggests one way in which this might have occurred.
In particular, it suggests that our abilities to make abstract decisions evolved out of increasingly sophisticated abilities to select actions. Although the discussion below is necessarily speculative, it is consistent with neural data and leads to testable predictions.

First, we may ask how complex patterns of behavior can emerge from a simple interacting system. Consider an animal foraging for
food. To satisfy its hunger, the animal has first to explore the environment to find food, and then to approach and eat it. Although the behavior is internally motivated and unfolds over a lengthy temporal sequence, the actions themselves are at every moment constrained by the immediate environment. The selection of actions at any given time must also be made from among the actions which are currently possible. When food is not present, no feeding actions are possible. Instead, actions of exploration are most appropriate for the initial behavioral context, and the selection of particular locations for exploratory behavior may be biased by their novelty. Assuming that somewhere in the environment there is indeed some food to be found, such exploration will eventually result in the perception of a food source. Once this occurs, an action of approaching the food is specified, and wins the behavioral competition because it is strongly favored by current behavioral needs. When the food is within reach, the action of reaching is specified and is even more strongly favored for selection. Once the food is eaten, the behavioral context changes again, either returning the animal to a state of further foraging, or to a state in which hunger is satiated and other behaviors (perhaps resting) win the behavioral competition.

To summarize, a complex foraging behavior that consists of exploring, approaching, and eating is organized because the criteria for selection of actions from currently available alternatives respect the logic of the animal’s interaction with the world. When exploration results in finding food, approach actions are selected. When approach brings the food near enough, reaching and eating actions win out. When the food is finished, the animal either returns to forage some more or rests, having accomplished its goal.
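The foraging logic above can be sketched as a selection loop in which only currently afforded actions compete and the winner is whichever has the highest payoff in the present context. This is a minimal illustrative sketch; the action names, payoff values, and discovery probability are all invented.

```python
import random

random.seed(1)  # fixed seed so the illustrative run is repeatable

def available_actions(state):
    """Only actions afforded by the immediate situation are candidates."""
    if state["food_within_reach"]:
        return ["reach_and_eat", "explore"]
    if state["food_visible"]:
        return ["approach", "explore"]
    return ["explore"]

def payoff(action, state):
    """Selection criteria (here, just hunger) bias the competition."""
    hunger = state["hunger"]
    return {"reach_and_eat": 3.0 * hunger,
            "approach": 2.0 * hunger,
            "explore": 1.0}[action]

state = {"hunger": 1.0, "food_visible": False, "food_within_reach": False}
history = []
while state["hunger"] > 0:
    # Moment-to-moment competition among currently afforded actions.
    action = max(available_actions(state), key=lambda a: payoff(a, state))
    history.append(action)
    if action == "explore" and random.random() < 0.3:
        state["food_visible"] = True       # exploration reveals food
    elif action == "approach":
        state["food_within_reach"] = True  # approach brings it in reach
    elif action == "reach_and_eat":
        state["hunger"] = 0.0              # context changes again

# The explore -> approach -> eat sequence emerges from the logic of
# interaction with the environment; no explicit plan is stored.
print(history)
```

Note that the sequence is nowhere represented in advance: it falls out of the rule that each action changes the set of actions available next.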
More sophisticated long-range planning may have evolved by internalizing this “logic of interactions”. The abstract consequences of a particular type of action (such as opening a door) may be predicted on the basis of past experience, predicting specific new actions which that action makes available (such as walking through the door). In other words, a “forward model” can predict the general consequences of entire acts just as one can predict the specific proprioceptive feedback resulting from a specific movement (see above). For example, imagine a monkey in a room containing several objects. The objects within reach specify potential reaching actions, while objects further away specify potential approach actions. These actions compete for execution based on their estimated payoff. The payoff of reaching for
an object is biased in part by the value of that object to the monkey. The payoff of approaching an object is also biased in such a way, in part because the monkey can predict that approaching an object has the potential of putting it within reach. Thus, the ultimate payoff of the unavailable reaching action can bias the selection of an approach action which makes that reaching action available. Consistent with this proposal, the anatomical organization of connections between the cerebellum and motor execution regions is recapitulated in the connections between the cerebellum and more frontal cognitive regions (Middleton & Strick, 2000). This raises the possibility that the cerebellum’s putative role of predicting the consequences of specific motor commands in primary motor regions may be repeated in more frontal regions, where it predicts the abstract consequences of entire behavioral acts (Alexander & Crutcher, 1990a; Cisek & Kalaska, 2001b; Middleton & Strick, 2000).

From this perspective, the ability to arrange actions in specific sequences may be seen as a further elaboration of ancestral mechanisms for selection among alternatives. A sequence may be performed by specifying all the component actions right away (in part using the kinds of predictive specification discussed above), but selecting them one by one according to the current step within a sequence (Bullock, 2004). Thus, performing a sequence involves identifying components of the sequence and then releasing one after another. Recently, exactly this process has been found to occur in the prefrontal cortex (Averbeck, Chafee, Crowe, & Georgopoulos, 2002). In particular, Averbeck et al. (2002) showed that neural activity in prefrontal cortex encoded the elements of a memorized sequence as a set of activities in which the strongest activity represented the first step of the sequence, the next strongest represented the second step, and so on.
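This activation-gradient coding maps directly onto a competitive queueing scheme (Bullock, 2004): all elements of the sequence are active in parallel, the strongest element wins and is executed, and its activity is then suppressed so the next-strongest can win. A minimal sketch, with invented element names and activation values:

```python
def competitive_queue(activations):
    """Release actions in order of activation strength.

    A sequence is represented in parallel as a gradient of graded
    activations; on each cycle the strongest element wins the
    competition, is executed, and is then suppressed so that the
    next-strongest element can win. (After Bullock, 2004.)
    """
    plan = dict(activations)  # parallel representation of the whole plan
    executed = []
    while plan:
        winner = max(plan, key=plan.get)  # strongest element wins...
        executed.append(winner)
        del plan[winner]                  # ...and is then suppressed
    return executed

# A memorized sequence encoded as an activation gradient, as in the
# prefrontal recordings of Averbeck et al. (2002).
gradient = {"stroke_down": 0.9, "stroke_left": 0.6, "stroke_up": 0.3}
print(competitive_queue(gradient))
# -> ['stroke_down', 'stroke_left', 'stroke_up']
```

The serial order is thus stored spatially, as relative strengths, rather than as an explicit chain of successor links.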
As each element of the sequence was performed, its associated activity was suppressed, allowing the next element to become the strongest and to be the next action released into execution. These neural data are remarkably consistent with the predictions of a class of models called “competitive queueing” models (Bullock, 2004). It is plausible that such sequential behavior also involves the basal ganglia, which use their influence on action selection to contribute to such serial release of actions by combining information on serial order with information on specified component actions (Aldridge et al., 1993; Kermadi et al., 1993). The serial order information may arrive in the basal ganglia from the supplementary and presupplementary
motor areas, which show activity related to serial order (Clower & Alexander, 1998; Nakamura, Sakai, & Hikosaka, 1998).

To summarize, some of the phenomena studied in cognitive science can be thought of as aspects of the sophisticated action selection of which humans and other primates are capable. This proposal is very compatible with the recent evolutionary expansion of the frontal cortical regions thought to support many cognitive functions, and with the conservation of the architecture of connectivity between frontal regions and the basal ganglia and cerebellum (Cisek & Kalaska, 2001b).

Concluding Remarks

Modern cognitive neuroscience is defined as the study of the biological bases of mental events (Albright et al., 2000; Gazzaniga, 2000), an attempt to bring the sciences of psychology and biology together again. As the name implies, cognitive neuroscience is strongly influenced by a particular kind of psychological theory, that of cognitive psychology, and inherits its basic conceptual toolbox. Prominent within this toolbox is the idea that the brain is an information processing system which manipulates descriptive representations, or internal states which capture knowledge about (i.e., “describe”) the world or the organism. From this perspective, the functional architecture of behavior is broken down into processes which construct representations of the world, store and retrieve these from memory, manipulate them to build knowledge and make decisions, and produce and execute plans of action.

The purpose of this chapter has been to discuss a different way of looking at behavior, and to suggest that it may provide a better framework for bridging biological and psychological phenomena. Within this alternative view, a fundamental concept is “embodiment”: that the brain is an organ whose role is to control the interaction between the body and its environment.
From an evolutionary perspective, interaction with the world is more important than knowledge of the world, and thus one can argue that the pragmatic efficacy of a neural representation or system is more important than its descriptive accuracy (although of course there is great value in purely descriptive representations as well). In this sense, the importance of embodiment is difficult to deny. After all, the bottom line of survival is how
well we deal with the challenges posed by the environment, and not how deeply we may contemplate them.

Above, I described two concepts for developing an embodied view of behavior. First, I defined a distinction between “descriptive” and “pragmatic” representations, and suggested that the latter may be very useful for interpreting the functional role of patterns of neural activity in many regions of the primate brain. Second, I described a functional architecture called “affordance competition” (Cisek, 2007), which suggests that interactive behavior consists of a constant competition between internal representations of currently available opportunities and demands for action. I reviewed a variety of experimental studies whose results, I would claim, are more compatible with this hypothetical architecture than with the more familiar architecture of classical cognitive psychology.

It is important to point out that many of the concepts discussed above are by no means novel. An emphasis on the pragmatic requirements of controlling situated interaction has led several times toward the basic idea that behavior consists of a competition between currently available potential actions (Arbib, 1989; Ewert, 1997; Fagg & Arbib, 1998; Hendriks-Jansen, 1996; Kornblum, Hasbroucq, & Osman, 1990; Toates, 1998). For example, Toates (1998) proposed that a phylogenetically old system for making responses to particular stimuli is enhanced by mechanisms which control the competition between stimulus-response pairs. This is similar to the proposal made here that cognitive phenomena evolved within the system for action selection. Several authors have proposed that attention serves a similar role (Allport, 1987; Castiello, 1999; Humphreys & Riddoch, 2001; Neumann, 1990; Tipper et al., 1998).
A number of computational models have been developed around the central idea of a competition between potential actions (Cisek, 2006; Erlhagen & Schöner, 2002; Fagg & Arbib, 1998; Houghton & Tipper, 1994).

The present theory may be viewed as an attempt to unify these and related ideas into a general theoretical framework for tying together behavioral and neural studies. Doing so addresses several critical issues. First, it helps to reconcile very old and persistent debates about the general functional organization of behavior. As reviewed above, psychological theory in the last 150 years has undergone drastic oscillations, from focusing solely on subjective phenomena (“structuralism”), to focusing only on overt behavior (“behaviorism”), to again putting all explanatory burden on internal
processing (“computationalism”), and throughout this time precious few attempts at finding middle ground have been made. Much of the tone of recent literature on embodied cognition again calls for a total overhaul. However, adopting a new viewpoint does not require complete rejection of current ideas. Despite the excitement about the cognitive revolution overthrowing behaviorism, many behaviorist principles regarding learning continue to apply. The “law of effect” has not gone away. Likewise, a pragmatic perspective does not force one to abandon current mainstream psychological science. Many of the concepts of cognitive psychology designed to address human behavior are still valid even if one grounds their foundations in simple situated behavior, and the framework presented here offers a bridge between these concepts.

Second, the framework provides potential explanations for some aspects of neural data which have been notoriously difficult to interpret from traditional perspectives. For example, a debate persists on whether parietal cortical activity is a representation of attended targets (Colby et al., 1999; Kusunoki et al., 2000) prior to cognitive decisions and the formulation of a motor plan, or whether it is itself a representation of an already prepared motor intention (Snyder et al., 1997, 2000a). The affordance competition model suggests that both views are partially correct. Instead of trying to fit parietal data into rigid “perceptual” or “motor” categories, one can view it as reflecting the specification of potential actions that are biased by attentional and decisional influences. The affordance competition hypothesis also provides plausible functional explanations for other important observations regarding neural data.
As discussed above, these include the widespread influence of contextual effects and attentional modulation of neural activity (Boynton, 2005; Moran & Desimone, 1985; Treue, 2001), the divergence of the visual system (Felleman & Van Essen, 1991; Milner & Goodale, 1995; Ungerleider et al., 1982), the involvement of motor systems in cognitive tasks such as decision making (Cisek et al., 2005; Georgopoulos, Taira, & Lukashin, 1993; Glimcher, 2003; Gold & Shadlen, 2001; Romo et al., 2004), and the anatomical organization of the basal ganglia and cerebellum (Cisek & Kalaska, 2001b; Middleton & Strick, 2000).

Finally, the pragmatic perspective suggests possible solutions to several major conceptual problems which have long plagued traditional cognitive models. One of these is the question of how specialized activities distributed over the brain are integrated into a unified
perceptual representation of the environment, known as the “binding problem” (Reynolds & Desimone, 1999; Singer, 2001; von der Malsburg, 1996). Although some questions of binding apply to the present model, others do not, because the existence of a unified internal representation is not necessary. For example, pragmatic representations specifying potential actions and pragmatic representations collecting selection criteria need not be explicitly bound together, because the operation of the specification and selection system can be integrated simply by focusing each on the same spatial region of interest through overt gaze orientation (i.e., “binding through the fovea”) or covert shifts of spatial attention (Cisek & Turgeon, 1999).

A second conceptual problem posing even deeper difficulties for traditional cognitive models is the question of meaning. One of the major criticisms of computational theories of human thought has been the observation that rule-based computation does not capture the meaning of the tokens it manipulates (Harnad, 1990; Searle, 1980). This applies to the distributed representations of connectionist models just as it does to the symbolic representations of traditional AI. The abstract nature of computation is desirable from the perspective of the generality of mathematical formalisms, but is detrimental because its representations are not “grounded” with respect to the purpose of behavior (Harnad, 1990). Numerous theorists have proposed that grounding must come from situated interaction with the environment (Cisek, 1999; Dretske, 1981; Gallese, 2000; Gibson, 1979; Hardcastle, 1995; Millikan, 1989), which is performed as an extension of physiological control through the environment by exploiting the consistent properties of that environment (Cisek, 1999). The model presented here suggests neural substrates for such situated interaction.
Within the framework of this model, pragmatic representations such as parameter maps specifying potential movements and representations of criteria for action selection are grounded by virtue of their role in guiding movement. Their meaning is the role they play in behavior.

To summarize, it is proposed here that a promising approach for bridging behavioral and neural phenomena is to view them both from an embodied perspective. The nervous system did not evolve for the kinds of abstract tasks to which formal computations and descriptive representations are best suited. It evolved for controlling interaction with the environment, and was constrained by the pragmatic concerns of that control. Central among these concerns are the
problems of action specification and action selection, which defined the basic organization of the brain’s functional architecture. Once laid down, this basic architecture was strongly conserved in evolution, providing the context within which more sophisticated sensorimotor processing and decision making evolved. Perhaps even the advanced cognitive abilities of higher primates can be interpreted within this context, and doing so may help to demystify some of their most enigmatic puzzles.

Such speculations about broad functional frameworks cannot be conclusive. While I would suggest that the affordance competition hypothesis provides a good explanation for some neural data which are hard to interpret from a cognitive psychology perspective, I would certainly not claim that it explains everything about the brain. The purpose of this chapter is simply to bring these and related ideas under discussion, in the hope that they will lead to a novel way of thinking about the brain without neglecting its embodied nature.
References

Adams, F., & Mele, A. (1989). The role of intention in intentional action. Canadian Journal of Philosophy, 19, 511–531.
Albright, T. D., Kandel, E. R., & Posner, M. I. (2000). Cognitive neuroscience. Current Opinion in Neurobiology, 10, 612–624.
Aldridge, J. W., Berridge, K. C., Herman, M., & Zimmer, L. (1993). Neuronal coding of serial order: Syntax of grooming in the neostriatum. Psychological Science, 4, 391–395.
Alexander, G. E., & Crutcher, M. D. (1990a). Functional architecture of basal ganglia circuits: Neural substrates of parallel processing. Trends in Neurosciences, 13, 266–271.
Alexander, G. E., & Crutcher, M. D. (1990b). Neural representations of the target (goal) of visually guided arm movements in three motor areas of the monkey. Journal of Neurophysiology, 64, 164–178.
Alexander, G. E., & Crutcher, M. D. (1990c). Preparation for movement: Neural representations of intended direction in three motor areas of the monkey. Journal of Neurophysiology, 64, 133–150.
Allport, D. A. (1987). Selection for action: Some behavioral and neurophysiological considerations of attention and action. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 395–419). Hillsdale, NJ: Erlbaum.
Andersen, R. A. (1995). Encoding of intention and spatial location in the posterior parietal cortex. Cerebral Cortex, 5, 457–469.
Andersen, R. A., Snyder, L. H., Bradley, D. C., & Xing, J. (1997). Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience, 20, 303–330.
Arbib, M. A. (1989). Modularity, schemas, and neurons: A critique of Fodor. In P. Slezak & W. R. Albury (Eds.), Computers, brains and minds (pp. 193–219). Dordrecht: Kluwer Academic.
Ashby, W. R. (1965). Design for a brain: The origin of adaptive behavior (2nd ed.). London: Chapman & Hall.
Averbeck, B. B., Chafee, M. V., Crowe, D. A., & Georgopoulos, A. P. (2002). Parallel processing of serial movements in prefrontal cortex. Proceedings of the National Academy of Sciences U.S.A., 99, 13172–13177.
Ballard, D. H., Hayhoe, M. M., & Pelz, J. B. (1995). Memory representations in natural tasks. Journal of Cognitive Neuroscience, 7, 66–80.
Basso, M. A., & Wurtz, R. H. (1998). Modulation of neuronal activity in superior colliculus by changes in target probability. Journal of Neuroscience, 18, 7519–7534.
Bastian, A., Riehle, A., Erlhagen, W., & Schöner, G. (1998). Prior information preshapes the population representation of movement direction in motor cortex. Neuroreport, 9, 315–319.
Bechara, A., Damasio, H., Tranel, D., & Anderson, S. W. (1998). Dissociation of working memory from decision making within the human prefrontal cortex. Journal of Neuroscience, 18, 428–437.
Beer, R. D. (2000). Dynamical approaches to cognitive science. Trends in Cognitive Sciences, 4, 91–99.
Bergson, H. (1896). Matter and memory. New York: Macmillan.
Berns, G. S., & Sejnowski, T. J. (1998). A computational model of how the basal ganglia produce sequences. Journal of Cognitive Neuroscience, 10, 108–121.
Best, J. B. (1986). Cognitive psychology. St. Paul, MN: West.
Bichot, N. P., Rao, S. C., & Schall, J. D. (2002). Continuous processing in macaque frontal cortex during visual search. Neuropsychologia, 39, 972–982.
Block, N. (1990). The computer model of the mind. In D. N. Osherson & E. E. Smith (Eds.), Thinking: An invitation to cognitive science (pp. 247–289). Cambridge, MA: MIT Press.
Block, N. (1995). The mind as the software of the brain. In E. E. Smith & D. N. Osherson (Eds.), Thinking: An invitation to cognitive science (pp. 377–425). Cambridge, MA: MIT Press.
Boraud, T., Bezard, E., Bioulac, B., & Gross, C. G. (2000). Ratio of inhibited-to-activated pallidal neurons decreases dramatically during passive limb movement in the MPTP-treated monkey. Journal of Neurophysiology, 83, 1760–1763.
Boynton, G. M. (2005). Attention and visual perception. Current Opinion in Neurobiology, 15, 465–469.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon Press.
Brooks, R. (1991). Intelligence without representation. Artificial Intelligence, 47, 139–159.
Brown, J. W., Bullock, D., & Grossberg, S. (2004). How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades. Neural Networks, 17, 471–510.
Bullock, D. (2004). Adaptive neural models of queuing and timing in fluent action. Trends in Cognitive Sciences, 8, 426–433.
Bullock, D., Cisek, P., & Grossberg, S. (1998). Cortical networks for control of voluntary arm movements under variable force conditions. Cerebral Cortex, 8, 48–62.
Buneo, C. A., Jarvis, M. R., Batista, A. P., & Andersen, R. A. (2002). Direct visuomotor transformations for reaching. Nature, 416, 632–636.
Burbaud, P., Doegle, C., Gross, C. G., & Bioulac, B. (1991). A quantitative study of neuronal discharge in areas 5, 2, and 4 of the monkey during fast arm movements. Journal of Neurophysiology, 66, 429–443.
Burnod, Y., Baraduc, P., Battaglia-Mayer, A., Guigon, E., Koechlin, E., Ferraina, S., et al. (1999). Parieto-frontal coding of reaching: An integrated framework. Experimental Brain Research, 129, 325–346.
Calton, J. L., Dickinson, A. R., & Snyder, L. H. (2002). Non-spatial, motor-specific activation in posterior parietal cortex. Nature Neuroscience, 5, 580–588.
Caminiti, R., Ferraina, S., & Battaglia-Mayer, A. (1998). Visuomotor transformations: Early cortical mechanisms of reaching. Current Opinion in Neurobiology, 8, 753–761.
Carello, C. D., & Krauzlis, R. J. (2004). Manipulating intent: Evidence for a causal role of the superior colliculus in target selection. Neuron, 43, 575–583.
Castiello, U. (1999). Mechanisms of selection for the control of hand action. Trends in Cognitive Sciences, 3, 264–271.
Chomsky, N. (1959). A review of B. F. Skinner’s Verbal Behavior. Language, 35, 26–58.
Chong, R. K., Horak, F. B., & Woollacott, M. H. (2000). Parkinson’s disease impairs the ability to change set quickly. Journal of the Neurological Sciences, 175, 57–70.
Churchland, P. S. (1987). Epistemology in the age of neuroscience. The Journal of Philosophy, 84, 544–553.
Cisek, P. (1999). Beyond the computer metaphor: Behavior as interaction. Journal of Consciousness Studies, 6, 125–142.
Cisek, P. (2001). Embodiment is all in the head. Behavioral and Brain Sciences, 24, 36–38.
Cisek, P. (2005). Neural representations of motor plans, desired trajectories, and controlled objects. Cognitive Processing, 6, 15–24.
Cisek, P. (2006). Integrated neural processes for defining potential actions and deciding between them: A computational model. Journal of Neuroscience, 26, 9761–9770.
Cisek, P. (2007). Cortical mechanisms of action selection: The affordance competition hypothesis. Philosophical Transactions of the Royal Society B, 362, 1585–1599.
Cisek, P., & Kalaska, J. F. (2001a). Activity in dorsal premotor cortex (PMd) reflects anticipation of the likely response choice during a selection task. Society for Neuroscience Abstracts, 27, 117–134.
Cisek, P., & Kalaska, J. F. (2001b). Common codes for situated interaction. Behavioral and Brain Sciences, 24, 883–884.
Cisek, P., & Kalaska, J. F. (2005). Neural correlates of reaching decisions in dorsal premotor cortex: Specification of multiple direction choices and final selection of action. Neuron, 45, 801–814.
Cisek, P., & Turgeon, M. (1999). “Binding through the fovea,” a tale of perception in the service of action. Psyche, 5.
Clark, A. (1997). Being there: Putting brain, body, and world together again. Cambridge, MA: MIT Press.
Clower, W. T., & Alexander, G. E. (1998). Movement sequence-related activity reflecting numerical order of components in supplementary and presupplementary motor areas. Journal of Neurophysiology, 80, 1562–1566.
Coe, B., Tomihara, K., Matsuzawa, M., & Hikosaka, O. (2002). Visual and anticipatory bias in three cortical eye fields of the monkey during an adaptive decision-making task. Journal of Neuroscience, 22, 5081–5090.
Colby, C. L., & Duhamel, J. R. (1996). Spatial representations for action in parietal cortex. Brain Research Cognitive Brain Research, 5, 105–115.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349.
Constantinidis, C., & Steinmetz, M. A. (2001). Neuronal responses in area 7a to multiple-stimulus displays: I. Neurons encode the location of the salient stimulus. Cerebral Cortex, 11, 581–591.
Crammond, D. J., & Kalaska, J. F. (2000). Prior information in motor and premotor cortex: Activity during the delay period and effect on premovement activity. Journal of Neurophysiology, 84, 986–1005.
Culham, J. C., & Kanwisher, N. G. (2001). Neuroimaging of cognitive functions in human parietal cortex. Current Opinion in Neurobiology, 11, 157–163.
Embodiment, Ego-Space, and Action
Desmurget, M., Epstein, C. M., Turner, R. S., Prablanc, C., Alexander, G. E., & Grafton, S. T. (1999). Role of the posterior parietal cortex in updating reaching movements to a visual target. Nature Neuroscience, 2, 563–567.
Desmurget, M., Grea, H., Grethe, J. S., Prablanc, C., Alexander, G. E., & Grafton, S. T. (2001). Functional anatomy of nonvisual feedback loops during reaching: A positron emission tomography study. Journal of Neuroscience, 21, 2919–2928.
Dewey, J. (1896). The reflex arc concept in psychology. Psychological Review, 3, 357–370.
di Pellegrino, G., & Wise, S. P. (1991). A neurophysiological comparison of three distinct regions of the primate frontal lobe. Brain, 114, 951–978.
Dodd, D. H., & White, R. M., Jr. (1980). Cognition: Mental structures and processes. Boston, MA: Allyn & Bacon.
Dorris, M. C., & Glimcher, P. W. (2004). Activity in posterior parietal cortex is correlated with the relative subjective desirability of action. Neuron, 44, 365–378.
Dretske, F. (1981). Knowledge and the flow of information. Oxford: Blackwell.
Engel, A. K., Fries, P., & Singer, W. (2001). Dynamic predictions: Oscillations and synchrony in top-down processing. Nature Reviews Neuroscience, 2, 704–716.
Erlhagen, W., & Schöner, G. (2002). Dynamic field theory of movement preparation. Psychological Review, 109, 545–572.
Ewert, J.-P. (1997). Neural correlates of key stimulus and releasing mechanism: A case study and two concepts. Trends in Neurosciences, 20, 332–339.
Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (2000). Visuomotor neurons: Ambiguity of the discharge or "motor" perception? International Journal of Psychophysiology, 35, 165–177.
Fagg, A. H., & Arbib, M. A. (1998). Modeling parietal-premotor interactions in primate control of grasping. Neural Networks, 11, 1277–1303.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1–47.
Ferraina, S., & Bianchi, L. (1994). Posterior parietal cortex: Functional properties of neurons in area 5 during an instructed-delay reaching task within different parts of space. Experimental Brain Research, 99, 175–178.
Fetz, E. E. (1992). Are movement parameters recognizably coded in the activity of single neurons? Behavioral and Brain Sciences, 15, 679–690.
The Affordance Competition Hypothesis
Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381–391.
Fuster, J. M., Bodner, M., & Kroger, J. K. (2000). Cross-modal and cross-temporal association in neurons of frontal cortex. Nature, 405, 347–351.
Gallese, V. (2000). The inner sense of action: Agency and motor representations. Journal of Consciousness Studies, 7, 23–40.
Gazzaniga, M. S. (2000). The new cognitive neurosciences (2nd ed.). Cambridge, MA: MIT Press.
Georgopoulos, A. P., Taira, M., & Lukashin, A. V. (1993). Cognitive neurophysiology of the motor cortex. Science, 260, 47–52.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Glenberg, A. M. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1–55.
Glimcher, P. W. (2001). Making choices: The neurophysiology of visual-saccadic decision making. Trends in Neurosciences, 24, 654–659.
Glimcher, P. W. (2003). The neurobiology of visual-saccadic decision making. Annual Review of Neuroscience, 26, 133–179.
Gold, J. I., & Shadlen, M. N. (2000). Representation of a perceptual decision in developing oculomotor commands. Nature, 404, 390–394.
Gold, J. I., & Shadlen, M. N. (2001). Neural computations that underlie decisions about sensory stimuli. Trends in Cognitive Sciences, 5, 10–16.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Gottlieb, J. (2002). Parietal mechanisms of target representation. Current Opinion in Neurobiology, 12, 134–140.
Gottlieb, J. P., Kusunoki, M., & Goldberg, M. E. (1998). The representation of visual salience in monkey parietal cortex. Nature, 391, 481–484.
Graziano, M. S. A., Cooke, D. F., & Taylor, C. S. R. (2000). Coding the location of the arm by sight. Science, 290, 1782–1786.
Hardcastle, V. G. (1995). A critique of information processing theories of consciousness. Minds and Machines, 5, 89–107.
Harnad, S. (1990). The symbol grounding problem. Physica D, 42, 335–346.
Harvey, I., Husbands, P., & Cliff, D. T. (1993). Issues in evolutionary robotics. In J.-A. Meyer, H. L. Roitblat, & S. Wilson (Eds.), Proceedings of the Second Conference on Simulation of Adaptive Behavior (pp. 364–373). Cambridge, MA: MIT Press.
Hauser, M. D. (1999). Perseveration, inhibition and the prefrontal cortex: A new look. Current Opinion in Neurobiology, 9, 214–222.
Hendriks-Jansen, H. (1996). Catching ourselves in the act: Situated activity, interactive emergence, evolution, and human thought. Cambridge, MA: MIT Press.
Hinde, R. A. (1966). Animal behavior: A synthesis of ethology and comparative psychology. New York: McGraw-Hill.
Hofstadter, D. R. (1979). Gödel, Escher, Bach: An eternal golden braid. New York: Basic Books.
Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001). The theory of event coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences, 24, 849–937.
Horak, F. B., & Anderson, M. E. (1984a). Influence of globus pallidus on arm movements in monkeys. I. Effects of kainic acid-induced lesions. Journal of Neurophysiology, 52, 290–304.
Horak, F. B., & Anderson, M. E. (1984b). Influence of globus pallidus on arm movements in monkeys. II. Effects of stimulations. Journal of Neurophysiology, 52, 305–322.
Horwitz, G. D., Batista, A. P., & Newsome, W. T. (2004). Representation of an abstract perceptual decision in macaque superior colliculus. Journal of Neurophysiology, 91, 2281–2296.
Hoshi, E., Shima, K., & Tanji, J. (1998). Task-dependent selectivity of movement-related neuronal activity in the primate prefrontal cortex. Journal of Neurophysiology, 80, 3392–3397.
Hoshi, E., Shima, K., & Tanji, J. (2000). Neuronal activity in the primate prefrontal cortex in the process of motor selection based on two behavioral rules. Journal of Neurophysiology, 83, 2355–2373.
Houghton, G., & Tipper, S. P. (1994). A model of inhibitory mechanisms in selective attention. In D. Dagenbach & T. Carr (Eds.), Inhibitory processes in attention, memory, and language (pp. 53–112). Orlando, FL: Academic Press.
Humphreys, G. W., & Riddoch, J. M. (2000). One more cup of coffee for the road: Object-action assemblies, response blocking and response capture after frontal lobe damage. Experimental Brain Research, 133, 81–93.
Humphreys, G. W., & Riddoch, M. J. (2001). Detection by action: Neuropsychological evidence for action-defined templates in search. Nature Neuroscience, 4, 84–88.
Jackson, J. H. (1958). Evolution and dissolution of the nervous system. In J. Taylor (Ed.), Selected writings of John Hughlings Jackson (pp. 45–75). London: Staples Press. (Original work published 1884)
Janssen, P., & Shadlen, M. N. (2005). A representation of the hazard rate of elapsed time in macaque area LIP. Nature Neuroscience, 8, 234–241.
Johnson, P. B., Ferraina, S., Bianchi, L., & Caminiti, R. (1996). Cortical networks for visual reaching: Physiological and anatomical organization of frontal and parietal arm regions. Cerebral Cortex, 6, 102–119.
Johnson-Laird, P. N. (1988). The computer and the mind: An introduction to cognitive science. Cambridge, MA: Harvard University Press.
Kalaska, J. F. (1996). Parietal cortex area 5 and visuomotor behavior. Canadian Journal of Physiology and Pharmacology, 74, 483–498.
Kalaska, J. F., Cisek, P., & Gosselin-Kessiby, N. (2003). Mechanisms of selection and guidance of reaching movements in the parietal lobe. Advances in Neurology, 93, 97–119.
Kalaska, J. F., & Crammond, D. J. (1995). Deciding not to GO: Neuronal correlates of response selection in a GO/NOGO task in primate premotor and parietal cortex. Cerebral Cortex, 5, 410–428.
Kalaska, J. F., Scott, S. H., Cisek, P., & Sergio, L. E. (1997). Cortical control of reaching movements. Current Opinion in Neurobiology, 7, 849–859.
Kalaska, J. F., Sergio, L. E., & Cisek, P. (1998). Cortical control of whole-arm motor tasks. In M. Glickstein (Ed.), Sensory guidance of movement, Novartis Foundation Symposium 218 (pp. 176–201). Chichester, UK: Wiley.
Kalivas, P. W., & Nakamura, M. (1999). Neural systems for behavioral activation and reward. Current Opinion in Neurobiology, 9, 223–227.
Keele, S. W. (1968). Movement control in skilled motor performance. Psychological Bulletin, 70, 387–403.
Kermadi, I., Jurquet, Y., Arzi, M., & Joseph, J. P. (1993). Neural activity in the caudate nucleus of monkeys during spatial sequencing. Experimental Brain Research, 94, 352–356.
Kim, J.-N., & Shadlen, M. N. (1999). Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nature Neuroscience, 2, 176–185.
Kornblum, S., Hasbroucq, T., & Osman, A. (1990). Dimensional overlap: Cognitive basis for stimulus-response compatibility—A model and taxonomy. Psychological Review, 97, 253–270.
Kurata, K. (1993). Premotor cortex of monkeys: Set- and movement-related activity reflecting amplitude and direction of wrist movements. Journal of Neurophysiology, 69, 187–200.
Kusunoki, M., Gottlieb, J., & Goldberg, M. E. (2000). The lateral intraparietal area as a salience map: The representation of abrupt onset, stimulus motion, and task relevance. Vision Research, 40, 1459–1468.
Marconi, B., Genovesio, A., Battaglia-Mayer, A., Ferraina, S., Squatrito, S., Molinari, M., et al. (2001). Eye-hand coordination during reaching. I. Anatomical relationships between parietal and frontal cortex. Cerebral Cortex, 11, 513–527.
Marr, D. C. (1982). Vision. San Francisco: W. H. Freeman.
Matelli, M., & Luppino, G. (2001). Parietofrontal circuits for action and space perception in the macaque monkey. Neuroimage, 14, S27–S32.
Maturana, H. R., & Varela, F. J. (1980). Autopoiesis and cognition: The realization of the living (Vol. 42). Boston: D. Reidel.
Mazzoni, P., Bracewell, R. M., Barash, S., & Andersen, R. A. (1996). Motor intention activity in the macaque's lateral intraparietal area. I. Dissociation of motor plan from sensory memory. Journal of Neurophysiology, 76, 1439–1457.
McPeek, R. M., & Keller, E. L. (2002). Superior colliculus activity related to concurrent processing of saccade goals in a visual search task. Journal of Neurophysiology, 87, 1805–1815.
McPeek, R. M., Skavenski, A. A., & Nakayama, K. (2000). Concurrent processing of saccades in visual search. Vision Research, 40, 2499–2516.
Mead, G. H. (1938). The philosophy of the act. Chicago: University of Chicago Press.
Merleau-Ponty, M. (1945). Phénoménologie de la perception. Paris: Gallimard.
Middleton, F. A., & Strick, P. L. (2000). Basal ganglia and cerebellar loops: Motor and cognitive circuits. Brain Research Reviews, 31, 236–250.
Miller, E. K. (2000). The prefrontal cortex and cognitive control. Nature Reviews Neuroscience, 1, 59–65.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97.
Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the structure of behavior. New York: Holt, Rinehart & Winston.
Millikan, R. G. (1989). Biosemantics. The Journal of Philosophy, 86, 281–297.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Mink, J. W. (1996). The basal ganglia: Focused selection and inhibition of competing motor programs. Progress in Neurobiology, 50, 381–425.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Mountcastle, V. B., Lynch, J. C., Georgopoulos, A. P., Sakata, H., & Acuna, C. (1975). Posterior parietal association cortex of the monkey: Command functions for operations within extrapersonal space. Journal of Neurophysiology, 38, 871–908.
Nakamura, H., Kuroda, T., Wakita, M., Kusunoki, M., Kato, A., Mikami, A., et al. (2001). From three-dimensional space vision to prehensile hand movements: The lateral intraparietal area links the area V3A and the anterior intraparietal area in macaques. Journal of Neuroscience, 21, 8174–8187.
Nakamura, K., Sakai, K., & Hikosaka, O. (1998). Neuronal activity in medial frontal cortex during learning of sequential procedures. Journal of Neurophysiology, 80, 2671–2687.
Neumann, O. (1990). Visual attention and action. In O. Neumann & W. Prinz (Eds.), Relationships between perception and action: Current approaches (pp. 227–267). Berlin: Springer-Verlag.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Núñez, R., & Freeman, W. J. (2000). Reclaiming cognition: The primacy of action, intention and emotion. Thorverton, UK: Imprint Academic.
O'Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939–1011.
Paré, M., & Wurtz, R. H. (2001). Progression in neuronal processing for saccadic eye movements from parietal cortex area LIP to superior colliculus. Journal of Neurophysiology, 85, 2545–2562.
Passingham, R. E., & Toni, I. (2001). Contrasting the dorsal and ventral visual systems: Guidance of movement versus decision making. Neuroimage, 14, S125–S131.
Piaget, J. (1963). The origins of intelligence in children. New York: Norton.
Pinker, S. (1997). How the mind works. New York: Norton.
Pisella, L., Grea, H., Tilikete, C., Vighetto, A., Desmurget, M., Rode, G., et al. (2000). An "automatic pilot" for the hand in human posterior parietal cortex: Toward reinterpreting optic ataxia. Nature Neuroscience, 3, 729–736.
Platt, M. L. (2002). Neural correlates of decisions. Current Opinion in Neurobiology, 12, 141–148.
Platt, M. L., & Glimcher, P. W. (1997). Responses of intraparietal neurons to saccadic targets and visual distractors. Journal of Neurophysiology, 78, 1574–1589.
Platt, M. L., & Glimcher, P. W. (1999). Neural correlates of decision variables in parietal cortex. Nature, 400, 233–238.
Powell, K. D., & Goldberg, M. E. (2000). Response of neurons in the lateral intraparietal area to a distractor flashed during the delay period of a memory-guided saccade. Journal of Neurophysiology, 84, 301–310.
Powers, W. T. (1973). Behavior: The control of perception. New York: Aldine.
Pylyshyn, Z. W. (1984). Computation and cognition: Toward a foundation for cognitive science. Cambridge, MA: MIT Press.
Quintana, J., & Fuster, J. M. (1999). From perceptions to actions: Temporal integrative functions of prefrontal and parietal neurons. Cerebral Cortex, 9, 213–221.
Rainer, G., Asaad, W. F., & Miller, E. K. (1998). Selective representation of relevant information by neurons in the primate prefrontal cortex. Nature, 393, 577–579.
Redgrave, P., Prescott, T. J., & Gurney, K. (1999). The basal ganglia: A vertebrate solution to the selection problem? Neuroscience, 89, 1009–1023.
Reynolds, J. H., & Desimone, R. (1999). The role of neural mechanisms of attention in solving the binding problem. Neuron, 24, 19–25.
Riehle, A., & Requin, J. (1989). Monkey primary motor and premotor cortex: Single-cell activity related to prior information about direction and extent of an intended movement. Journal of Neurophysiology, 61, 534–549.
Rizzolatti, G., & Luppino, G. (2001). The cortical motor system. Neuron, 31, 889–901.
Romo, R., Hernandez, A., & Zainos, A. (2004). Neuronal correlates of a perceptual decision in ventral premotor cortex. Neuron, 41, 165–173.
Romo, R., Hernandez, A., Zainos, A., Lemus, L., & Brody, C. D. (2002). Neuronal correlates of decision-making in secondary somatosensory cortex. Nature Neuroscience, 5, 1217–1225.
Rowe, J. B., Toni, I., Josephs, O., Frackowiak, R. S., & Passingham, R. E. (2000). The prefrontal cortex: Response selection or maintenance within working memory? Science, 288, 1656–1660.
Sawamoto, N., Honda, M., Hanakawa, T., Fukuyama, H., & Shibasaki, H. (2002). Cognitive slowing in Parkinson's disease: A behavioral evaluation independent of motor slowing. Journal of Neuroscience, 22, 5198–5203.
Schall, J. D., & Bichot, N. P. (1998). Neural correlates of visual and motor decision processes. Current Opinion in Neurobiology, 8, 211–217.
Schultz, W., Tremblay, L., & Hollerman, J. R. (2000). Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10, 272–284.
Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–457.
Shadlen, M. N., & Newsome, W. T. (2001). Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. Journal of Neurophysiology, 86, 1916–1936.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of information. Urbana: University of Illinois Press.
Shen, L., & Alexander, G. E. (1997a). Neural correlates of a spatial sensory-to-motor transformation in primary motor cortex. Journal of Neurophysiology, 77, 1171–1194.
Shen, L., & Alexander, G. E. (1997b). Preferential representation of instructed target location versus limb trajectory in dorsal premotor area. Journal of Neurophysiology, 77, 1195–1212.
Singer, W. (2001). Consciousness and the binding problem. Annals of the New York Academy of Sciences, 929, 123–146.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386, 167–170.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1998). Change in motor plan, without a change in the spatial locus of attention, modulates activity in posterior parietal cortex. Journal of Neurophysiology, 79, 2814–2819.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (2000a). Intention-related activity in the posterior parietal cortex: A review. Vision Research, 40, 1433–1441.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (2000b). Saccade-related activity in the parietal reach region. Journal of Neurophysiology, 83, 1099–1102.
Stein, J. F. (1992). The representation of egocentric space in the posterior parietal cortex. Behavioral and Brain Sciences, 15, 691–700.
Sterelny, K. (1989). Computational functional psychology: Problems and prospects. In P. Slezak & W. R. Albury (Eds.), Computers, brains, and minds (pp. 71–93). Dordrecht: Kluwer Academic.
Still, A., & Costall, A. (1991). Against cognitivism: Alternative foundations for cognitive psychology. Hemel Hempstead, UK: Harvester Wheatsheaf.
Sugrue, L. P., Corrado, G. S., & Newsome, W. T. (2004). Matching behavior and the representation of value in the parietal cortex. Science, 304, 1782–1787.
Takikawa, Y., Kawagoe, R., & Hikosaka, O. (2002). Reward-dependent spatial selectivity of anticipatory activity in monkey caudate neurons. Journal of Neurophysiology, 87, 508–515.
Tanji, J., & Hoshi, E. (2001). Behavioral planning in the prefrontal cortex. Current Opinion in Neurobiology, 11, 164–170.
Thelen, E., Schöner, G., Scheier, C., & Smith, L. B. (2001). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24, 1–34.
Thompson, E., & Varela, F. J. (2001). Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences, 5, 418–425.
Tipper, S. P., Howard, L. A., & Houghton, G. (1998). Action-based mechanisms of attention. Philosophical Transactions of the Royal Society, London B, 353, 1385–1393.
Tipper, S. P., Howard, L. A., & Houghton, G. (2000). Behavioral consequences of selection from neural population codes. In S. Monsell & J. Driver (Eds.), Control of cognitive processes: Attention and performance (Vol. 18, pp. 223–245). Cambridge, MA: MIT Press.
Toates, F. (1998). The interaction of cognitive and stimulus-response processes in the control of behavior. Neuroscience and Biobehavioral Reviews, 22, 59–83.
Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological Review, 55, 189–208.
Treue, S. (2001). Neural correlates of attention in primate visual cortex. Trends in Neurosciences, 24, 295–300.
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Series 2, 42, 230–265.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
von der Malsburg, C. (1996). The binding problem of neural networks. In R. Llinás & P. S. Churchland (Eds.), The mind-brain continuum: Sensory processes (pp. 131–146). Cambridge, MA: MIT Press.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.
Welsh, T. N., Elliott, D., & Weeks, D. J. (1999). Hand deviations toward distractors: Evidence for response competition. Experimental Brain Research, 127, 207–212.
Wenger, K. K., Musch, K. L., & Mink, J. W. (1999). Impaired reaching and grasping after focal inactivation of globus pallidus pars interna in the monkey. Journal of Neurophysiology, 82, 2049–2060.
Wise, S. P., Boussaoud, D., Johnson, P. B., & Caminiti, R. (1997). Premotor and parietal cortex: Corticocortical connectivity and combinatorial computations. Annual Review of Neuroscience, 20, 25–42.
Wolpert, D. M., Ghahramani, Z., & Jordan, M. I. (1995). An internal model for sensorimotor integration. Science, 269, 1880–1882.
8 fMRI Investigations of Reaching and Ego Space in Human Superior Parieto-Occipital Cortex
Jody C. Culham, Jason Gallivan, Cristiana Cavina-Pratesi, and Derek J. Quinlan
The dorsal stream, from visual cortex to parietal cortex (and with projections to frontal cortex), plays a critical role in visually guided actions (Goodale & Milner, 1992; Milner & Goodale, 1995). Within the dorsal stream lies a mosaic of subregions specialized for actions with different effectors (Andersen & Buneo, 2002; Colby & Goldberg, 1999). For example, the macaque monkey brain contains a patchwork of areas specialized for movements of the eye (lateral intraparietal area, LIP), head (ventral intraparietal area, VIP), arm (parietal reach region, PRR), and hand (anterior intraparietal area, AIP) (Colby & Goldberg, 1999). Moreover, recent evidence from human neuroimaging has indicated that the human brain likely contains functional equivalents of the effector-specific regions of the macaque parietal lobes (Culham, Cavina-Pratesi, & Singhal, 2006; Culham & Kanwisher, 2001; Culham & Valyear, 2006; Grefkes & Fink, 2005). Given the essential role of the dorsal stream in the control of visually guided actions (Goodale & Jakobson, 1992; Milner & Goodale, 1995), brain regions within occipito-parietal cortex are particularly concerned with acting in peripersonal space, that is, the space
directly accessible by one's own body (Previc, 1998). Peripersonal space is important in that it affords an individual the potential to act on and manipulate objects. Recently, neuropsychological evidence has demonstrated that the brain contains a unique representation of peripersonal space (di Pellegrino, Ladavas, & Farne, 1997; Ladavas, di Pellegrino, Farne, & Zeloni, 1998; Ladavas, Farne, Zeloni, & di Pellegrino, 2000; Ladavas, Zeloni, & Farne, 1998). These studies have been performed on patients with extinction, a clinical disorder in which patients with right- or left-hemisphere brain damage fail to report a contralesional stimulus when another stimulus is concurrently applied to the ipsilesional side (for a review, see Ladavas, 2002). For example, Ladavas and colleagues reported that right brain-damaged patients who suffered from tactile extinction failed to report a tactile stimulus delivered to the contralesional hand when a visual stimulus was presented near the ipsilesional hand, that is, within peripersonal space (Ladavas, di Pellegrino et al., 1998). Interestingly, when the ipsilesional visual stimulus was instead presented far from the hand (i.e., in extrapersonal space), the effects of tactile extinction were considerably weaker. Other patients with hemispatial neglect demonstrate similarly intriguing results when tested on standard neglect tasks such as line bisection: these patients may show deficits only when acting in peripersonal space but not in extrapersonal space (Berti & Frassinetti, 2000; Halligan & Marshall, 1991).

Evidence from neurophysiology suggests that the different macaque brain regions that encode particular effectors may be specifically tuned to the ranges of space appropriate for those effectors. For example, macaque VIP is an area that plays a role in defensive movements of the head, face, and upper body (Cooke, Taylor, Moore, & Graziano, 2003). Accordingly, it is highly sensitive to stimuli, particularly moving stimuli, approaching the head in ultra-near space (Colby, Duhamel, & Goldberg, 1993). Elsewhere in macaque parietal cortex, bimodal (visual and somatosensory) neurons have receptive fields that encompass reachable space (Iriki, Tanaka, & Iwamura, 1996; Maravita & Iriki, 2004). In contrast, brain areas involved in saccadic and smooth pursuit eye movements (e.g., LIP, frontal eye fields) are unlikely to be restricted to peripersonal space because eye movements can be directed to targets at any visible distance (although, as the third experiment here suggests, vergence of the eyes may provide a key signal for encoding near space).
We expect that in the human brain, as in the macaque monkey brain, regions specific to particular effectors will be tuned to the space in which that effector can act. Neuropsychological evidence from humans certainly suggests that the human brain must include regions specialized for peripersonal space; however, the typically large size of brain lesions makes it difficult to determine which specific subregions are implicated. To date, neuroimaging studies of peripersonal space have been quite limited. Weiss and colleagues (Weiss et al., 2000) reported several parietal foci that showed more activation during performance of a line bisection task in near than in far space, but at that time little was known about the effector-specificity of parietal cortex. Taken together, past research on humans has suggested the existence of effector-specific areas and has demonstrated the importance of peripersonal space; however, no human research has specifically addressed whether the optimal range of space in a given brain area corresponds to the operating range of the effector for which it codes.

This chapter will examine the role of the dorsal stream, particularly areas within occipito-parietal cortex, in reaching actions and in encoding peripersonal space within reach. In particular, we will focus on a general region in the superior parieto-occipital cortex (SPOC) that has been implicated in reaching movements. Three recent studies from our lab suggest that SPOC plays a key role in arm transport (Experiment 1), that it is particularly responsive to stimuli in peripersonal space (Experiment 2), and that it is modulated by whether the subject's gaze is directed to near or far targets (Experiment 3). We will briefly highlight the findings from each of these studies and then discuss possible interpretations of these studies taken together.
Experiment 1: Activation from Arm Transport during Reaching Actions in Superior Parieto-Occipital Cortex

Rationale

The macaque monkey brain includes an occipito-parietal circuit for the guidance of reaching movements. As shown in Figure 8.1a, this circuit includes projections from visual area V6 in occipital cortex to area V6A and the medial intraparietal area (MIP) in parietal cortex.
Figure 8.1 Schematic representation of action-related areas in the macaque monkey brain (a) and human brain (b). The cortical surfaces were defined at the gray-white matter border and have been partially inflated to reveal regions within the sulci (concavities, dark gray) as well as on the gyri (convexities, light gray). Key sulci are indicated by white lines. For each species, both a postero-lateral view (left column) and a medial view (right column) are shown. (a) In the macaque brain, early visual areas (not shown) provide input to visual area V6, which sends output to visual area V6A and the medial intraparietal area (MIP). Both V6A and MIP are responsive during reaching movements. (b) In the human brain, two reach-selective areas have been identified, one in the medial intraparietal sulcus (mIPS) and one in the superior parieto-occipital cortex (SPOC). Some have proposed that SPOC may include posterior and anterior subdivisions that correspond to V6 and V6A, respectively (Pitzalis, Galletti et al., 2006b; Pitzalis, Sereno et al., 2006). A full-color version of the figure is available online at http://psychology.uwo.ca/culhamlab/PDFs/Culham_etal_CMUchapter8_ColorFigs.pdf.
These a reas l ie w ithin mac aque su perior pa rieto-occipital co rtex, near t he j unction o f t he d orsal pa rieto-occipital su lcus (POS) a nd the intraparietal sulcus (IPS), with V6A sandwiched between V6 and MIP. B oth V6A a nd M IP a re v isuomotor a reas t hat r espond d uring r eaching m ovements a nd en code r each d irection; wh ereas, V6 appears to provide visual information from the dorsal stream to V6A and MIP, but does not itself demonstrate motor responses (Galletti, Kutz, G amberini, B reveglieri, & F attori, 2 003). The nomenclature and f unctional definitions of reach-related a reas vary considerably from lab to lab, with some labs studying V6 and V6A (e.g., Galletti et al., 2003), some labs studying a parieto-occipital region (PO) that includes pa rts o f bo th V6 a nd V6A (e.g., C olby, G attass, O lson, & Gross, 1988), some labs studying MIP (e.g., Eskandar & Assad, 1999), and other labs studying the parietal reach region, PRR (e.g., Snyder, Batista, & A ndersen, 2000). Recent examinations of recording sites suggest that the location of PRR likely overlaps with area MIP in the caudal medial w all of the i ntraparietal sulcus (Calton, D ickinson, & Snyder, 2 002; G ail & A ndersen, 2 006). Regardless of t he confusion over the specific regions, the common feature of these areas is that t hey a re i nvolved i n t he p lanning a nd ex ecution o f r eaching movements. Neuroimaging s tudies h ave i nvestigated re ach-selective re gions in h umans; h owever, t he l iterature t o d ate i s r ather co nfusing f or two reasons. One source of confusion is the variety of tasks that have been studied. Given the practical problems of studying true reaching movements i n wh ich t he a rm ex tends to enable t he ha nd to touch a t arget, ma ny labs ha ve u tilized po inting m ovements i nstead. I n pointing movements, the subject keeps the hand in a fixed location, but aims the index finger toward a distant target. 
This approach has the considerable advantage of minimizing arm movements and the resultant artifacts that are troublesome for fMRI (Culham, 2006). However, pointing movements differ from reaching movements in important ways. First, whereas reaching movements are performed to interact with an object (e.g., to push an elevator button or pick up a cup of coffee), pointing movements typically serve a communicative function (e.g., to indicate to another individual where something of interest lies). Second, whereas reaching movements are effective only within range of the arm and hand, pointing movements are not constrained by space. Indeed, one can point to a star in the sky that is light years away even though reaching the star is unthinkable.
Embodiment, Ego-Space, and Action
A second source of confusion is the wide range of brain regions that have been inconsistently reported and emphasized by various groups. Many of the early neuroimaging studies found a large number of areas involved in reaching (Connolly, Goodale, Desouza, Menon, & Vilis, 2000; Culham, Danckert et al., 2003; Grafton, Fagg, Woods, & Arbib, 1996; Kawashima et al., 1996; Kertzman, Schwarz, Zeffiro, & Hallett, 1997). More recently, at least two reach-related zones in the human brain have been identified with greater consistency (Figure 8.1b). One region lies in the anterior part of the PPC, medial to the intraparietal sulcus (medial IPS, mIPS). Some have implied that this area is the functional equivalent of the human PRR (DeSouza et al., 2000) or, more specifically, MIP (Grefkes, Ritzl, Zilles, & Fink, 2004). A second region lies in SPOC and has also been suggested as a functional equivalent of PRR (Connolly, Andersen, & Goodale, 2003) or V6A (Pitzalis, Sereno et al., 2006). To date, the precise anatomical location of SPOC remains rather vague, and thus we are using this fairly general term to refer to a region at the superior-medial aspect of the parieto-occipital junction, near the superior end of the parieto-occipital sulcus (see General Discussion for elaboration of potential subregions). An intriguing paper by Prado and colleagues (2005) suggests that these two regions have different functional properties: the former region, mIPS, responds during reaching movements regardless of where the eyes are directed; the latter region, within SPOC, responds during reaching movements to peripheral but not foveated targets.
We wanted to conduct an experiment that would identify brain areas specific to true reaching (not pointing) and would isolate activation specific to the act of transporting the arm to the location of a target.
To do so, we took advantage of a longstanding distinction between two components of reach-to-grasp actions. Jeannerod (1981, 1984) proposed that a reach-to-grasp action comprises two key components controlled by separate visuomotor "channels": the movement of the hand to the object (transport component) and the formation of the hand into a grip appropriate for grasping the object (grip component). Given that transport and grip components frequently co-occur, they are clearly tightly coordinated (Frak, Paulignan, Jeannerod, Michel, & Cohen, 2006; Jeannerod, 1986; Jeannerod, Decety, & Michel, 1994); however, different attributes of the object are relevant for each of the two components, with object location and distance from the hand being most relevant for arm transport and with object shape, size, and orientation being most relevant for hand grip preshaping.
Although there is some dispute about whether transport and grip components are truly distinct (e.g., Smeets & Brenner, 1999), developmental and neuropsychological studies of visuomotor behavior suggest some degree of dissociation. Developmental studies have found that while one-week-old newborns can transport the arm toward a fixated object (von Hofsten, 1979, 1982), it is not until 4 to 5 months of age that the grip component appears and not until one year that precision grip can be observed (DiFranco, Muir, & Dodwell, 1978; Halverson, 1931). In adults, neuropsychological studies have shown that lesions that include AIP impair the formation of the grip component (Binkofski et al., 1998), whereas more posterior lesions within parietal cortex, including superior parieto-occipital cortex, impair the transport component (Karnath & Perenin, 2005).
Our prior neuroimaging studies suggest that, while AIP is activated by both reaching and grasping, the activation is reliably greater for grasping, presumably because grasping requires preshaping of the hand based on object properties (Culham et al., 2003). In particular, we found that AIP is selectively activated when object properties are computed for the purpose of grasping (for example, scaling the finger aperture to match object size) and not for perception (Cavina-Pratesi, Goodale, & Culham, 2007). Moreover, in fMRI experiments on delayed grasping, we found greater AIP activation for grasping than reaching both during the visual presentation of the object and during the execution of the action (Culham, 2004; Singhal, Kaufman, Valyear, & Culham, 2006), suggesting that the area is neither strictly visual nor activated solely by motor or somatosensory components of the task.
In Experiment 1, we used functional magnetic resonance imaging (fMRI) to investigate whether the brain areas mediating arm transport are separate from those that mediate grip formation. In our experiment, subjects were presented with a series of three-dimensional objects placed either in a near location, adjacent to the hand, or a far location, within reach of the hand but not immediately adjacent. Subjects performed three types of tasks at each of the two locations: (1) touching the object with the knuckle of the right hand; (2) grasping and picking up the object with the right hand; or (3) passively viewing the object. The transport component was manipulated by positioning the objects in the reachable location (requiring arm transport) versus the adjacent location (requiring no arm transport). The grip component was manipulated by asking the subjects to grasp the object (requiring a grip component) versus simply touching it with the knuckle (requiring no grip component). Subjects kept their gaze fixed upon a point of light throughout each trial. A schematic representation of the actions performed by the participants in the adjacent and in the reachable position is depicted in Figure 8.2a.

Figure 8.2 Design and results of Experiment 1 investigating brain activation for transport and grip components of reach-to-grasp movements. a) Schematic representation of the actions tested in Experiment 1: actions executed toward reachable vs. adjacent locations in space (transport component) are depicted in the right and left sides of both panels, respectively; grasping versus touching actions (grip component) are depicted in the upper and lower panels, respectively. The yellow cross represents the location of the fixation point with respect to the position of the objects. b) Group activation map highlighting AIP (in circle) for comparing grasping vs. touching (at the reachable location). Activation is rendered on one axial slice of an average anatomical for all subjects. c) Bar graph displays the magnitude of peak activation in percent BOLD signal change (%BSC) in each experimental condition averaged across subjects in left AIP. d) Group activation map highlighting upper and lower POS for comparing touching executed at the reachable location vs. the adjacent location. Again, activations are rendered on one axial slice of an average anatomical for all subjects. e) Bar graphs display the magnitude of peak activation in %BSC in each experimental condition averaged across subjects in the upper and lower POS. Sulci are indicated by white lines: solid line = postcentral sulcus; thick dotted line = intraparietal sulcus (IPS); thin dotted line = parieto-occipital sulcus (POS). A full-color version of the figure is available online at http://psychology.uwo.ca/culhamlab/PDFs/Culham_etal_CMUchapter8_ColorFigs.pdf

TABLE 8.1 Summary of the Present Studies (Experiments 1, 2, and 3) and Other Magnetoencephalography (MEG), Functional Magnetic Resonance Imaging (fMRI), and Positron Emission Tomography (PET) Studies Reporting Activation in Superior Parieto-Occipital Cortex as Shown in Figure 8.5 (PCu = precuneus; Cu = cuneus; POJ = parieto-occipital junction; POS = parieto-occipital sulcus)

Reference (technique) | Contrast | Talairach X, Y, Z | Source in reference for Talairach coordinates | Source figure in reference for foci in our Figure 8.5

Luminance flicker
Portin et al., 1998 (MEG) | Luminance flicker > Pattern flicker | N/A | N/A | Figure 4
Dechent & Frahm, 2003 (fMRI) | Luminance flicker > Pattern flicker | N/A, -60, 2 (V6); N/A, -70, 15 (V6A) | Table 2 average | Figures 5, 6

Pointing preparation
Astafiev et al., 2003 (fMRI) | Delayed pointing > Delayed saccade | -7, -79, 42 | Supplementary material table (PCu) | Figure 1E
Connolly et al., 2003 (fMRI) | Delay activity for effector and location > Delay activity for effector only | -1, -74, 38 | Results section | Figure 3

Reaching preparation
Beurze et al., 2006 (fMRI) | Cue for target location > Fixation | -24, -67, 31 | Table 2 | Figure 2
Beurze et al., 2006 (fMRI) | Cue for effector > Fixation | -21, -70, 37 | Table 3 | Figure 3

Reaching
Prado et al., 2005 (fMRI) | Reach to nonfoveated targets > Reach to foveated targets | -10, -90, 36 | Table 2 (POJ) | Figure 3a
Pellijeff et al., 2006 (fMRI) | Reaching to novel position > Reaching to repeated position | -21, -58, 42 | Table 1 average (PCu) | Figure 1
DeJong et al., 2001 (PET) | Reach to variable targets > Reach to the same target | -22, -82, 29 | Table 1 average (PCu, Cu, and POS) | Figure 1
Experiment 1 (fMRI) | Reach-to-touch > Touch AND Reach-to-grasp > Grasp | -7, -82, 30 | Average (upper and lower POS) | Figure 1

V6 retinotopy
Pitzalis et al., 2006 (fMRI) | Wide-field retinotopic map | -11, -72, 46 | Results section | Figure 10

Near preference
Experiment 2 (fMRI) | Passive viewing within reach > Passive viewing outside reach | 1, -75, 29 | — | Figure 2
Experiment 3b (fMRI) | Vergence near the head > Vergence far from the head | -8, -86, 28 | — | Figure 3

Methods and Results

A high-field (4 Tesla) fMRI scanner was used to collect blood-oxygenation-level-dependent (BOLD) activation in 10 right-handed subjects who performed actions involving a transport component, a grip component, both, or neither. Subjects lay supine with their heads tilted such that the natural line of gaze was toward the workspace of the hand (Culham et al., 2003). A tilted platform was positioned over the hips and subjects rested the right arm and hand at the base of the platform. The subject's right upper arm was supported by a brace that prevented movement of the shoulder and head but allowed rotation of the elbow and wrist. Thus the moveable range of the arm formed an arc of slightly less than 90 degrees of rotation (see yellow area in Figure 8.3a). Variable objects, each constructed of several Lego® pieces, were placed on the table by the experimenter at one of two locations: either an "adjacent" location immediately to the left of the hand, or a "reachable" location upwards and to the right of the hand (see Figure 8.2a). Subjects could touch or grasp objects in the near location merely by moving the wrist, whereas they could touch or grasp objects in the far location only by extending the elbow to move the hand up and to the right. Subjects performed one of two actions on a given trial: either a reach-to-touch movement that involved touching the object with the knuckles, or a reach-to-grasp movement that involved grasping, lifting, and returning the object. As a control, two additional passive viewing conditions, one for each object distance, were included. Thus the paradigm was a 2 × 3 factorial design with distance (adjacent vs. reachable) and task (touching vs. grasping vs. passive viewing) factors.
A slow event-related design was used to ensure that if hand-movement artifacts occurred, they could be removed while preserving the BOLD response that typically occurs several seconds later. Standard imaging parameters were used (3 × 3 × 5–6 mm voxels, volume acquisition time = 2 s) to collect data within occipital, parietal, posterior frontal, and superior temporal cortex. Subjects were required to maintain fixation on a small light-emitting diode (LED) placed midway between the two objects. The room remained dark except for a 2 s period in each trial during which the object was illuminated and the action was executed. Prior to each trial, the experimenter placed a new object on the platform and the subject received an auditory cue via headphones to "reach," "grasp," or "look" on the upcoming trial. At the beginning of each trial, a bright LED mounted on the ceiling of the magnet was illuminated for 2 s, prompting the subjects
to perform the cued action (and then return the hand to the starting location) or to passively view the object. After each trial, the subject rested in darkness for a 12 s intertrial interval.
We first identified brain areas involved in the grip component by performing a random effects contrast between grasping objects at the reachable location vs. touching objects at the reachable location, consistent with previous studies (Binkofski et al., 1998; Culham et al., 2003; Frey, Vinton, Norlund, & Grafton, 2005). As expected, this contrast produced activation in the anterior intraparietal (AIP) cortex, specifically at the junction of the IPS and the postcentral sulcus (PCS; see Figure 8.2b; Talairach coordinates in Table 8.1). AIP also showed higher activation for grasping vs. touching at the adjacent location (Figure 8.2c). We then identified brain areas involved in the transport component by performing a contrast between touching objects at the reachable location vs. touching objects at the adjacent location. This contrast produced activation in SPOC (see Figure 8.2d), which also showed higher activation for grasping objects in the reachable vs. adjacent location (Figure 8.2e). The SPOC activation for the two passive viewing conditions was identical (Figure 8.2e), suggesting that stimulus confounds (such as retinal location) could not account for the activation difference attributed to the transport component.

Implications

These results demonstrate that the transport and grip components of a reach-to-grasp task rely on different brain structures. While AIP is activated by the computation of grip aperture regardless of whether a reach is required to acquire the object, SPOC is much more active when actions are executed toward an object requiring arm extension. A functional dissociation between the two components does not imply that they work separately from one another.
Indeed, the two components take place simultaneously and behavioral experiments have shown that they are closely choreographed. In the future, functional connectivity studies would be valuable for investigating the nature of the crosstalk between SPOC and AIP.
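The logic of the two contrasts used in Experiment 1 can be sketched numerically. The condition means below are invented placeholders (not the measured data); the point is simply how the 2 × 3 factorial design isolates each component:

```python
# 2 x 3 factorial design of Experiment 1: distance (adjacent, reachable)
# x task (touch, grasp, passive view). The %BSC values are invented
# for illustration only; they are not the measured data.
bsc = {
    ("adjacent", "touch"): 0.4,
    ("adjacent", "grasp"): 0.9,
    ("adjacent", "view"): 0.1,
    ("reachable", "touch"): 0.5,
    ("reachable", "grasp"): 1.0,
    ("reachable", "view"): 0.1,
}

# Grip contrast: grasping vs. touching at the reachable location
# (both conditions involve arm transport, so only grip formation differs).
grip_effect = bsc[("reachable", "grasp")] - bsc[("reachable", "touch")]

# Transport contrast: touching at the reachable vs. adjacent location
# (neither condition involves grip formation, so only transport differs).
transport_effect = bsc[("reachable", "touch")] - bsc[("adjacent", "touch")]

print(f"grip effect: {grip_effect:.2f} %BSC")
print(f"transport effect: {transport_effect:.2f} %BSC")
```

Each contrast holds one factor constant while varying the other, which is what allows the grip and transport components to be attributed to AIP and SPOC separately.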
Experiment 2: A Preference for Objects within Arm's Length in Superior Parieto-Occipital Cortex

Rationale

We reasoned that if SPOC is involved in reaching movements, it may show a preferential response to objects within reachable space. Given past research from our lab (Cavina-Pratesi et al., 2007) showing that human AIP and SPOC are activated by the visual presentation of an object within reachable space even without any overt action, we investigated whether such passive viewing responses would be modulated by whether objects were within vs. out of reach.
Methods and Results Within t he s ame s essions a s E xperiment 1 , a nd u sing t he s ame setup a nd t he s ame 10 sub jects, w e r an E xperiment 2 t o ex amine whether the response in transport- and grip-related areas would be modulated by ob ject d istance. Once again, we presented objects i n the adjacent and reachable locations; however, we also included an additional location that was beyond reach (see Figure 8.3a). Subjects maintained fi xation on a central point throughout all trials. On some trials, subjects were i nstructed to reach-to-touch or reach-to-grasp objects i n one of t he t wo reachable locations (though ac tions were never performed to the other two locations). On other trials, subjects simply passively viewed an object placed at any of the three locations (adjacent, reachable, and unreachable). We per formed a co njunction a nalysis t o i dentify r egions t hat were more activated during passive viewing for objects within reach than outside of reach ([adjacent > u nreachable] A ND [reachable > unreachable]). As shown in Figure 8.3b, this contrast produced activation in SPOC (Talairach coordinates in Table 8.1). As expected by the co ntrast u sed t o i dentify t he a rea, t here w as h igher ac tivation during for passive viewing of adjacent and reachable locations than unreachable locations; in addition, the area responded more strongly to grasping and reaching (at the reachable location) than to passive viewing (Figure 8.3b). The activation partially overlapped with the transport-related region identified in Experiment 1.
Figure 8.3 Methods, statistical maps, and fMRI activation for Experiment 2 investigating responses to reachable vs. unreachable objects. a) Schematic representation of the three possible locations at which objects were presented during passive viewing trials. The arc highlights the area corresponding to the moveable range of the arm. The cross represents the location of the fixation point. In addition to these three conditions, two other conditions, not shown, were included: grasping an object at the reachable location and touching an object at the reachable location. b) Group activation showing the region of SPOC that was activated by a conjunction analysis of ([adjacent > unreachable] AND [reachable > unreachable]). c) Bar graphs display the magnitude of peak activation (%BSC) in all conditions for the region circled in b. A full-color version of the figure is available online at http://psychology.uwo.ca/culhamlab/PDFs/Culham_etal_CMUchapter8_ColorFigs.pdf
Implications

These results are consistent with earlier suggestions that peripersonal space may have a particular relevance within the dorsal stream. Specifically, they suggest that neurons within SPOC show a preferential response to objects within reachable space, even when no explicit action is required. These findings are consistent with the suggestion that an object can automatically evoke affordances, potential actions that can be performed on that particular object (Gibson, 1979). Moreover, they suggest that such affordances may have neural correlates within brain areas responsible for particular types of actions.
We have additional control experiments underway to ensure that these results are not due to possible stimulus confounds such as object size or position within the visual field; however, we think such confounds are unlikely to account for our data. In our experiments, the objects had the same physical size, but naturally the further objects subtended a smaller retinal image size than the closer objects. Although some brain areas within the ventral stream have been found to be modulated by retinal image size (Hasson, Harel, Levy, & Malach, 2003; Malach, Levy, & Hasson, 2002), our activation was found within the dorsal stream, where one would expect real-world size to be more relevant than retinal size. Another possible concern is the difference in retinal position of the objects. The placement of the objects was restricted by the reachable space, which was limited to an arc-shaped zone with the fulcrum at the right elbow. Thus, the retinal location of the objects in the three positions could not be held constant. Based on the geometry of the setup: (1) all three objects were in the lower visual field, with the near object being more peripheral and the far object appearing closer to the fovea; (2) the adjacent and unreachable objects were in the left visual field while the reachable object was in the right field; and (3) the fixation point was midway in depth between the adjacent and reachable objects (as in Experiment 1). We do not believe that these factors contributed to our findings because: (1) there were no activation differences in SPOC between the adjacent and reachable objects, suggesting that retinal eccentricity is unlikely to play a role; (2) if visual hemifields were a critical factor, we would predict greater left-hemisphere activation for objects in the reachable location compared to the adjacent and unreachable locations (with the converse pattern in the right hemisphere), but no such pattern was observed; and (3) given that SPOC lies within the dorsal stream and is sensitive only to low spatial frequencies, it is unlikely to be sensitive to the image blurring that would be strongest for the furthest object.
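The retinal-size confound follows directly from viewing geometry: an object of fixed physical size s at distance d subtends a visual angle of 2·atan(s/2d), which shrinks with distance. The object size and viewing distances below are hypothetical placeholders, not the actual setup dimensions:

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an object of a given physical size
    viewed at a given distance (standard 2*atan(s/2d) geometry)."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

object_size = 3.0  # cm; hypothetical extent of a Lego object
# Hypothetical distances for the three locations (cm), invented for illustration.
for label, d in [("adjacent", 25), ("reachable", 45), ("unreachable", 80)]:
    print(f"{label:12s} {visual_angle_deg(object_size, d):5.2f} deg")
```

The monotonic shrinkage with distance is the confound at issue: any region truly coding retinal size, rather than reachability, would show the same ordering across the three locations.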
Given that the reach-selective SPOC appears to be more activated by objects in reachable space than beyond it, a future line of research will investigate whether this effect can be modulated by extending peripersonal space by providing the subject with a tool. Growing evidence suggests that tools can extend the range of action space and that this can affect neural and behavioral responses. A seminal study by Iriki and colleagues (1996) demonstrated that when a macaque monkey learns to use a tool, the receptive fields of reach-selective neurons in the intraparietal cortex expand to encompass the space that becomes reachable with the tool. Human neuropsychological studies have also found that peripersonal space is modified by the availability of a tool. For example, a patient with left neglect in peripersonal space showed an extension of that neglect to far space during line bisection tasks when using a stick but not when using a laser pointer, suggesting that the stick was treated as an extension of the body but the laser pointer was not (Berti & Frassinetti, 2000). Although these human neuropsychological studies suggest that the human brain, like the monkey brain, contains neurons tuned to action space, the large extent of the lesions makes it difficult to determine which areas contain such neurons. We expect that SPOC is one such region and that its response to objects during passive viewing should be modulated by the availability of a tool that extends reachable space.

Experiment 3: A Preference for Near Gaze in Superior Parieto-Occipital Cortex

Rationale

Experiment 3 from our lab (Quinlan & Culham, 2007) also suggests that the human SPOC may be particularly responsive to near space. Specifically, we found that SPOC activation was modulated by gaze distance, with stronger responses when subjects were fixating upon a near point than a far point.
This research arose from an earlier experiment that had originally been intended to examine the possibility of a preference for near space in a human area that has been proposed as the functional equivalent of the macaque ventral intraparietal (VIP) area (Bremmer et al., 2001; Goltz et al., 2001; see also Sereno & Huang, 2006). Electrophysiological studies have shown that a subset of neurons within macaque VIP respond more strongly to motion in ultra-near space (very close to the face) than at further distances (Colby, Duhamel, & Goldberg, 1993), so we investigated whether the putative human VIP demonstrated a similar near preference for motion. In an initial experiment, we presented subjects with patterned objects that loomed toward the face and receded. The objects could be presented at one of three distance ranges: near the face, above the hand, or above the feet.
Stimuli were carefully equated for low-level visual properties such as visual angle, velocity, and so forth. Although we did not observe a preference for objects moving in near vs. far space within the putative human VIP, we did observe activation in SPOC. In our initial experiments, we had instructed subjects to follow the looming-and-receding targets with their eyes. Thus one factor that may have led to activation in the superior occipital cortex was the distance at which gaze was directed. We conducted an experiment to determine whether simply having the eyes gaze at a near vs. far point could induce activation in the superior parieto-occipital cortex.
When the eyes are directed to close targets, a near response is invoked that consists of three components called the near triad. First, when looking at near targets, the eyes rotate inward so that each eye maintains fixation on the object (vergence). Second, the lens of the eye thickens to keep the object in focus (accommodation). Third, the pupil constricts to increase the depth of field. Although these components have sometimes been studied in isolation (Hasebe et al., 1999; Richter, Costello, Sponheim, Lee, & Pardo, 2004; Richter, Lee, & Pardo, 2000), in the real world they co-occur. Therefore, we simply asked the subjects to look at each point, such that vergence, accommodation, and pupil size all provided cues to the depth of the fixation point.
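The vergence component has simple geometry: for an interocular distance i and midline fixation at distance d, the vergence angle is approximately 2·atan(i/2d), so it grows steeply as fixation approaches the face. The sketch below uses the three LED distances from this experiment together with an assumed typical interocular distance of 6.3 cm (a value not reported here):

```python
import math

def vergence_deg(interocular_cm, fixation_cm):
    """Total vergence angle (both eyes combined) for fixation on a point
    at a given distance along the midline."""
    return math.degrees(2 * math.atan(interocular_cm / (2 * fixation_cm)))

IOD = 6.3  # cm; assumed typical interocular distance (illustrative)
for d in (15, 26, 84):  # the three LED fixation distances used here
    print(f"{d:3d} cm -> vergence {vergence_deg(IOD, d):5.1f} deg")
```

Under these assumptions the vergence angle roughly halves from the 15 cm to the 26 cm LED and falls by a further factor of about three at 84 cm, which is why vergence is such a strong candidate cue for the graded, distance-dependent response described below.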
Methods and Results

We gave eight right-handed subjects the simple task of gazing at small (0.7°) stationary lights (LEDs) at one of three distances along the natural line of sight: 15, 26, or 84 cm from the eye (see Figure 8.4a). The LEDs were viewed in an otherwise completely dark scanner and were calibrated to have the same luminance and visual angle. Only one LED was illuminated at a time and subjects were instructed to maintain fixation on whichever LED was currently illuminated. When one LED was extinguished and another was illuminated, the subject made a simple vergence shift (without any saccadic components) from the first LED to the second. LEDs were illuminated for 16 s at a time in pseudo-random order. Subjects lay supine within the magnet and viewed the LEDs through a mirror tilted at approximately 45°. A surface coil was used to provide a high signal-to-noise ratio within the occipital and parietal cortices.

Figure 8.4 Methods, statistical maps, and fMRI activation for Experiment 3 investigating responses to near vs. far vergence. a) Schematic representation of the eye positions used in the distance fixation experiment. The eyeballs and the vergence angle are shown from above. Subjects fixated one of three illuminated light-emitting diodes (LEDs) that were positioned at 15, 26, and 84 cm. Fixation was held for 16 seconds, at which time the LED was extinguished and a new LED was illuminated. b) Activation map resulting from a comparison of near vs. far fixations. c) Bar graph displays the magnitude of sustained activation in SPOC (%BSC) for each fixation distance, averaged across subjects. A full-color version of the figure is available online at http://psychology.uwo.ca/culhamlab/PDFs/Culham_etal_CMUchapter8_ColorFigs.pdf

A contrast of near vs. far viewing produced robust activation just posterior to the superior parieto-occipital sulcus in all eight subjects (Figures 8.4b and 8.4c; Talairach coordinates in Table 8.1). The time courses from this region within SPOC showed that, following an initial transient response to a change in gaze distance, there was a sustained response that scaled with the distance of the fixation point (highest for the near point, lowest for the far point). At lower thresholds, we observed activation sites elsewhere in the occipital lobe, though these were less consistent between subjects and less robust than the SPOC focus. Eye tracking outside the scanner indicated that the activation differences were not due to differences in the stability of gaze across the three distances.
Implications

These results suggest that SPOC activation is modulated by gaze distance, which may provide the dorsal stream with information about object distance for action. In order to compute real-world distance, the visual system needs information about where the eyes are currently directed (based on visual signals, proprioceptive signals from the eye muscles, and/or efference copy signals generated with the command to move the eyes) as well as information about the location of the target with respect to gaze (based on retinal location and binocular disparity). We propose that the modulation of SPOC activity by gaze distance provides the first key component necessary for computing target locations for action. Both single neurons of the macaque PRR (Cohen & Andersen, 2002) and a reach-related region of the human brain (in the anteromedial IPS; DeSouza et al., 2000) have responses that can be modulated by directing eye gaze leftward vs. rightward. Such eye-position-dependent modulation properties, sometimes referred to as gain fields, are thought to play an important role in the conversion of information from retinotopic to egocentric (e.g., head-centered) coordinate frames. Our results suggest that gain fields may also exist in the third dimension, depth, providing signals that could be useful for the computation of physical distance, which is particularly important for the accurate control of actions. Indeed, behavioral studies suggest that eye position and vergence play an important role in the accuracy of reaching movements (Bock, 1986; Henriques & Crawford, 2000; Henriques, Klier, Smith, Lowy, & Crawford, 1998; Henriques, Medendorp, Gielen, & Crawford, 2003; Neggers & Bekkering, 1999; van Donkelaar & Staub, 2000).
Because we allowed all three components of the near response (vergence, accommodation, and changes in pupil size) to co-occur, we cannot definitively state which of these components is the driving force in the near-selective response in SPOC. However, past research has suggested that vergence provides a much stronger cue to distance than the other two components (e.g., Foley, 1980).

General Discussion

To summarize, we have reported three studies that highlight the importance of the human SPOC in transporting the arm during reaching movements and in encoding peripersonal space. Spatial encoding of peripersonal space appears to be based on modulation of activation both by object position (with gaze fixed) and by gaze distance (when no object is present).
Although the exact relationships between the activation foci in our three experiments are yet to be determined, these results taken together suggest that the SPOC region in general may be a key node within the dorsal stream for the computation of object distance, as needed to guide actions such as reaching.
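The gain-field mechanism invoked in the Experiment 3 implications can be illustrated with a toy model in which a fixed retinotopic tuning curve is multiplicatively scaled by a gaze-distance-dependent gain. The tuning shape and gain values below are invented for illustration and are not fitted to any data:

```python
import numpy as np

retinal_positions = np.linspace(-10, 10, 21)  # deg relative to the fovea

def response(retinal_pos, gaze_gain):
    """Gaussian retinotopic tuning (peak at +2 deg, sigma 4 deg)
    multiplicatively scaled by an eye-position gain."""
    tuning = np.exp(-(retinal_pos - 2.0) ** 2 / (2 * 4.0 ** 2))
    return gaze_gain * tuning

# Hypothetical gains: stronger tonic drive for near than far fixation,
# in the direction the SPOC results would suggest (values invented).
near_resp = response(retinal_positions, gaze_gain=1.0)
far_resp = response(retinal_positions, gaze_gain=0.4)

# Hallmark of a gain field: the preferred retinal position is unchanged;
# only the response amplitude is scaled by eye position.
assert retinal_positions[np.argmax(near_resp)] == retinal_positions[np.argmax(far_resp)]
```

Because the retinal tuning and the eye-position gain combine multiplicatively, a downstream readout across a population of such units can, in principle, recover egocentric target location in depth from the joint pattern of activity.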
Taken together, the results of the three experiments suggest that multiple factors affect responses within SPOC. Gaze distance alone may suffice to modulate responses in SPOC (Experiment 3). However, even when gaze is held constant, the SPOC response to objects during passive viewing depends on whether or not they are in reachable space (Experiment 2). Furthermore, the SPOC response depends not only on absolute distance but also on the actions performed toward objects: the response to further, but still reachable, objects can be higher than the response to adjacent objects when actions are performed on the objects (Experiment 1). At first this may seem contrary to the findings of Experiments 2 and 3 of a near preference in SPOC; however, the computations for guiding the arm to an object are more complex when the object is further from the hand, and this may recruit SPOC to a greater degree.
In addition, our data suggest that eye position may be another critical component in the relationship between space and hand. That is, tonic signals about current gaze distance (perhaps vergence in particular) may provide useful signals for enhancing the response to stimuli in near space and for computing the egocentric target location to guide arm movements. Other research has also suggested that SPOC may encode eye position information. First, the region is part of a network for eye movements (Paus et al., 1997). Second, SPOC is modulated by saccadic eye movements, even in the dark (Law, Svarer, Rostrup, & Paulson, 1998), supporting our findings that eye position signals are important in the area, even in the absence of other visual stimulation or task demands.
There is growing evidence from past studies, as well as the three new studies presented here, to suggest that SPOC plays an important role in actions such as reaching and pointing; however, it remains to be determined whether SPOC comprises different subregions.
Preliminary comparisons within subjects suggested some overlap between the transport-selective activation in the lower POS in Experiment 1 and the reachable-selective activation in Experiment 2; however, no such intrasubject comparisons were possible between Experiments 1 and 2 on the one hand and Experiment 3 on the other. Figure 8.5 presents a schematic of the activation foci from numerous studies that have reported SPOC activation. Our loose definition of SPOC includes the superior end of the parieto-occipital sulcus, as well as the regions immediately posterior (in the cuneus) and anterior (in the precuneus) to the sulcus. Several characteristics of the SPOC region can be noted in Figure
fMRI Investigations of Reaching and Ego Space
267
Figure 8.5 Summary of activation foci within superior parieto-occipital cortex in nine past studies and the three present studies. Activation foci are shown on the medial surface of one representative subject's left hemisphere. The cortical surface was defined at the gray-white matter border and has been partially inflated to reveal regions within the sulci (concavities, in dark gray) and on the gyri (convexities, in light gray). Foci are schematically represented based on their sizes and anatomical locations relative to the parieto-occipital, calcarine, and cingulate sulci, as depicted in figures from the original studies, as specified in Table 1. A full-color version of the figure is available online at http://psychology.uwo.ca/culhamlab/PDFs/Culham_etal_CMUchapter8_ColorFigs.pdf
8.5. First, the response properties in the region strongly suggest it belongs within the dorsal stream. Using human magnetoencephalography (MEG), Hari and colleagues have reported a focus in the dorsal parieto-occipital sulcus with dorsal stream properties: fast latencies, sensitivity to luminance rather than pattern changes, and motion selectivity (Hari & Salmelin, 1997; Portin et al., 1998; Vanni, Tanskanen, Seppa, Uutela, & Hari, 2001). Human fMRI has found somewhat more inferior foci for luminance (vs. pattern) changes (Dechent & Frahm, 2003) and blinking (Bristow, Frith, & Rees, 2005). Second, SPOC has been commonly activated by the preparation and execution of pointing and reaching movements, with some studies reporting activation anterior to the superior POS in the precuneus (Astafiev et al., 2003; Connolly et al., 2003; Pellijeff et al., 2006; Prado et al., 2005), and some studies also reporting activation in the POS or behind it in the cuneus (Beurze et al., 2007; Connolly et al., 2003; de
Jong et al., 2001). Third, the recent human fMRI work of one group with experience in the neurophysiology of reach-related areas (Galletti et al., 2003) has led to the proposal that the human equivalent of V6 lies posterior to the superior POS, while the human equivalent of V6A is more anterior, on the parietal side of the superior POS. Putative human V6 contains a similar retinotopic map as macaque V6 (Pitzalis, Galletti et al., 2006b), whereas putative human V6A, like macaque V6A, has only weak eccentricity mapping and shows reach-related responses (Pitzalis, Galletti et al., 2006a). In sum, recent evidence from other labs and from the three experiments summarized here suggests that the human SPOC is a dorsal stream area involved in planning actions to locations in near space based on information such as current gaze angle.
Acknowledgments

This research was funded by grants to JCC from the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council (of Canada), the Canadian Foundation for Innovation, and the (Ontario) Premier's Research Excellence Award. CCP was funded by a CIHR grant to the Group on Action and Perception. We thank Claudio Galletti and Patrizia Fattori for explaining the relationship between the parietal reach region and area MIP. We also thank Marlene Behrmann and John Zettel for comments on an earlier draft.
References

Andersen, R. A., & Buneo, C. A. (2002). Intentional maps in posterior parietal cortex. Annual Review of Neuroscience, 25, 189–220.
Astafiev, S. V., Shulman, G. L., Stanley, C. M., Snyder, A. Z., Van Essen, D. C., & Corbetta, M. (2003). Functional organization of human intraparietal and frontal cortex for attending, looking, and pointing. Journal of Neuroscience, 23(11), 4689–4699.
Berti, A., & Frassinetti, F. (2000). When far becomes near: Remapping of space by tool use. Journal of Cognitive Neuroscience, 12(3), 415–420.
Beurze, S. M., De Lange, F. P., Toni, I., & Medendorp, W. P. (2007). Integration of target and effector information in the human brain during reach planning. Journal of Neurophysiology, 97(1), 188–199.
Binkofski, F., Dohle, C., Posse, S., Stephan, K. M., Hefter, H., Seitz, R. J., et al. (1998). Human anterior intraparietal area subserves prehension: A combined lesion and functional MRI activation study. Neurology, 50(5), 1253–1259.
Bock, O. (1986). Contribution of retinal versus extraretinal signals towards visual localization in goal-directed movements. Experimental Brain Research, 64(3), 476–482.
Bremmer, F., Schlack, A., Shah, N. J., Zafiris, O., Kubischik, M., Hoffman, K.-P., et al. (2001). Polymodal motion processing in posterior parietal and premotor cortex: A human fMRI study strongly implies equivalencies between humans and monkeys. Neuron, 29(1), 287–296.
Bristow, D., Frith, C., & Rees, G. (2005). Two distinct neural effects of blinking on human visual processing. Neuroimage, 27(1), 136–145.
Calton, J. L., Dickinson, A. R., & Snyder, L. H. (2002). Non-spatial, motor-specific activation in posterior parietal cortex. Nature Neuroscience, 5(6), 580–588.
Cavina-Pratesi, C., Goodale, M. A., & Culham, J. C. (2007). fMRI reveals a dissociation between grasping and perceiving the size of real 3D objects. Public Library of Science (PLoS) One, 2(5), e424.
Cohen, Y. E., & Andersen, R. A. (2002). A common reference frame for movement plans in the posterior parietal cortex. Nature Reviews Neuroscience, 3(7), 553–562.
Colby, C. L., Duhamel, J.-R., & Goldberg, M. E. (1993). Ventral intraparietal area of the macaque: Anatomic location and visual response properties. Journal of Neurophysiology, 6(3), 902–914.
Colby, C. L., Gattass, R., Olson, C. R., & Gross, C. G. (1988). Topographical organization of cortical afferents to extrastriate visual area PO in the macaque: A dual tracer study. Journal of Comparative Neurology, 269, 392–413.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349.
Connolly, J. D., Andersen, R. A., & Goodale, M. A. (2003). fMRI evidence for a “parietal reach region” in the human brain. Experimental Brain Research, 153(2), 140–145.
Connolly, J. D., Goodale, M. A., Desouza, J. F., Menon, R. S., & Vilis, T. (2000). A comparison of frontoparietal fMRI activation during anti-saccades and anti-pointing. Journal of Neurophysiology, 84(3), 1645–1655.
Cooke, D. F., Taylor, C. S., Moore, T., & Graziano, M. S. (2003). Complex movements evoked by microstimulation of the ventral intraparietal area. Proceedings of the National Academy of Sciences of the United States of America, 100(10), 6163–6168.
Culham, J. C. (2004). Human brain imaging reveals a parietal area specialized for grasping. In N. Kanwisher & J. Duncan (Eds.), Attention and
performance: Vol. 10. Functional brain imaging of human cognition (pp. 417–438). Oxford: Oxford University Press.
Culham, J. C. (2006). Functional neuroimaging: Experimental design and analysis. In R. Cabeza & A. Kingstone (Eds.), Handbook of functional neuroimaging of cognition (2nd ed., pp. 53–82). Cambridge, MA: MIT Press.
Culham, J. C., Cavina-Pratesi, C., & Singhal, A. (2006). The role of parietal cortex in visuomotor control: What have we learned from neuroimaging? Neuropsychologia, 44(13), 2668–2684.
Culham, J. C., Danckert, S. L., DeSouza, J. F., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153(2), 180–189.
Culham, J. C., & Kanwisher, N. G. (2001). Neuroimaging of cognitive functions in human parietal cortex. Current Opinion in Neurobiology, 11(2), 157–163.
Culham, J. C., & Valyear, K. F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology, 16(2), 205–212.
de Jong, B. M., van der Graaf, F. H., & Paans, A. M. (2001). Brain activation related to the representations of external space and body scheme in visuomotor control. Neuroimage, 14(5), 1128–1135.
Dechent, P., & Frahm, J. (2003). Characterization of the human visual V6 complex by functional magnetic resonance imaging. European Journal of Neuroscience, 17(10), 2201–2211.
DeSouza, J. F., Dukelow, S. P., Gati, J. S., Menon, R. S., Andersen, R. A., & Vilis, T. (2000). Eye position signal modulates a human parietal pointing region during memory-guided movements. Journal of Neuroscience, 20(15), 5835–5840.
DiFranco, D., Muir, D. W., & Dodwell, P. C. (1978). Reaching in very young infants. Perception, 7, 385–392.
di Pellegrino, G., Ladavas, E., & Farne, A. (1997). Seeing where your hands are. Nature, 388, 730.
Eskandar, E. N., & Assad, J. A. (1999). Dissociation of visual, motor and predictive signals in parietal cortex during visual guidance. Nature Neuroscience, 2(1), 88–93.
Foley, J. M. (1980). Binocular distance perception. Psychological Review, 87(5), 411–434.
Frak, V., Paulignan, Y., Jeannerod, M., Michel, F., & Cohen, H. (2006). Prehension movements in a patient (AC) with posterior parietal cortex damage and posterior callosal section. Brain and Cognition, 60(1), 43–48.
Frey, S. H., Vinton, D., Norlund, R., & Grafton, S. T. (2005). Cortical topography of human anterior intraparietal cortex active during visually
guided grasping. Brain Research, Cognitive Brain Research, 23(2–3), 397–405.
Gail, A., & Andersen, R. A. (2006). Neural dynamics in monkey parietal reach region reflect context-specific sensorimotor transformations. Journal of Neuroscience, 26(37), 9376–9384.
Galletti, C., Kutz, D. F., Gamberini, M., Breveglieri, R., & Fattori, P. (2003). Role of the medial parieto-occipital cortex in the control of reaching and grasping movements. Experimental Brain Research, 153(2), 158–170.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Goltz, H. C., Dukelow, S. P., DeSouza, J. F. X., Culham, J. C., van den Berg, A. V., Goosens, H. H. L., et al. (2001). A putative homologue of monkey area VIP in humans. Paper presented at the Society for Neuroscience, San Diego, CA.
Goodale, M. A., & Jakobson, L. S. (1992). Action systems in the posterior parietal cortex. Behavioral and Brain Sciences, 15(4), 747.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15(1), 20–25.
Grafton, S. T., Fagg, A. H., Woods, R. P., & Arbib, M. A. (1996). Functional anatomy of pointing and grasping in humans. Cerebral Cortex, 6(2), 226–237.
Grefkes, C., & Fink, G. R. (2005). The functional organization of the intraparietal sulcus in humans and monkeys. Journal of Anatomy, 207(1), 3–17.
Grefkes, C., Ritzl, A., Zilles, K., & Fink, G. R. (2004). Human medial intraparietal cortex subserves visuomotor coordinate transformation. Neuroimage, 23(4), 1494–1506.
Halligan, P. W., & Marshall, J. C. (1991). Left neglect for near but not far space in man. Nature, 350(6318), 498–500.
Halverson, H. M. (1931). An experimental study of prehension in infants by means of systematic cinema records. Genetic Psychology Monographs, 10, 110–286.
Hari, R., & Salmelin, R. (1997). Human cortical oscillations: A neuromagnetic view through the skull. Trends in Neurosciences, 20(1), 44–49.
Hasebe, H., Oyamada, H., Kinomura, S., Kawashima, R., Ouchi, Y., Nobezawa, S., et al. (1999). Human cortical areas activated in relation to vergence eye movements: A PET study. Neuroimage, 10(2), 200–208.
Hasson, U., Harel, M., Levy, I., & Malach, R. (2003). Large-scale mirror-symmetry organization of human occipito-temporal object areas. Neuron, 37(6), 1027–1041.
Henriques, D. Y., & Crawford, J. D. (2000). Direction-dependent distortions of retinocentric space in the visuomotor transformation for pointing. Experimental Brain Research, 132(2), 179–194.
Henriques, D. Y., Klier, E. M., Smith, M. A., Lowy, D., & Crawford, J. D. (1998). Gaze-centered remapping of remembered visual space in an open-loop pointing task. Journal of Neuroscience, 18(4), 1583–1594.
Henriques, D. Y., Medendorp, W. P., Gielen, C. C., & Crawford, J. D. (2003). Geometric computations underlying eye-hand coordination: Orientations of the two eyes and the head. Experimental Brain Research, 152(1), 70–78.
Iriki, A., Tanaka, M., & Iwamura, Y. (1996). Coding of modified body schema during tool use by macaque postcentral neurones. Neuroreport, 7(14), 2325–2330.
Jeannerod, M. (1981). Intersegmental coordination during reaching at natural visual objects. In J. Long & A. Baddeley (Eds.), Attention and performance (Vol. 60, pp. 153–168). Hillsdale, NJ: Erlbaum.
Jeannerod, M. (1984). The timing of natural prehension movements. Journal of Motor Behavior, 16(3), 235–254.
Jeannerod, M. (1986). Mechanisms of visuomotor coordination: A study in normal and brain-damaged subjects. Neuropsychologia, 24(1), 41–78.
Jeannerod, M., Decety, J., & Michel, F. (1994). Impairment of grasping movements following a bilateral posterior parietal lesion. Neuropsychologia, 32(4), 369–380.
Karnath, H. O., & Perenin, M. T. (2005). Cortical control of visually guided reaching: Evidence from patients with optic ataxia. Cerebral Cortex, 15(10), 1561–1569.
Kawashima, R., Naitoh, E., Matsumura, M., Itoh, H., Ono, S., Satoh, K., et al. (1996). Topographic representation in human intraparietal sulcus of reaching and saccade. Neuroreport, 7, 1253–1256.
Kertzman, C., Schwarz, U., Zeffiro, T. A., & Hallett, M. (1997). The role of posterior parietal cortex in visually guided reaching movements in humans. Experimental Brain Research, 114(1), 170–183.
Ladavas, E. (2002). Functional and dynamic properties of visual peripersonal space. Trends in Cognitive Sciences, 6(1), 17–22.
Ladavas, E., di Pellegrino, G., Farne, A., & Zeloni, G. (1998). Neuropsychological evidence of an integrated visuotactile representation of peripersonal space in humans. Journal of Cognitive Neuroscience, 10(5), 581–589.
Ladavas, E., Farne, A., Zeloni, G., & di Pellegrino, G. (2000). Seeing or not seeing where your hands are. Experimental Brain Research, 131(4), 458–467.
Ladavas, E., Zeloni, G., & Farne, A. (1998). Visual peripersonal space centred on the face in humans. Brain, 121(Pt. 12), 2317–2326.
Law, I., Svarer, C., Rostrup, E., & Paulson, O. B. (1998). Parieto-occipital cortex activation during self-generated eye movements in the dark. Brain, 121(Pt. 11), 2189–2200.
Malach, R., Levy, I., & Hasson, U. (2002). The topography of high-order human object areas. Trends in Cognitive Sciences, 6(4), 176–184.
Maravita, A., & Iriki, A. (2004). Tools for the body (schema). Trends in Cognitive Sciences, 8(2), 79–86.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Neggers, S. F., & Bekkering, H. (1999). Integration of visual and somatosensory target information in goal-directed eye and arm movements. Experimental Brain Research, 125(1), 97–107.
Paus, T., Jech, R., Thompson, C. J., Comeau, R., Peters, T., & Evans, A. C. (1997). Transcranial magnetic stimulation during positron emission tomography: A new method for studying connectivity of the human cerebral cortex. Journal of Neuroscience, 17(9), 3178–3184.
Pellijeff, A., Bonilha, L., Morgan, P. S., McKenzie, K., & Jackson, S. R. (2006). Parietal updating of limb posture: An event-related fMRI study. Neuropsychologia, 44(13), 2685–2690.
Pitzalis, S., Galletti, C., Huang, R. S., Patria, F., Committeri, G., Galati, G., et al. (2006a). Visuotopic properties of the putative human homologue of the macaque V6A. Paper presented at the Organization for Human Brain Mapping, Florence, Italy.
Pitzalis, S., Galletti, C., Huang, R. S., Patria, F., Committeri, G., Galati, G., et al. (2006b). Wide-field retinotopy defines human cortical visual area V6. Journal of Neuroscience, 26(30), 7962–7973.
Pitzalis, S., Sereno, M., Committeri, G., Galati, G., Fattori, P., & Galletti, C. (2006). A possible human homologue of the macaque V6A. Journal of Vision, 6(6), 536a.
Portin, K., Salenius, S., Salmelin, R., & Hari, R. (1998). Activation of the human occipital and parietal cortex by pattern and luminance stimuli: Neuromagnetic measurements. Cerebral Cortex, 8(3), 253–260.
Prado, J., Clavagnier, S., Otzenberger, H., Scheiber, C., Kennedy, H., & Perenin, M. T. (2005). Two cortical systems for reaching in central and peripheral vision. Neuron, 48(5), 849–858.
Previc, F. H. (1998). The neuropsychology of 3-D space. Psychological Bulletin, 124(2), 123–164.
Quinlan, D. J., & Culham, J. C. (2007). fMRI reveals a preference for near viewing in the human superior parieto-occipital cortex. Neuroimage, 36(1), 167–187.
Richter, H. O., Costello, P., Sponheim, S. R., Lee, J. T., & Pardo, J. V. (2004). Functional neuroanatomy of the human near/far response to blur cues: Eye-lens accommodation/vergence to point targets varying in depth. European Journal of Neuroscience, 20(10), 2722–2732.
Richter, H. O., Lee, J. T., & Pardo, J. V. (2000). Neuroanatomical correlates of the near response: Voluntary modulation of accommodation/vergence in the human visual system. European Journal of Neuroscience, 12(1), 311–321.
Sereno, M. I., & Huang, R. S. (2006). A human parietal face area contains aligned head-centered visual and tactile maps. Nature Neuroscience, 9(10), 1337–1343.
Singhal, A., Kaufman, L., Valyear, K., & Culham, J. C. (2006). fMRI reactivation of the human lateral occipital complex during delayed actions to remembered objects. Visual Cognition, 14(1), 122–125.
Smeets, J. B., & Brenner, E. (1999). A new view on grasping. Motor Control, 3(3), 237–271.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (2000). Intention-related activity in the posterior parietal cortex: A review. Vision Research, 40(10–12), 1433–1441.
van Donkelaar, P., & Staub, J. (2000). Eye-hand coordination to visual versus remembered targets. Experimental Brain Research, 133(3), 414–418.
Vanni, S., Tanskanen, T., Seppa, M., Uutela, K., & Hari, R. (2001). Coinciding early activation of the human primary visual cortex and anteromedial cuneus. Proceedings of the National Academy of Sciences, U.S.A., 98(5), 2776–2780.
von Hofsten, C. (1979). Development of visually-directed reaching: The approach phase. Journal of Human Movement Studies, 5, 160–178.
von Hofsten, C. (1982). Eye-hand coordination in the newborn. Developmental Psychology, 18, 450–461.
Weiss, P. H., Marshall, J. C., Wunderlich, G., Tellmann, L., Halligan, P. W., Freund, H. J., et al. (2000). Neural consequences of acting in near versus far space: A physiological basis for clinical dissociations. Brain, 123(Pt. 12), 2531–2541.
9 The Growing Body in Action
What Infant Locomotion Tells Us About Perceptually Guided Action
Karen E. Adolph
A Changeable Body in a Variable World

Twenty years ago, Eleanor Gibson (1987) asked, “What does infant perception tell us about theories of perception?” (p. 515). Her answer was that theories of perception, typically built on adults’ behavior in esoteric recognition and discrimination tasks, must take into account the perceptual accomplishments of young infants. Although infants cannot recognize letters, follow researchers’ instructions, or provide researchers with verbal judgments about their percepts, they can generate perceptual information through spontaneous exploratory movements and use it for guiding motor action. In Gibson’s (1987) words, “Present-day theories of perception are going to have to give an account of how perception gives rise to and guides action” (p. 518). Perhaps the fact that infants make such difficult subjects in traditional perception paradigms encouraged researchers to recognize the links between perception and motor action. Infants create visual, tactile, vestibular, and muscle-joint information by moving their eyes, heads, and bodies. The consequent perceptual information
can provide the basis for selecting and modifying future movements because it is both exteroceptive, specifying events and objects in the environment, and proprioceptive, specifying the current status of the body and its participation in ongoing events.
Here, I pose Gibson’s question once again, but now in the context of a new literature on the perceptual guidance of motor action. The notion that a central function of perception is to guide motor action is no longer new. But, in the midst of a new generation of theorizing based on adults’ behavior in esoteric motor decision tasks, my answer will be that perception-action studies with infants tell us that theories of perception-action coupling will have to take learning and development into account. Issues of learning and development are accentuated in research with infants because changes in infants’ bodies and motor skills are especially dramatic, and encounters with novel features of the environment are especially pronounced compared with later periods of life. However, learning and development are not limited to infancy. Throughout the life span, bodily propensities change due to gains and losses in weight, muscle stiffness, and strength. New motor skills are acquired and old ones are lost. The environment still holds some surprises. At any age, the central problem for understanding perceptually guided action is how observers cope with a changeable body in a variable world.

Embodied and Embedded Action

Motor actions are always embodied and embedded (Bernstein, 1996; Clark, 1997). The functional outcome of motor actions is inextricably bound to the biomechanical status of the body (the size, shape, mass, compliance, strength, flexibility, and coordination of the various body parts) and the physical properties of the surrounding environment (the surfaces and media that support the body, the objects toward which movements are directed, and the effects of gravity acting on the various body parts).
A few haphazard feather-kicks of a 12-week-old fetus can propel it somersaulting, nearly weightlessly, through the buoyant amniotic fluid, whereas powerful muscle forces, precisely timed to exploit inertia, are required for an expert gymnast to launch a somersault at the end of a tumbling run. From fetus to skilled athlete, motor actions are always constrained or facilitated by the body’s current dimensions and propensities in the context of
an immediate environment with particular physical supports and hindrances.
Depending on the current constellation of body and environmental factors, the same functional outcome can require very different muscle actions and, reciprocally, the same muscle actions can result in very different functional outcomes (Bernstein, 1996). For example, for fetuses to bring their hands to their mouths in the first weeks of gestation, they must flex their arms at the shoulder because their arm buds are so short (Moore & Persaud, 1998). To perform the same hand-to-mouth behavior several weeks later, fetuses must bend their arms at the elbow to take their longer limbs into account (Robinson & Kleven, 2005). The same leg kicks that somersault the fetus through the amniotic fluid early in gestation fail to extend the legs toward the end of gestation, when the growing fetus is pressed against the uterine wall (de Vries, Visser, & Prechtl, 1982). After birth, vigorous leg kicks flex and extend the legs, but without the buoyancy of the amniotic fluid, gravity keeps the infant rooted in place (Thelen & Fisher, 1983).
The changing constraints of the body and environment are not limited to developmental changes (such as the lengthening and differentiation of the fetal arm and the decrease in space within the uterus). At any point in the lifespan, the facts of embodiment can vary due to seemingly insignificant factors in the course of everyday activity (Adolph, 2002; Reed, 1989). Carrying an object can alter the body’s functional dimensions. Variations in clothing and footgear can affect the ability to create resistive forces. Leaning forward, lifting an arm, turning the head, or even drawing a deep breath can create moment-to-moment changes in the location of the body’s center of mass and, as a consequence, pose continually changing demands for maintaining balance.
Similarly, variability and novelty in the environmental context are the rule, not the exception. Actions are typically performed in a world that is cluttered with potential obstacles and opportunities. Objects and surfaces have dimensions, material properties, and locations that can change from one encounter to the next (think of chairs, doors, and the condition of the sidewalk due to weather and debris). Many objects move and participate in events: people, animals, balls, and cars. As the Greek philosopher Heraclitus put it, “No man ever steps in the same river twice, for it is not the same river and he is not the same man.”
Affordances

The Gibsons’ concept of affordances captures the functional significance of embodied and embedded action (E. J. Gibson, 1982; E. J. Gibson & Pick, 2000; J. J. Gibson, 1979). Affordances are possibilities for motor action. The probability of performing an action successfully depends on the fit between the behaviorally relevant properties of the body and those of the surrounding environment (Adolph & Berger, 2006; Warren, 1984). As such, affordances reflect the objective state of affairs. Actions have a certain probability of success, regardless of whether actors perceive, misperceive, or take advantage of the possibilities. Thus, the notion of affordances is distinct from claims about how affordances might be perceived or exploited, and the description of an affordance involves only the relationship between the relevant physical features of the body and environment.
Because affordances are relational, the facts of embodiment must be taken with reference to the properties of the environment in which the body is embedded, and vice versa. For example, walking over open ground is possible only when walkers have sufficient strength, postural control, and endurance relative to the length of the path and the slant, rigidity, and texture of the ground surface. Walking between two obstacles is possible only when walkers’ largest body dimensions are smaller than the size of the opening (Warren & Whang, 1987). In fact, bodily propensities and environmental properties are so intimately connected for supporting motor actions that changes in a single factor on either side of the affordance relationship alter the probability of successful performance.
For example, Figure 9.1A shows the affordance function for a typical 14-month-old toddler walking down an adjustable sloping ramp. On shallower slopes, from 0° to 20°, the probability of walking successfully is close to 1.0.
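An affordance function of this kind is built directly from trial outcomes: at each slope, the proportion of successful attempts estimates the probability of success, and the threshold is the point where that probability crosses 0.50. The sketch below (in Python, with invented success/failure counts that merely echo the pattern described for the toddler; the chapter's actual data and estimation procedures are not reproduced here) illustrates the computation.

```python
# Hypothetical illustration: estimating an affordance threshold, defined as
# the slope at which the probability of successful walking is 0.50.
# The counts below are invented for illustration only.

def success_rates(trials):
    """trials: {slope_deg: (successes, failures)} -> {slope_deg: P(success)}."""
    return {s: ok / (ok + fail) for s, (ok, fail) in sorted(trials.items())}

def affordance_threshold(trials, p=0.50):
    """Linearly interpolate the slope where P(success) falls through p."""
    rates = success_rates(trials)
    slopes = sorted(rates)
    for lo, hi in zip(slopes, slopes[1:]):
        r_lo, r_hi = rates[lo], rates[hi]
        if r_lo >= p > r_hi:  # success rate crosses p between these slopes
            frac = (r_lo - p) / (r_lo - r_hi)
            return lo + frac * (hi - lo)
    return None  # no crossing within the tested range

# Invented counts: near-certain success up to ~20 deg, certain failure by 32 deg.
trials = {0: (10, 0), 10: (10, 0), 20: (9, 1), 24: (7, 3),
          28: (3, 7), 32: (0, 10), 36: (0, 10)}
threshold = affordance_threshold(trials)  # -> 26.0 with these counts
```

With these hypothetical counts the interpolated threshold comes out at 26°, the value described for the toddler in Figure 9.1A; the published studies estimate thresholds from many more trials with formal procedures, so this is only a sketch of the threshold concept.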
On slopes between 20° and 32°, tiny changes in the degree of slant cause the probability of success to drop from 1.0 to 0. The affordance threshold, defined here as the slope where the probability of success is 0.50, is approximately 26°. Figure 9.1B shows the affordance function for an adult woman walking through an adjustable doorway (Franchak & Adolph, 2007). For wider doorways, larger than 31 cm, the probability of walking successfully is 1.0. With small changes in doorway width, the probability of success drops from 1.0 to 0. The estimated affordance threshold is 30 cm.
Reciprocally, changes in body dimensions and propensities (both structural and dynamic aspects of the body) affect the probability
Figure 9.1 Affordance thresholds. (A) Affordance function for one 14-month-old infant walking down slopes. The function is calculated from the ratio of successful and failed attempts to walk. The affordance threshold is the estimated slope where the probability of walking successfully is 0.50. (B) Affordance function and threshold for one woman walking through doorways. (C) Developmental changes in affordance thresholds for one infant walking down slopes. Figure adapted from R. Kail (Ed.), Advances in Child Development & Behavior, by K. E. Adolph, “Learning to keep balance,” 2002, with permission from Elsevier Science. (D) Developmental changes in affordance thresholds for one pregnant woman walking through doorways.
of performing these same actions successfully. The changes may be temporary: For example, affordance thresholds for infants walking down slopes decreased by 4.4°, on average, while the infants were loaded with lead weights on their shoulders (Adolph & Avolio, 2000). Similarly, affordance thresholds for navigating between two obstacles would increase while carrying a pizza box or wearing bulky clothing. Or, changes may be more permanent as the result of injury, aging, growth, weight gain/loss, increase/decrease in motor proficiency, and so on. Figure 9.1C shows the change in affordance thresholds for walking down slopes for one infant over weeks of testing (Adolph, 1997). With changes in the infant’s body dimensions and walking skill, he could manage steeper slopes each week than the last. Figure 9.1D shows the change in affordance thresholds for passing through doorways for one woman over the course of her pregnancy (Franchak & Adolph,
2007). The increase in her body dimensions and the decrease in her ability to contort and compress her abdomen caused a corresponding increase in affordance thresholds over weeks of testing.

How Development Changes Affordances for Action

The examples of infants walking down slopes and pregnant women walking through doorways illustrate how rapidly and dramatically affordances can change with development. A passable doorway at one month becomes impassable a month later. An impossibly steep slope this week becomes navigable a week later. In essence, when possibilities for motor action are plotted against a continuously changing feature of the environment (as in Figures 9.1A–B), developmental changes in body dimensions and skills shift the affordance function back and forth along the x-axis, causing a corresponding shift in the value of the threshold. Developmental changes in the body, its propensities, and the surrounding environment are especially striking in the beginning of life.

Body Growth

The rate of fetal body growth, for example, is far more dramatic than the external view of mothers’ bulging abdomens might suggest. At 4 weeks postconception, the average embryo is 4 mm long, about the size of a pea (Moore & Persaud, 1998). The head comprises one half of the body length. From pea-sized embryo to term newborn, fetuses increase their height by approximately 8,000% and their weight by 42,500%. By the end of gestation at 38 to 40 weeks, the newborn is 57 cm long, and the head comprises one fourth of the body length (Ounsted & Moar, 1986). By comparison, adults’ head length is one eighth of their total body height (C. E. Palmer, 1944). Rapid and dramatic body growth continues during infancy. Like fetal growth, the rates of change during infancy differ for different parts of the body. Gains in infants’ body length are faster than gains in weight and head circumference, so that overall body proportions undergo a general slimming.
In essence, infants begin to grow into their large heads. Their top-heavy bodies become increasingly cylindrical and their center of mass moves from the bottom of the ribcage
The Growing Body in Action
to below the belly button. After 2 years of age, rate of body growth slows until the adolescent growth spurt. Continuous body growth as represented by the smooth curves on a standard growth chart is actually a misrepresentation. Children's body growth—from fetus to adolescent—is episodic, not continuous (Johnson, Veldhuis, & Lampl, 1996; Lampl, 1993; Lampl, Ashizawa, Kawabata, & Johnson, 1998; Lampl & Jeanty, 2003; Lampl & Johnson, 1993; Lampl, Johnson, & Frongillo, 2001). That is, brief, 24-hr periods of extremely rapid growth are interspersed with long periods of stasis during which no growth occurs for days or weeks on end. Figure 9.2A shows the smooth growth functions on a standard growth chart for height, and Figure 9.2B shows the actual episodic increases in height for one infant. Episodic growth is characteristic of changes in height, weight, head circumference, and leg bone growth. For example, daily measurements of infants' height show increases of 0.5 to 1.65 cm punctuated by 2- to 28-day periods of no growth (Lampl, Veldhuis, & Johnson, 1992). The timing of growth spurts, the amplitude of the changes, and the length of the plateaus when no growth occurs show large intra- and intersubject variability. The long plateaus with no growth are not related to stress or illness. Rather, normal, healthy children grow in fits and starts. Even within a 24-hr period, growth is episodic. Children grow more at night while lying down than during the day while standing and walking, especially in their weight-bearing extremities (Lampl, 1992; Noonan et al., 2004).

Motor Proficiency and New Perception-Action Systems

During the same time period that infants' body size and body proportions are changing, so are their abilities to perform various motor skills (for reviews, see Adolph & Berger, 2005, 2006; Bertenthal & Clifton, 1998).
For example, between 5 and 8 months, most infants begin learning to sit (Bly, 1994; Frankenburg & Dodds, 1967). At first, balance is so precarious that they must prop themselves upright in a tripod position by supporting their body weight on their arms between their outstretched legs. Their hamstrings are so loose that they lose balance in a 360° arc, including falling nose-to-knees. Gradually, only one arm must be used for balance. Then, the hands become freed from supporting functions, first momentarily, and
Figure 9.2 Growth curves for height. (A) Standard growth curves from birth to 18 years (y-axis: length in cm; x-axis: age in years). Data are mathematically smoothed and averaged over children. Dashed line represents boys. Solid line represents girls. Adapted from growth charts developed by the National Center for Health Statistics in collaboration with the National Center for Chronic Disease Prevention and Health Promotion (2000). (B) Microgenetic episodic growth curve for one exemplar infant (y-axis: length in cm; x-axis: age in months). Each vertical line represents daily replicate observations. Reprinted from Science, 258, by M. Lampl, J. D. Veldhuis, and M. L. Johnson, "Saltation and Stasis: A Model of Human Growth," pp. 801–803, 1992, with permission from the American Association for the Advancement of Science.
later for extended periods. Finally, infants can turn their heads, lean in various directions, and hold objects and manipulate them without losing balance. Most infants' first success at forward locomotion is crawling (others may initially "log roll," "bum shuffle," or "cruise"). Many crawlers first master prone progression by belly crawling, where their abdomens rest on the ground during part of each crawling cycle (Adolph, Vereijken, & Denny, 1998). They use their arms, legs, and bellies in various, idiosyncratic combinations, sometimes pushing with only one limb in a girdle and dragging the lame arm or leg behind, sometimes pushing with first the knee then the foot on one leg, and sometimes launching themselves from knees or feet onto belly during each cycle. They may move arms and legs on alternate sides of the body together like a trot, ipsilateral limbs together like a lumbering bear, lift front then back limbs into the air like a bunny hop, and display irregular patterns of interlimb timing—the possibilities are immense because balance is so unconstrained with the abdomen resting on the floor. Figure 9.3A shows one infant's pattern of interlimb timing over a series of cycles (Adolph et al., 1998). First, he moved his left arm forward. Then, he pushed with both legs and lifted his abdomen off the floor. Next he moved his right arm forward. Use of the right leg was especially variable, and for a long period the infant maintained the leg aloft. Despite tremendous intra- and inter-subject variability in crawling movements, proficiency at belly crawling increases with each week of practice, as indicated by improvements in crawling speed (Figure 9.4A) and step length (Adolph et al., 1998). Eventually (typically, at about 8 months of age), infants acquire sufficient strength and balance control to crawl with their abdomens off the floor.
In contrast to the variability endemic in belly crawling, the timing in interlimb coordination of hands-and-knees gaits is nearly uniform (Adolph et al., 1998; Freedland & Bertenthal, 1994). Within the first week or two after achieving hands-and-knees crawling, infants move their arms and legs in an alternating near-trot. Figure 9.3B shows the pattern of interlimb timing in one infant that is typical of hands-and-knees gaits. She moved her right arm and left leg together and her left arm and right leg together, with the arms slightly preceding the legs in each case. Again, proficiency at crawling shows dramatic improvements over the course of a few months (Figure 9.4B), with increases in the amplitude and speed of crawling movements (Adolph et al., 1998).
Figure 9.3 Patterns of interlimb timing in (A) belly crawling, and (B) hands-and-knees crawling (x-axes: time in seconds; y-axes: limb). Shaded regions represent time when the limb (or belly) was supporting the body in stance. Open regions represent time when the limb (or belly) was moving forward in swing.
Typically, coincident with crawling at about 10 months of age (Frankenburg & Dodds, 1967), infants begin to master upright postures. They pull to a stand against furniture and later cruise—that is, move sideways in an upright position while holding onto furniture for support. Initially, cruisers' arms do most of the work of supporting body weight, maintaining balance, and steering; the legs hold only part of the body weight (many infants begin cruising up on their toes) and often become crossed and entangled. As the legs take on more of a supporting function, infants put less weight on their arms
Figure 9.4 Developmental changes in velocity during (A) belly crawling, (B) hands-and-knees crawling, and (C) walking. Crawling data were adapted from Child Development, 69, by K. E. Adolph, B. Vereijken, and M. Denny, "Learning to crawl," pp. 1299–1312, 1998, with permission from Blackwell Publishers.
and the coordination between arm and leg movements improves (Vereijken & Adolph, 1999; Vereijken & Waardenburg, 1996). Walking is the most heralded of infants' motor skills, probably because it is such an obvious step toward adultlike behavior. On average, infants take their first toddling steps toward the end of their first year (Frankenburg & Dodds, 1967), but like most motor skills, the range in the age of walking onset is extremely wide (9 to 16 months of age). Like crawling, improvements in walking are most rapid and dramatic in the first few months after onset, and thereafter begin to asymptote and reflect more subtle types of changes (Adolph, Vereijken, & Shrout, 2003; Bril & Breniere, 1992; Bril & Ledebt, 1998). Asymptotic performance curves are typical of motor performance, including measures of errors (e.g., missteps), variability (e.g., coefficient of variation), speed, amplitude, and accuracy (Schmidt & Lee, 1999). Infants' step lengths increase, the lateral distance between
their legs and out-toeing decrease, both intra- and interlimb coordination become less variable and more efficient, and as shown in Figure 9.4C (Adolph, Garciaguirre, & Badaly, 2007a; Garciaguirre, Adolph, & Shrout, 2007), infants walk faster. Note that novice walking infants move twice as fast in their first weeks of walking as experienced crawlers do in their last weeks of crawling. Once infants can string a series of consecutive walking steps together, the ability to walk provides them with a more time-efficient mode of travel. The developmental changes in sitting, crawling, cruising, and walking reflect more than the acquisition of motor proficiency. The transitions between each of these postural milestones reflect the acquisition of new perception-action systems (Adolph, 2002, 2005). Whereas changes in motor proficiency shift the affordance function back and forth along the x-axis (as illustrated in Figure 9.1), transitions to new perception-action systems create new affordance functions. Each posture represents a different problem space defined by a unique set of parameters for maintaining balance. Each has a different key pivot around which the body rotates (the hips for sitting, the wrists for crawling, the shoulders for cruising, and the ankles for walking) and a different region of permissible postural sway within which the body can rotate before falling. Infants use different muscle groups for keeping the body upright and for propelling it forward. There are different vantage points for viewing the ground ahead, correlations between visual and vestibular information, access to mechanical information from touching the ground, and so on.
In Campos and colleagues' (2000) words,

The mapping between vision and posture that results from crawling experience will need to be remapped as the infant acquires new motor skills such as standing and walking…In fact, remapping is likely to occur with the acquisition of every new motor skill in a continuously coevolving perception-action cycle. (p. 174)
Each posture requires moving different body parts and balance is controlled by different sources of visual, vestibular, and mechanical information. After the infancy period, it is hard to imagine comparable novelty in perception-action systems, but bicycling, swimming, and swinging arm over arm along monkey bars are likely candidates because these skills involve such different constraints on maintaining balance. In bicycling, for example, the key pivot is at the bottom of the front wheel and the region of permissible postural sway depends on the angles of the wheels as well as the body's position. While swinging on monkey bars, the body rotates around the shoulders, and the arms must support the entire body weight. Rather than representing the acquisition of new perception-action systems, most of the locomotor skills acquired during childhood and adulthood (driving a car, motorcycling, rock climbing, ice skating, skiing, surf boarding, etc.) may reflect the growth and adaptation of existing perception-action systems for sitting, crawling, cruising, and walking.

Environment

Developmental changes in infants' bodies and skills bring about corresponding changes in the environment. New postures and vantage points and new and improved forms of mobility allow infants to gain access to new places and surfaces. Instead of looking at the legs of the coffee table, they can peer over the top. Instead of looking at an object from across the room, they can go to retrieve it. Rather than waiting for caregivers to transport them, they can go to see things for themselves. Features of the environment that adults take for granted, such as sloping ground and narrow openings between obstacles, are novel for newly mobile infants. Moreover, developmental expansion in the environment is not solely dependent on developmental changes in mobility. A leaner, more mature looking body and better performance of various locomotor skills may inspire caregivers to provide infants with greater access to the environment. Now, parents may put infants on the floor with greater frequency, remove the gate blocking the stairs, and allow infants to travel into an adjoining room on their own.
Summary: Changing Affordances

Changes in infants' bodies, propensities, and environments occur concurrently, but along different developmental trajectories. Singly, or in combination, these factors can change the constraints on action. Episodic body growth means that infants begin a new day with a new body size and a new set of body proportions. In particular, less top-heavy body proportions and increased muscle mass relative to body fat introduce new possibilities for keeping balance in stance and locomotion. Rapid improvements in motor proficiency mean that affordances for balance and locomotion change from week to week, and these changes will be most pronounced in the first months after infants acquire a new perception-action system. Moreover, because the acquisition of sitting, crawling, cruising, and walking postures appears staggered over several months, at each point in development, infants are experts in an earlier developing posture and novices in a later developing one. Thus, a situation that affords balance in an experienced sitting posture may be impossible in a novice crawling posture. Developmental changes in infants' environments mean that infants are likely to encounter novel features of the environment that afford or preclude the use of their newly acquired perception-action systems.

Infants' Perception of Affordances

The critical question for understanding the adaptive control of action, of course, is whether affordances are perceived—whether children and adults gear their motor decisions to the actual possibilities for action (for reviews, see Adolph, 1997; Adolph & Berger, 2005, 2006; Adolph, Eppler, & Gibson, 1993). Given the constant flux of developmental changes, the challenge for infants is to detect the new constraints on action. Infants must continually update their assessment of affordances to take their new bodies, skills, and environments into account. I begin with the problem for perception and then describe three case studies (avoiding a precipice, navigating slopes, and crossing bridges) that highlight the importance of learning and development in the perceptual guidance of locomotion.
The Perceptual Problem

Perceiving affordances, like any perceptual problem, begins with a description of what there is to be perceived (J. J. Gibson, 1979). As illustrated by the individual data in Figures 9.1A-B, like many psychophysical functions, the curves that characterize the transition from possible to impossible actions are typically steep, S-shaped functions with long extended tails. Although affordance thresholds
for walking down slopes and passing through doorways vary widely between individuals, the functions across individuals are similarly steep. More generally, regardless of the location of the affordance function along the x-axis, most actions are either possible or impossible for a wide range of situations that lie along the tails of the function, and have a shifting probability of success for a narrow range of situations that lie along the inflection of the function. Consider, for example, the entire range of slopes, from 0° to 90°, and the entire range of doorways (or more generally, openings between obstacles), from 0 cm to infinitely wide. Regardless of the particulars of the current situation, the inflection of the affordance function is likely to occupy only a small section of the range of possibilities. The consistent shape of the affordance function simplifies the problem for perception: In most cases, perceivers must determine only which tail of the function best describes the current situation (e.g., the slope is perfectly safe or impossibly steep; the doorway is completely passable or absolutely impassable). However, because of the particulars of the current situation (the slope is slippery, the walker is carrying a load, walking proficiency has improved, etc.), the location of the inflection of the function along the x-axis—whether the person can manage steep slopes or only shallow ones, narrow doorways or only wider ones—can vary widely from one moment to the next. In fact, an unlimited number of variables, including factors such as slant, friction, load, and walking proficiency, create an n-dimensional axis, along which the affordance function slides from moment to moment.
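The steep, S-shaped affordance function can be sketched as a logistic curve whose inflection (the affordance threshold) slides along the x-axis as the body and skills change. The sketch below is a minimal illustration of that idea; the threshold and steepness values are invented for the example, not fitted to any of the data reported here.

```python
import math

def p_success(slope_deg, threshold_deg, steepness=1.5):
    """Probability of, e.g., walking safely down a slope of a given angle,
    modeled as a descending logistic function. The inflection sits at
    threshold_deg; steepness controls how abruptly the curve drops from
    the 'safe' tail to the 'impossible' tail."""
    return 1.0 / (1.0 + math.exp(steepness * (slope_deg - threshold_deg)))

# A hypothetical novice walker (threshold near 10 deg) vs. the same child
# weeks later (threshold shifted along the x-axis to 24 deg).
for threshold in (10, 24):
    probs = [round(p_success(s, threshold), 2) for s in (0, 10, 24, 40)]
    print(threshold, probs)
```

For most of the range, the probabilities sit at 1.0 or 0.0 (the long tails); only slopes near the threshold yield intermediate probabilities, which is why developmental shifts of the threshold matter so much for the perceiver.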
The changing location of the affordance function complicates the problem for perception: Observers must determine where along the x-axis (or, more generally, n-dimensional axis) the transition from possible to impossible occurs, that is, the region where their threshold currently lies. Figures 9.1C and 9.1D show dramatic developmental changes in the location of the affordance threshold for individual participants. To illustrate further, the range in affordance thresholds for walking down slopes for 14-month-olds (± 1 week of age) is quite large, from 4° to 28° (Adolph, 1995; Adolph & Avolio, 2000). The wide range in affordance thresholds given the narrow spread in chronological age reflects the variable timing and amplitude of developmental changes. In addition, in the event that the affordance lies along the inflection of the curve (e.g., the probability of walking safely down the
290
Embodiment, Ego-Space, and Action
slope or t hrough t he doorway i s be tween 0 a nd 1.0), or perceivers are unsure about the precise location of the transition from possible to impossible, they must weigh the probability of success against the penalties f or u nder- a nd o verestimation er rors. Thus, m otor dec isions—like a ny per ceptual j udgment—reflect bo th o bservers’ se nsitivity to the perceptual information and a r esponse criterion. For both i nfants a nd adults, some outcomes such a s fa lling downward (down a slope or over the brink of a cliff ) constitute a highly aversive penalty, whereas other outcomes such as entrapment (e.g., becoming wedged in an overly narrow doorway) constitute a relatively innocuous penalty (Adolph, 1997; Joh, Adolph, Narayanan, & Dietz, 2007; C. F. Palmer, 1987; Warren & W hang, 1987). Factors such a s motivation and fatigue may also play an important role in setting a response criterion. What kind of data would constitute evidence that infants (or animals of any age) can solve the perceptual problem of detecting affordances? One source of evidence is infants’ motor decisions, that is, whether infants match their attempts to perform a t arget action to the conditional probability of success. Obtaining such data requires test paradigms where the challenges are novel, the probability of success varies from trial to trial, and infants are sufficiently motivated to produce a large number of trials while staying on task. If infants’ response c riterion i s t oo l iberal, t hey a re l ikely t o r espond i ndiscriminately, precluding a ssessment of t heir sensitivity to t he a ffordances. If their response criterion is too conservative, they are likely to refuse to participate in the experiment. Fortunately, infants love to p ractice t heir n ewly de veloping m otor sk ills, a nd w ith a b it o f incentive (praise from their caregivers and pieces of dry cereal) they happily produce dozens of t rials over a l engthy s ession. 
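The idea that a motor decision reflects both sensitivity and a response criterion can be sketched as a penalty-weighted choice rule: attempt the action only when the expected benefit of success outweighs the expected cost of failure. The probability and penalty values below are invented for illustration; they are not estimates from the studies discussed here.

```python
def should_attempt(p_success, penalty, benefit=1.0):
    """Attempt only if expected benefit exceeds expected cost of failure.
    A severe penalty (falling over a brink) pushes the criterion toward
    refusal; an innocuous penalty (getting wedged in a narrow doorway)
    tolerates more risk at the same level of perceptual sensitivity."""
    return p_success * benefit > (1.0 - p_success) * penalty

# Same 60% chance of success, different penalties for failure:
print(should_attempt(0.6, penalty=0.5))   # mild penalty: attempt
print(should_attempt(0.6, penalty=10.0))  # falling: refuse
```

The same perceptual estimate thus yields different motor decisions depending on the stakes, which is one reason measured attempt rates reflect a criterion as well as sensitivity.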
When the penalty for error is falling downward, infants' response criterion is sufficiently conservative to assess their perception of affordances for balance and locomotion.

Case Study 1: Avoiding Cliffs and Gaps

An obvious test case to study infants' perception of affordances is a falling off place (E. J. Gibson, 1991). A basic requirement for balance and locomotion is a continuous floor to support the body. A ground surface that terminates in a large drop-off is a cliff. A ground surface that is interrupted by a wide gap is a crevasse. These should be avoided.
The Visual Cliff

Most researchers have studied infants' perception of affordances at the edge of a drop-off on a "visual cliff," rather than a real one. First devised by Gibson and Walk (E. J. Gibson & Walk, 1960; Walk & Gibson, 1961), the apparatus is a large glass table with wooden sides, divided in half by a narrow board. On the "deep" side, a patterned surface lies on the floor far below the glass, creating the illusion of an abrupt drop-off. On the "shallow" side, the patterned surface is placed directly beneath the glass, providing visual information for a continuous ground surface. The glass serves to ensure infants' safety if they venture onto the deep side and the wooden sides prevent them from falling off the table onto the floor. Human infants are placed on the centerboard and caregivers encourage them from first one side and then the other (Figure 9.5). Other animals are placed on the centerboard and given several minutes to descend to the side of their choice. Dozens of investigations have yielded intriguing but conflicting findings regarding the role of locomotor experience in avoiding the deep side of the visual cliff. Precocial animals such as infant goats and chicks, who walk moments after birth, avoid the deep side on their first exposure (E. J. Gibson & Walk, 1960; Walk & Gibson, 1961). Similarly, some altricial animals such as rats, whose locomotor skills develop slowly, do not require visual experience with locomotion to avoid the apparent drop-off (Walk, Gibson, & Tighe, 1957). However, other altricial species such as kittens, rabbits, and infant humans crawl right over the edge of the deep side when they first become mobile. The problem is not lack of depth perception because human infants can see the drop-off months before they begin crawling.
A frequently cited study with human infants showed that crawling experience predicts avoidance of the apparent drop-off, when testing age is controlled: After only two weeks of crawling experience, 65% of infants crawled over the deep side of the visual cliff, but after six weeks of crawling experience, the percentage dropped to 35% of infants (Bertenthal, Campos, & Barrett, 1984). However, other cross-sectional studies found the opposite results, where the
Figure 9.5 Crawling infant on the visual cliff. On the deep side, safety glass covered a large drop-off. Caregivers beckoned to infants from the far side of the obstacle. Adapted from Psychological Monographs: General and Applied, 75(15, Whole No. 519), by R. D. Walk & E. J. Gibson, "A comparative and analytical study of visual depth perception," 1961, with permission from the American Psychological Association.
duration of crawling experience, controlling for age, was positively related to crossing over the deep side (Richards & Rader, 1981, 1983). Longitudinal data are inconclusive because infants learn from repeated testing that the safety glass provides support for locomotion and they become more likely to cross over (Campos, Hiatt, Ramsay, Henderson, & Svejda, 1978; Eppler, Satterwhite, Wendt, & Bruce, 1997; Titzer, 1995). Some evidence suggests that locomotor experience is posture-specific: The same crawlers who avoided the drop-off when tested on their hands and knees crossed over the cliff when tested moments later in an upright posture in a wheeled baby-walker (Rader, Bausano, & Richards, 1980). Other evidence suggests that locomotor experience generalizes from an earlier developing perception-action system to a later developing one: 12-month-old walking infants avoided the apparent drop-off after only two weeks of walking experience appended to their several weeks of crawling experience (Witherington, Campos, Anderson, Lejeune, & Seah, 2005). Although it is the most famous test paradigm, the visual cliff is not optimal. Discrepant findings may result from methodological problems
stemming from the design of the apparatus. The safety glass presents mixed messages: The visual cliff looks dangerous, but feels safe. In fact, because human infants quickly learn that the apparatus is perfectly safe, avoidance attenuates and they can only be tested with one trial on the deep side. In addition, the safety glass may lead to underestimation of infants' errors. Sometimes infants lean forward onto the glass with their hands or start onto the glass and then retreat (e.g., Campos et al., 1978); if the glass were not there, they would have fallen. Moreover, the dimensions of the visual cliff are fixed, so that researchers cannot test the accuracy of infants' responses or ask whether infants scale their locomotor decisions to the size of the challenge. The heights of the shallow and deep sides lie far on the tails of the affordance function, rather than near the inflection of the curve.

Adjustable Gaps: A New Psychophysical Approach

Circumventing the methodological problems on the visual cliff required a new approach. Thus, we devised a new "gaps" paradigm (Adolph, 2000). Rather than a glass table with wooden sides, 9.5-month-old infants were observed as they approached a deep crevasse between two platforms (Figure 9.6A-B). We removed the safety glass so that visual and haptic information would be in agreement rather than in conflict, and perceptual errors would lead to the real consequence of falling. To ensure infants' safety, a highly trained experimenter followed alongside infants to provide rescue if they began to fall. Fortunately, "spotting" infants did not produce the same problem as the safety glass: Infants did not simply learn to rely on the experimenter to catch them, and avoidance did not attenuate over trials. The dimensions of the apparatus were adjustable, rather than fixed, so that we could assess the accuracy of infants' motor decisions.
Moving one platform along a calibrated track varied the gap width from 0 to 90 cm in 2 cm increments. Because the depth of the crevasse between the two platforms was always the same, the penalty for falling was identical across all risky gap sizes. The largest gap width had approximately the same dimensions as the deep side of the standard visual cliff. Most critical, to determine the role of experience and the specificity of learning across developmental transitions in perception-action systems, each infant was tested in an earlier developing sitting posture (M sitting experience = 15 weeks) and a later developing crawling
Figure 9.6 Infants in (A) sitting and (B) crawling postures at the edge of an adjustable gap in the surface of support. Caregivers (not shown) offered lures from the far side of the gap. An experimenter (shown) followed alongside infants to ensure their safety. Reprinted from Psychological Science, by K. E. Adolph, "Specificity of learning: Why infants fall over a veritable cliff," pp. 290–295, 2000, with permission from Blackwell Publishers.
posture (M crawling experience = 6 weeks). (Note that the average duration of experience in the less familiar crawling posture was similar to the duration of crawling experience in the more experienced group in Bertenthal and colleagues' (1984) study with infants on the visual cliff.) As shown in Figures 9.6A-B, the task for the infants was
the same in both postures—to lean forward over the gap to retrieve an attractive lure. Caregivers stood at the far side of the landing platform and encouraged infants' efforts. Rather than testing infants with only one trial, infants were observed over dozens of trials. We used a modified psychophysical staircase procedure to estimate each infant's affordance threshold, and then assessed infants' motor decisions by presenting safe and risky gaps, relative to the threshold increment. Motor decisions were determined based on an attempt rate: the number of successful plus failed attempts to span the gap divided by the sum of successful attempts, failed attempts, and refusals to attempt the action. (The inverse measure, avoidance rate, yields the same information.) Adaptive motor decisions would be evidenced by high attempt rates on gaps smaller than the affordance threshold, and low attempt rates on gaps larger than the threshold. Two experiments confirmed the role of experience with balance and locomotion in detecting affordances for crossing a crevasse (Adolph, 2000). When tested in their experienced sitting posture, infants gauged precisely how far forward they could lean without falling into the hole. They scaled their motor decisions to their body size and sitting proficiency so that their attempt rates matched the conditional probability of success. None of the infants attempted the largest 90-cm gap. Like the visual cliff, the gaps paradigm was relatively novel: Typically, infants are not encouraged to cross a deep precipice on their own. Thus, adaptive motor decisions in the sitting posture attest to transfer of learning from everyday experiences to a novel environmental challenge. However, when facing the crevasse in their less familiar crawling posture, the same infants grossly overestimated their abilities and fell into impossibly wide gaps on trial after trial.
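The attempt-rate measure defined above is straightforward to compute. A minimal sketch, with trial tallies invented purely for illustration (they are not data from the gaps experiments):

```python
def attempt_rate(successes, failures, refusals):
    """Attempts (successful + failed) divided by all trials
    (successes + failures + refusals), per the definition in the text.
    The complementary avoidance rate is 1 - attempt_rate."""
    total = successes + failures + refusals
    if total == 0:
        raise ValueError("no trials at this gap increment")
    return (successes + failures) / total

# Hypothetical tallies at one risky gap width, in two postures:
sitting = attempt_rate(successes=0, failures=1, refusals=9)   # 0.1
crawling = attempt_rate(successes=0, failures=8, refusals=2)  # 0.8
print(sitting, crawling)
```

On this measure, adaptive decisions show up as attempt rates near 1.0 on gaps smaller than the threshold and near 0.0 on gaps larger than it; a high attempt rate on a risky gap (as in the hypothetical crawling tally) signals overestimation.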
Infants showed a higher proportion of errors at each risky gap size in the less experienced crawling posture compared with the more experienced sitting posture. For example, in the crawling posture, 33% of infants in each of the two experiments fell over the brink of the largest 90-cm gap on every trial, but none of the infants attempted the 90-cm gap in the sitting posture. It is unlikely that the difference between sitting and crawling reflects different response criteria. Conditions were blocked and counterbalanced, and there were no differences between infants who experienced repeated rescues by the experimenter when tested first in the crawling posture and infants who were tested first in the sitting posture. It is also unlikely that the difference between sitting
and crawling reflected differences in visual or mechanical access to the gap. In both postures, we confirmed from videotape that infants made visual contact with the gap at the start of each trial; indeed, the apparatus and lure were arranged so that the gap was directly in infants' line of sight. In addition, infants spontaneously explored the gap in both sitting and crawling positions by leaning forward while extending and then retracting their arm into the gap. Rather, the disparity between attempt rates in the sitting and crawling postures indicates that learning does not transfer across developmental transitions in perception-action systems. Apparently, what infants know about keeping balance while sitting does not help them to gauge affordances in the same situation when learning to crawl. The new methodological approach of testing infants with multiple trials at a range of safe and risky increments also revealed something that could not be assessed with one trial on each side of the visual cliff—the adaptiveness of infants' motor decisions relative to their individual affordance thresholds and over variations in the perceptual information. The proportion of infants (33%) who attempted to span the 90-cm gap width in a crawling posture replicated the results from Bertenthal and colleagues' (1984) study of infants on the visual cliff, where 35% of infants in the six-week-experience group crawled over the deep side. However, trials at smaller but equally risky gap increments (the depth of the crevasse was always the same) showed that six weeks of crawling experience was not sufficient to ensure even a 33% error rate when faced with a deep crevasse. On smaller, risky gap increments, attempt rates increased sharply in the crawling posture, indicating that infants' motor decisions were more adaptive farther out on the tails of the affordance function and less adaptive on risky gap increments closer to the threshold.
The same pattern held for the experienced sitting posture, but the attempt ratio was lower at each risky gap width. Similarly, other researchers have found specificity of learning between sitting and crawling postures when testing infants with barriers in their path. In a longitudinal study, infants reached around a barrier to retrieve a target object while tested in a sitting posture several weeks before they demonstrated the ability to retrieve the object while tested in a crawling posture (Lockman, 1984). In a follow-up cross-sectional study, 10- and 12-month-olds were more successful at retrieving objects from behind a barrier when they were sitting than when they had to execute the detour by crawling (Lockman & Adams, 2001).
The Growing Body in Action
In a variant of the gaps paradigm, infants also showed posture-specific learning across the developmental transition between two upright postures—cruising and walking (Adolph, 2005; Leo, Chiu, & Adolph, 2000). Infants were tested at 11 months of age, when they averaged eight weeks of experience cruising sideways along furniture, but they had not yet begun to walk independently. Using the psychophysical procedure to estimate affordance thresholds and assess motor decisions, infants were tested in two conditions. A "handrail" condition was relevant to infants' experience maintaining balance with the arms in cruising: There was an adjustable gap (0 to 90 cm) in the handrail they held for support and a continuous floor beneath their feet (Figure 9.7A). A "floor" condition was relevant for maintaining balance with the legs in walking: The floor had an adjustable gap (0–90 cm) and the handrail was continuous (Figure 9.7B). Because the handrail may have blocked infants' view of the floor near their feet, an experimenter called infants' attention to the gap in both conditions at the start of each trial to ensure that they saw the size of the obstacle.

In the handrail condition, infants correctly gauged how far they could stretch their arms to cruise over the gap in the handrail. Attempt rates were scaled to the actual affordance function: Attempt rates were high on safe gaps smaller than their thresholds and decreased sharply on risky gaps larger than their thresholds. However, in the floor condition, the same infants showed grossly inaccurate motor decisions. They attempted safe and risky gaps alike, despite viewing the gap in the floor at the start of each trial. Every infant showed higher error rates on risky gap increments in the floor condition compared with the handrail condition, and 41% of infants attempted to cruise into the 90-cm gap in the floor.
Newly walking 11-month-olds erred in both conditions, as if they did not know how many steps they could manage between gaps in the handrail and did not realize that they needed a solid floor to support their bodies. In summary, although cruising and walking share a common upright posture, practice cruising does not appear to teach infants how to detect affordances for walking.
Case Study 2: Navigating Slopes

A series of longitudinal and cross-sectional experiments with infants on slopes provides further evidence that infants must learn to detect
Figure 9.7 Cruising infants with (A) an adjustable gap in the handrail used for manual support and a continuous floor beneath the feet, and (B) an adjustable gap in the floor and a continuous handrail to hold for support. Caregivers (not shown) encouraged infants from the end of the landing platform. An experimenter (shown) followed alongside infants to ensure their safety. Reprinted from A. Woodward & A. Needham (Eds.), Learning and the infant mind, by K. E. Adolph & A. S. Joh, "Multiple learning mechanisms in the development of action," in press, with permission from Oxford University Press.
affordances for balance and locomotion, and that learning occurs in the context of dramatic developmental change (for reviews, see Adolph, 2002, 2005; Adolph & Berger, 2006; Adolph & Eppler, 2002). Like cliffs and gaps, navigating over slopes is relatively novel for infants. On parents' reports, most infants have never crawled or walked over steep slopes or used a playground slide on their own.
Compared with cliffs and gaps, slopes have a continuous visible and tangible surface between the brink and the edge of the obstacle rather than an abrupt discontinuity. Thus, sloping ground provides a unique test case for assessing transfer of learning from everyday experience to a novel environmental challenge.

We devised an adjustable sloping walkway by connecting a middle sloping ramp to two flat platforms with piano hinges (Figure 9.8). One platform was stationary. Raising and lowering the second platform allowed the degree of slant to be adjusted from 0° to 90° in 2° increments. The psychophysical method was used to determine an affordance threshold for each infant for crawling or walking. Safe and risky slopes were presented relative to the threshold to assess infants' motor decisions. Caregivers stood at the top or bottom of the walkway and encouraged their infants to come up or down. An experimenter followed alongside infants to ensure their safety.

As in the gap studies, the penalty for errors on downhill slopes was falling downward, a consequence that infants find aversive. Many falls during descent were quite dramatic—with the experimenter catching infants midair, spread-eagled like Superman, on the slope
Figure 9.8 Infant descending an adjustable slope. Caregivers (not shown) encouraged infants from the end of the landing platform. An experimenter (shown) followed alongside infants to ensure their safety. Reprinted from the Monographs of the Society for Research in Child Development, 62 (3, Serial No. 251), by K. E. Adolph, "Learning in the development of infant locomotion," 1997, with permission from Blackwell Publishers.
as they tumbled downward headfirst or on their backs as infants' feet slid out from under themselves. In contrast, falling while crawling or walking uphill is less aversive. When infants fell, they could lean forward and safely catch themselves with their hands.
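The psychophysical procedure used to estimate each infant's affordance threshold can be sketched as a simple adaptive staircase. This is an illustrative sketch, not the authors' actual protocol: the function name, step sizes, and stopping rule are my own assumptions, and `succeeds(x)` stands in for the infant's observed outcome at increment x (degrees of slope or centimeters of gap).

```python
def estimate_threshold(succeeds, start=10.0, step=8.0,
                       floor=0.0, ceiling=90.0, reversals_needed=6):
    """Estimate an affordance threshold with a simple up-down staircase.

    succeeds(x) -> bool reports the outcome at increment x (e.g., degrees
    of slope). The increment moves up after a success and down after a
    failure; the step halves at each reversal of direction, and the
    threshold estimate is the mean of the reversal points.
    """
    x, last, reversals = start, None, []
    for _ in range(200):                      # safety cap on trial count
        outcome = succeeds(x)
        if last is not None and outcome != last:
            reversals.append(x)               # direction reversed here
            step /= 2.0
            if len(reversals) == reversals_needed:
                break
        last = outcome
        x = min(ceiling, x + step) if outcome else max(floor, x - step)
    return sum(reversals) / len(reversals)
```

Run against a deterministic rule such as `lambda x: x <= 24`, the staircase converges near 24; with a real infant, safe and risky increments would then be presented relative to the estimated threshold.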
Crawling and Walking Down Slopes

In a longitudinal study (Adolph, 1997), infants were observed going up and down slopes every three weeks, from their first week of crawling until several months after they began walking. Infants' performance on uphill slopes reflected their indifference to the penalty for errors. Attempt rates were uniformly high on risky slopes at each week of crawling and walking. Infants launched themselves at impossibly steep slopes and made heroic efforts to reach the top platform, but failed repeatedly over dozens of trials.

Infants' performance on downhill slopes was a very different story. They adopted more conservative response criteria, and it was possible to track weekly changes in their motor decisions. In their first week of crawling, infants plunged headfirst down impossibly risky slopes and required rescue by the experimenter. They fell down 75% of risky slopes. With each week of locomotor experience, motor decisions gradually zeroed in on infants' actual ability until attempts to descend closely matched the probability of success. After 22 weeks of crawling, attempt rates had decreased to 0.10 on risky slopes. Clearly, infants were not simply learning that the experimenter would catch them, because they became more reluctant to attempt risky slopes, rather than more reckless.

The steady decrease in errors points to impressive transfer because infants' affordance thresholds, crawling proficiency on flat ground, and body dimensions changed from week to week. A risky slope one week was perfectly safe the next week when crawling skill had improved. Safe slopes for belly crawling were impossibly risky a week or so later when infants began crawling on hands and knees.

Despite months of testing and hundreds of trials descending slopes, infants showed no evidence of transfer from crawling to walking.
Errors were just as high in infants' first week of walking as in their first week of crawling, and learning was no faster the second time around. Learning was so posture-specific that new walkers showed dissociations in their motor decisions between consecutive
trials, tested alternately in their old, familiar crawling posture and in their unfamiliar upright posture. When placed prone at the top of a 36° slope, new walkers behaved like experienced crawlers and avoided descent. Moments later, when placed upright at the top of the same slope, they walked straight over the brink and fell. When placed prone again, they avoided, and so on. As in crawling, errors were always highest on risky slopes closest to infants' affordance thresholds and lowest on slopes farthest out on the tail of the affordance function. Moreover, infants in a control group who were tested only in their first and tenth weeks of crawling and in their first week of walking were indistinguishable from infants who had experienced hundreds of trials on slopes. Apparently, learning transferred from everyday experience on flat ground to detecting affordances of slopes. Slope experience was not required.

Cross-sectional data replicated the findings from the longitudinal observations. Eight- to 9-month-old infants with 6.5 weeks of crawling experience, on average, attempted to crawl down slopes far beyond their abilities (Adolph et al., 1993). Eleven-month-old infants, averaging 13 weeks of crawling experience, matched their attempts to crawl to the probability of success (Mondschein, Adolph, & Tamis-LeMonda, 2000). Likewise, 12-month-old crawlers showed highly adaptive motor decisions for descending slopes. In contrast, 12-month-olds who had just begun walking attempted impossibly steep slopes on repeated trials. Their attempt rates were 0.73 on 50° slopes (Adolph, Joh, Ishak, Lobo, & Berger, 2005). By 14 months of age, when infants averaged 11 weeks of walking experience, attempts to walk were geared to infants' actual abilities (Adolph, 1995).
By 18 months of age, infants were highly experienced walkers, and their motor decisions were even more finely attuned to the affordances for walking down slopes (Adolph et al., 2005). At every age, infants showed more adaptive motor decisions for going down slopes than for going up, indicating that they adopted different response criteria for ascent and descent. Moreover, at every age and for both uphill and downhill, infants showed lower attempt ratios on slopes farthest out on the tail of the affordance function and higher attempt ratios on slopes closest to their affordance threshold. These data suggest that perceptual discrimination of affordances is most difficult in the region bordering the transition from possible to impossible actions. Across ages, perceptual learning reflects a process of gradually gearing motor decisions to the affordance threshold.
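The finding that errors cluster near the affordance threshold can be illustrated with a toy model (my own construction, not the authors'): represent the true probability of success as a steep logistic function of slope and the infant's attempt probability as a shallower logistic around a response criterion. Doomed attempts are then most likely just past the threshold, where attempts are still frequent but success is already unlikely.

```python
import math

def p_success(slope_deg, threshold=30.0, steepness=1.0):
    """True affordance function: success probability falls steeply past threshold."""
    return 1.0 / (1.0 + math.exp((slope_deg - threshold) / steepness))

def p_attempt(slope_deg, criterion=30.0, spread=4.0):
    """Infant's decision rule: a shallower logistic around a response criterion."""
    return 1.0 / (1.0 + math.exp((slope_deg - criterion) / spread))

def error_rate(slope_deg):
    """Chance of an attempt that ends in a fall: P(attempt) * P(fall | attempt)."""
    return p_attempt(slope_deg) * (1.0 - p_success(slope_deg))
```

With the illustrative 30° threshold, the error rate at 32° is many times larger than at 50°: risky slopes just beyond threshold draw many doomed attempts, while slopes far out on the tail draw almost none, matching the attempt-ratio pattern reported above.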
Walking with Weights

By the time infants are tested in the laboratory, they may have had several days to adjust to naturally occurring changes in their body dimensions and motor proficiency. However, temporary changes in bodily propensities, such as changes in the location of the center of mass while carrying a toy, require instant recalibration to a new affordance threshold. The addition of the toy alters infants' functional body dimensions.

Experimental manipulation of infants' body dimensions showed that experienced 14-month-old walking infants (averaging 9 weeks of walking experience) can update their assessment of their own abilities on the fly (Adolph & Avolio, 2000). Infants wore a tightly fitted Velcro vest with removable shoulder packs filled with either lead weights (25% of their body weight) or feather-weight polyfil (Figure 9.9). The lead-weight load made infants' bodies more top-heavy and immaturely proportioned, a hindrance especially while walking down slopes. While carrying the lead-weight loads, infants' affordance thresholds were several degrees shallower than while carrying the feather-weight loads. The load condition changed randomly
Figure 9.9 Infant wearing fitted Velcro vest with removable shoulder packs. Lead-weight or feather-weight loads could be fitted into the shoulder packs at various percentages of infants' body weight. Reprinted from J. Lockman, J. Reiser, & C. A. Nelson (Eds.), Action as an organizer of perception and cognition during learning and development: Symposium on Child Development (Vol. 33), by K. E. Adolph, "Learning to learn in the development of action," pp. 91–122, 2005, with permission from Lawrence Erlbaum Associates.
from trial to trial, meaning that infants would have to detect the different affordance thresholds for the lead- and feather-weight loads at the start of each trial. Indeed, infants recalibrated their judgments of risky slopes to their new, more precarious balance constraints. They correctly treated the same degrees of slope as risky while wearing the lead-weight shoulder packs but as safe while wearing the feather-weight shoulder packs. In both conditions, attempt rates were lowest for slopes farthest out on the tail of the affordance function and highest for slopes closest to the threshold.

Recalibration to the weights attests to impressive transfer of learning on several counts. The slopes were relatively novel—few of the infants had walked over steep slopes prior to participation. Carrying a load attached to the body was novel—infants did not carry backpacks or purses. And the loads decreased infants' walking proficiency and required adjustments in gait patterns just to stay upright and move forward. In fact, walking with smaller loads (15% of body weight) over flat ground causes 14-month-olds to take smaller, slower steps, keep both feet on the ground for longer periods of time, and hold one foot in the air for shorter periods of time (Garciaguirre et al., 2007).

Case Study 3: Spanning Bridges

One of the consequences of developmental changes in infants' motor skills is that aspects of the environment can take on new functional significance. A handrail, for example, is a necessary support for balance and locomotion for cruising infants. Without the handrail (or edge of a piece of furniture), cruisers cannot move in an upright position. During the cruising period of development, the handrail has the same functional status as the floor. After infants begin walking, however, a handrail is unnecessary under normal conditions.
Children and adults typically use handrails only to augment their natural abilities when they are tired or when the conditions are treacherous. The handrail becomes supplemental and functions as a tool.

Using a Handrail

At 16 months of age, most infants are relatively experienced (18 weeks, on average) and proficient walkers. On flat ground, they walk
unsupported, without holding a handrail or a caregiver's hand to augment their balance. Only a few infants have experience using a handrail to climb and descend stairs (Berger, Theuring, & Adolph, 2007). Thus, a novel situation was devised where the use of a handrail would be warranted (Berger & Adolph, 2003). Wooden bridges of various widths (12 cm to 72 cm) spanned a 76-cm-deep precipice over a 74-cm-long gap. The bottom of the precipice was always clearly visible on either side of the bridge, and the length of the bridge required infants to take several steps at a minimum in order to cross. On some trials, a wooden handrail was available for infants to hold on to, and on other trials, the handrail was absent (Figure 9.10). Without the handrail, the narrowest bridges were impossible: Bridges were narrower than the infants' widest dimensions and their dynamic base of support (infants walk with their legs splayed apart and their bodies oscillate from side to side). Walking sideways was not an option because infants cannot keep balance while edging along sideways unsupported. Parents stood at the far side of the precipice offering toys as a lure. An experimenter followed alongside infants to ensure their safety.

As expected, infants perceived different possibilities for walking based on bridge width. More important, infants' perception of affordances was also related to the presence of the handrail. Infants ran straight over the widest bridges, ignoring the handrail when it was available. However, on narrow 12- to 24-cm bridges, infants attempted to walk when the handrail was available and avoided walking when the handrail was removed. To use the handrails, infants turned their bodies sideways and modified their walking gait by inching along with their trailing leg following in the footstep of their leading leg. Falling was rare in both handrail conditions: Only 6% of infants' attempts ended in a fall. Apparently, infants recognized that the handrail offered additional possibilities for upright locomotion by augmenting their balance on narrow bridges.

Figure 9.10 Infant crossing an adjustable bridge. Caregivers (not shown) encouraged infants from the end of the landing platform. An experimenter (shown) followed alongside infants to ensure their safety. Reprinted from Developmental Psychology, 39(3), by S. E. Berger & K. E. Adolph, "Infants use handrails as tools in a locomotor task," pp. 594–605, 2003, with permission from the American Psychological Association.
Wobbly and Wooden Handrails

A follow-up study showed that experienced walking infants take the material substance of the handrail into account (Berger, Adolph, & Lobo, 2005). A pair of wobbly handrails was constructed from rubber and foam. When infants pressed their weight onto the wobbly handrails, the handrails deformed and sagged downward (Figure 9.11). Parents standing on the far side of the landing platform encouraged their 16-month-old walking infants to cross 10- to 40-cm-wide
Figure 9.11 Infant testing a wobbly handrail used for support. Caregivers (not shown) encouraged infants from the end of the landing platform. An experimenter (shown) followed alongside infants to ensure their safety. Reprinted from Child Development, 76, by S. E. Berger, K. E. Adolph, and S. A. Lobo, "Out of the toolbox: Toddlers differentiate wobbly and wooden handrails," pp. 1294–1307, 2005, with permission from Blackwell Publishers.
bridges. A handrail was available on every trial. On some trials, the handrail was built of sturdy wood as in the earlier experiment, and on other trials, the handrail was wobbly rubber or foam. All three handrails looked solid. Thus, to make the distinction between sturdy and wobbly handrails, infants needed to test the handrail before stepping out onto the bridge to determine whether it would give beneath their weight.

On the widest bridges, infants walked straight across regardless of handrail type. But on the 10- to 20-cm bridges, infants were more likely to walk when the handrail was built of sturdy wood than when it was wobbly. Thus, infants did not perceive the handrail as a support for augmenting balance merely because it was a surface stretching from here to there. Rather, the rigidity of the surface was an important source of information for affordances. Infants generated haptic information about the material substance of the handrail by pushing, tapping, squeezing, rubbing, and mouthing the surface before leaving the starting platform.
Learning in Development

What do these case studies of infant locomotion tell us about perceptually guided action? As in research on affordances with adult participants (Kinsella-Shaw, Shaw, & Turvey, 1992; Mark, Jiang, King, & Paasche, 1999; Warren, 1984; Warren & Whang, 1987), the data from research with infants indicate that experienced observers can readily and accurately detect possibilities for motor action. Perceiving whether it is possible to walk through an aperture, step over a gap, climb up a stair, walk over a slope, and so on requires that perceptual information about the environment (aperture width, gap size, stair height, slant) be related to perceptual information about the self (current body dimensions, level of balance control, and so on), and some sources of information may simultaneously specify relevant aspects of both environment and self (e.g., optic flow). Exploratory movements close the perception-action loop by providing information about what to do next based on information for affordances in the current moment (Joh et al., in press; Mark, Baillet, Craver, Douglas, & Fox, 1990). Visual information about upcoming obstacles can elicit more focused information gathering about possibilities for locomotion through touch, postural sway, and testing various strategies (Adolph, Eppler, Marin, Weise, & Clearfield, 2000).
Studies with infants tell us something new. The evidence indicates that the ability to detect affordances requires learning (e.g., Adolph, 2005; Campos et al., 2000; E. J. Gibson & Pick, 2000). Exploiting the full range of possibilities for action, and perhaps more important, curtailing impossible actions in risky situations, does not come automatically with the acquisition of balance and locomotion. Affordances (or lack of them) that seem blatantly obvious to us as adults ("Don't fall over the cliff!") are not obvious to newly mobile infants. Instead, perceptually guided action improves as infants gain experience with their newly acquired perception-action systems. Of course, infants become older as they acquire more experience, so that age-related and experience-related changes are normally confounded. However, the independent effects of age and experience can be assessed statistically and experimentally (in longitudinal studies with experience held constant, and in cross-sectional studies with age held constant), and experience is the stronger predictor (for reviews, see Adolph, 1997; Adolph & Berger, 2006).

A related issue concerns the specificity of learning. Optimally, detecting affordances should be maximally flexible so that learning transfers to novel variations in both the body and environment. Flexibility is critical at every stage of life because local conditions are continually in flux. However, the infancy period provides an especially useful window into learning and transfer, given the rapid, large-scale developmental changes in infants' body dimensions, motor proficiency, and environments. Moreover, the acquisition of new postural control systems over the first two years of life allows researchers to ask about developmental constraints on learning, that is, whether learning transfers from one perception-action system to another.
The evidence points to broad transfer of learning within perception-action systems and narrow specificity of learning between perception-action systems. Experienced infants detected novel affordances for sitting, crawling, cruising, and walking. Their motor decisions reflected the changing relationship between body and environment. They adapted to variations in environmental constraints (gap width, slope, bridge width, and so on), and to naturally occurring and experimentally induced changes in their body dimensions and skills (e.g., lead-weight shoulder packs). In contrast, novice infants showed poorly adapted motor decisions. They attempted impossibly steep slopes and wide gaps and deep cliffs, and appeared to ignore information about the limits of their physical abilities. Novices showed no evidence that learning transfers from an earlier
developing perception-action system to a later developing one. Over weeks of experience, infants' motor decisions gradually homed in on the current constraints on action so that infants' attempts matched the actual affordances for action. Experience with particular types of challenges (such as practice descending slopes) was not required. What infants needed was 10 to 20 weeks of everyday experience with balance and locomotion.
Learning Sets

A candidate learning mechanism that could support flexible transfer within perception-action systems and specificity across perception-action systems is a learning set (Adolph, 2002, 2005; Adolph & Eppler, 2002). The term was coined by Harlow (1949, 1959; Harlow & Kuenne, 1949) to refer to a set of exploratory procedures and strategies that provide the means for generating solutions to novel problems of a particular type. Acquiring the ability to solve novel problems—or "learning to learn," as Harlow put it—is more effective for a broader range of situations than learning particular solutions for familiar problems (Stevenson, 1972). In fact, in a world of continually varying and novel affordances, learning simple facts, cue-consequence associations, and stimulus generalizations would be maladaptive. Yesterday's facts and consequences may no longer hold. Features of the environment may be truly novel. With a learning set, the scope of transfer is limited only by the boundaries of the particular problem space.

Learning sets have three important characteristics. First, learning sets support broad transfer of learning to novel problems. For example, in Harlow's (1949, 1959) model system, adult monkeys learned to solve discrimination problems (e.g., find which of two objects hides a raisin) or oddity problems (which of three objects hides a raisin). Object features varied from one trial block to the next. When the monkeys had acquired a learning set, they demonstrated perfect performance with pairs or trios of completely new shapes. They could figure out in one trial which object features were relevant for that trial block. For discrimination problems, they used a win-stay/lose-shift rule: If the first object they explored covered the raisin, they tracked that object over the trial block; if not, they tracked the other one. For oddity problems, they used an odd-man-out strategy: The
target object differed from the other two on more features. Although monkeys' learning sets involved only simple strategies for operating within a tiny problem space, the learning sets represented something far more powerful than stimulus generalization. Monkeys were solving truly novel instances of a particular type of problem. They had acquired a means for coping with novelty.

The second characteristic of learning sets is that transfer is limited to the size of the problem space. For example, monkeys who had acquired a learning set for discrimination problems could solve new instances of discrimination problems on the first presentation of a new pair of objects. However, experts at discrimination problems were no better than novices when challenged with oddity problems. Similarly, experts on oddity problems behaved like novices on discrimination problems. Discrimination and oddity problems each occupied a different problem space.

The third characteristic of learning sets is that acquisition is extremely slow and difficult. It entails recognition of the problem space (e.g., the pairs of objects in a discrimination problem), identification of the relevant parameters for operating within it (object color or shape rather than spatial position), acquisition of the appropriate exploratory procedures for generating the requisite information (visual and haptic object search), and abstraction of general strategies to solve the problem at hand (win-stay/lose-shift strategy to track the food). Harlow's monkeys required hundreds of trial blocks with different instances of the particular problem type presented over multiple sessions—thousands of trials in total—to acquire a learning set for the small and circumscribed arena of discrimination problems or oddity problems. At first, monkeys searched haphazardly or searched under objects in a particular position.
They had to unlearn overly narrow strategies for searching with the particular pair of objects in the current trial block (e.g., green cylinders hide raisins) before abstracting general strategies that would enable them to find the raisin with novel objects in the next trial block.

How might the learning set framework be applied to perceptual guidance of action? On the learning set account, each new postural control system that arises in the course of motor development operates as a distinct problem space. Compared with the simple discrimination and oddity problems in Harlow's model system, the problem space for postural control is extremely broad. The range of problems is enormous. Every movement on every surface constitutes a different problem for postural control. Every change in infants' body growth and skill level creates different biomechanical constraints on balance and locomotion. Whereas Harlow's monkeys required thousands of trials in a daily regimen to acquire learning sets for simple discrimination and oddity problems, human infants might require hundreds of thousands or millions of "trials" over many weeks or months—epochs of experience—to acquire a learning set for coping with balance and locomotion.

Flexibility of learning arises as infants begin to consolidate a learning set. Experience using a newly developed perception-action system provides infants with the necessary repertoire of exploratory procedures and strategies for solving problems on-line. Information-generating behaviors provide the basis for identifying the critical parameters for the particular problem space and for calibrating the settings of those parameters under changing conditions. Exploratory movements generate the requisite information about the current status of the body relative to the environment and vice versa: visual exploration as infants notice the obstacle, swaying movements as they make their approach, haptic exploration as they probe the obstacle with a limb, and means-ends exploration as they test various alternatives (Adolph & Eppler, 2002; Adolph et al., 2000; Adolph & Joh, in press). Thus, within the boundaries of each problem space, infants can cope with novel and variable changes in affordances for action.

Specificity of learning emerges because different perception-action systems are defined by different sets of critical parameters.
Sitting, crawling, cruising, and walking, for example, involve different parameters for maintaining balance: different regions of permissible postural sway, muscle groups for balance and propulsion, vantage points for viewing the ground, sources of perceptual information about the body's movements, correlations between visual and vestibular input, and so on. Because each problem space has unique parameters, learning does not transfer between perception-action systems. What infants learn about balance and locomotion in a crawling posture, for instance, does not help them to solve the problem of balance and locomotion in a newly acquired walking posture.
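Harlow's win-stay/lose-shift rule for discrimination blocks, described earlier, is simple enough to state as code. This is a minimal sketch under stated assumptions, not a model from the source: it assumes a two-object block with one constantly rewarded object, and the function and argument names are illustrative.

```python
import random

def play_block(rewarded, objects, n_trials=6):
    """Run one discrimination trial block with win-stay/lose-shift.

    The monkey guesses on the first trial, stays with an object that
    just hid the raisin, and shifts to the other object after a miss.
    Returns the number of rewarded trials out of n_trials.
    """
    choice = random.choice(objects)           # trial 1 is a blind guess
    wins = 0
    for _ in range(n_trials):
        if choice == rewarded:
            wins += 1                         # win -> stay with choice
        else:                                 # lose -> shift to the other
            choice = objects[1] if choice == objects[0] else objects[0]
    return wins
```

Whatever shapes fill the block, the rule guarantees at most one unrewarded trial, which is why a monkey that has acquired the learning set looks near-perfect even on completely novel object pairs.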
The Growing Body in Action
311
Everyday Experience

A question that arises about learning mechanisms that require particular types of practice is whether the actual opportunities for learning are consistent with the purported regimen: Are learning sets plausible? In this case, infants would need immense amounts of practice with varied instances of each problem type to acquire a learning set for each perception-action system. The available laboratory data support such a proposition. For example, after six weeks of everyday experience with balance and locomotion, infants avoid a drop-off on the deep side of the visual cliff and a 90-cm gap; after 15 weeks of experience, they match their motor decisions to the physical dimensions of the drop-off (Adolph, 2000; Bertenthal et al., 1984). Infants require 10 weeks of locomotor experience before errors decrease to 50% on risky slopes and 20 weeks before errors decrease to 10% (Adolph, 1997).
More convincing evidence that everyday experience could support infants’ acquisition of learning sets is provided by naturalistic data obtained from daily checklist diaries, “step-counters” in infants’ shoes, and video recordings of infants crawling and walking in everyday environments (Adolph, 2002; Adolph, Robinson, Young, & Gill-Alvarez, in press; Chan, Biancaniello, Adolph, & Marin, 2000; Chan, Lu, Marin, & Adolph, 1999; Garciaguirre & Adolph, 2006; Robinson, Adolph, & Young, 2004). In fact, if researchers could design a practice regimen most conducive to acquiring a learning set, it would resemble infants’ everyday experiences with balance and locomotion.
Infants’ locomotor experience is immense—on the order of the practice regimens used by Olympic athletes and concert pianists to achieve expert performance, rather than the experimental training sessions administered to monkeys in Harlow’s original studies.
In 15 minutes of free play, for example, the average 14-month-old walking infant takes 550 steps, traveling half the distance up the Eiffel Tower (Garciaguirre & Adolph, 2006). In the course of a day, the average toddler takes more than 13,000 steps, traveling the length of 39 American football fields. Crawling experience is less intense, but equally remarkable. At the end of the day, the average crawler has taken more than 3,000 steps (Adolph, 2002).
Practice with balance and locomotion is distributed over time, rather than massed, so that infants have time to recover from fatigue,
renew their motivation to continue, and consolidate what they have learned. Infants must work to keep their bodies in balance during all of their waking hours. However, they are actually on the floor engaged in stance or locomotion for only 5 to 6 hours per day, and most of that time (approximately 80%), they are in stance (Adolph, 2002; Garciaguirre & Adolph, 2006). Thus, locomotion occurs in short bouts of activity, interspersed with longer rest periods. Practice is also distributed across days. On some days, infants sit, crawl, cruise, or walk, but on other days, they do not (Adolph, Robinson et al., 2007; Robinson et al., in press). Distributed practice across days is most noticeable when infants are first acquiring a new perception-action system.
Falling is commonplace. Toddlers fall, on average, 15 times per hour (Garciaguirre & Adolph, 2006). Most falls are not serious enough to make infants cry or to warrant parents’ attention. Surprisingly, infants fall most frequently due to unexpected shifts in their center of mass; they lift their arm or turn their head and tip over. Although infants trip and slip when the elevation and friction of the ground surface are variable, most indoor floor coverings are uniform in elevation and provide sufficient traction. Thus, like Harlow’s monkeys, who learned to ignore local contingencies between object features in a particular trial block, infants may learn to ignore the color and visual texture of floor coverings and focus on changes in elevation and layout that specify varying affordances.
Finally, infants’ practice is variable, rather than blocked, so that infants experience balance and locomotion in various physical and social contexts, typically changing after a few minutes or so.
Infants travel through all of the open rooms in their homes, averaging 6 to 12 different ground surfaces per day (Adolph, 2002; Chan, Biancaniello et al., 2000; Chan, Lu et al., 1999). Of course, the size of infants’ homes, their layout, the furniture and floor coverings, and so on, vary tremendously across infants and geographic regions. Infants reared in Manhattan, for example, are unlikely to have regular access to stairs, and unlikely to crawl and walk over grass, sand, and concrete (Berger et al., 2007). Infants reared in the suburbs have regular exposure to all of those surfaces. In sum, infants’ everyday locomotor experiences resemble a type of practice regimen that would be highly conducive to acquiring a learning set: immense amounts of variable and distributed practice (Gentile, 2000; Schmidt & Lee, 1999).
After Infancy?

Walking is not the endpoint in locomotor development. After infancy, children and adults continue to master new locomotor skills. Unfortunately, developmental research on perceptually guided locomotion is largely limited to infancy, and research with adults has focused on skills acquired during infancy that involve sitting and upright locomotion (e.g., Mark et al., 1990; Warren, 1984; Warren & Whang, 1987). Thus, the developmental story of perceiving affordances after the infancy period is largely speculative.
Most forms of locomotion acquired after infancy are likely to be outgrowths of the perception-action systems acquired during infancy. Like toddlers’ use of a handrail for crossing bridges, specialized skills such as ice skating, skiing, surfboarding, and dancing on pointe are likely to represent enlargement of the problem space for upright locomotion. Like standing and walking, the key pivots for maintaining balance are at the ankles and hips. Driving and kayaking may be enlargements of the problem space for sitting; the key pivot is at the hips. Rock climbing may be an outgrowth of crawling because the body is supported over all four limbs.
Some specialized skills, however, may constitute new perception-action systems, as different from the skills acquired during infancy as crawling is from walking. Skills such as brachiating hand-over-hand along monkey bars, bicycling, and swimming require special environments (water), opportunities (access to water), devices (bicycles, monkey bars), and social supports (supportive parents and peer models) for acquisition and performance.
Like the postural milestones in infant development, bicycling, swimming, and brachiating involve new critical parameters for controlling balance and locomotion: new regions of permissible postural sway, key pivots about which the body rotates, sources of perceptual information, vantage points for viewing the support surface, and so on.
As with infants’ postural milestones, flexibility is imperative for perception-action systems acquired later in life. Children and adults must adapt actions to the changing affordances offered by variations in the environment and in their own bodies and skills; motor decisions should take penalties for error into account. Thus, specialized skills may require acquisition of new learning sets for each new problem space. Ten- to 12-year-old children, for example, showed evidence of flexible motor decisions for bicycling through
traffic intersections in an immersive, interactive, virtual environment (Plumert, Kearney, & Cremer, 2004). While seated on stationary bikes, children varied the likelihood of crossing the simulated road and adjusted their rate of pedaling in accordance with the time gaps presented by faster and slower moving “cars.”
Despite important similarities, the perception-action systems acquired during infancy and those acquired later in life have two important differences. First, age-related changes in cognition, motivation, and social skills may affect motor decisions about affordances. Infants, for example, are not embarrassed to fall, but older children and adults may be. Infants may not explicitly consider consequences, but older children and adults do. Infants cannot use explicit rules (e.g., “look both ways before crossing the street”), but older children and adults can. Moreover, children and adults have years of experience making motor decisions for the perception-action systems acquired during infancy. Possibly, perception-action systems acquired later in life can bootstrap from earlier acquired systems by incorporating heuristic strategies and general information-generating behaviors.
A second important difference is that infants’ perception-action systems are acquired under intense practice regimens. Infants practice sitting, crawling, and walking, for example, during most of their waking hours. Practice is variable (hundreds of “trials” keeping balance on different support surfaces in different physical and social contexts) and distributed (short bouts of activity interspersed with rest periods), gearing infants for broad transfer of learning. In contrast, because the specialized skills acquired after infancy require special opportunities for performance, children accumulate smaller quantities of experience under more limited practice schedules.
Conclusions

I conclude where I began, with how infant locomotion speaks to the larger literature on perception-action coupling. Studies of infant locomotion raise several issues that any theory of perceptually guided action should address. First, motor decisions must be geared to affordances. For motor action to be adaptive, it must take the constraints and propensities of the body and the environment into account. Second, the central problem in detecting and exploiting affordances for
action is the means for coping with novelty and variability. Because the body and environment are always changing, behavioral flexibility is paramount. Third, human infants must learn to cope with novelty and variability. Infants develop new motor abilities before they use their abilities adaptively. With sufficient experience, infants’ motor decisions are as accurate as those of adults. Fourth, situations that call for immense behavioral flexibility require a learning mechanism that supports extremely broad transfer. Learning sets are a likely candidate for such a mechanism. What infants learn when they acquire a learning set are exploratory procedures and strategies for on-line problem solving. As Harlow put it, they are learning to learn. Finally, learning is always nested in the context of developmental change. Changes in infants’ bodies, skills, and environments are rapid and dramatic, but changing affordances are not limited to infancy, and learning surely does not cease after a few months of walking experience. Animals of every age continually learn new ways of detecting possibilities for action at the same time that those possibilities are changing.

References

Adolph, K. E. (1995). A psychophysical assessment of toddlers’ ability to cope with slopes. Journal of Experimental Psychology: Human Perception and Performance, 21, 734–750.
Adolph, K. E. (1997). Learning in the development of infant locomotion. Monographs of the Society for Research in Child Development, 62(3, Serial No. 251).
Adolph, K. E. (2000). Specificity of learning: Why infants fall over a veritable cliff. Psychological Science, 11, 290–295.
Adolph, K. E. (2002). Learning to keep balance. In R. Kail (Ed.), Advances in child development and behavior (Vol. 30, pp. 1–30). Amsterdam: Elsevier Science.
Adolph, K. E. (2005). Learning to learn in the development of action. In J. Lockman & J. Rieser (Eds.), Action as an organizer of learning and development: The 32nd Minnesota Symposium on Child Development (pp. 91–122). Hillsdale, NJ: Erlbaum.
Adolph, K. E., & Avolio, A. M. (2000). Walking infants adapt locomotion to changing body dimensions. Journal of Experimental Psychology: Human Perception and Performance, 26, 1148–1166.
Adolph, K. E., & Berger, S. E. (2005). Physical and motor development. In M. H. Bornstein & M. E. Lamb (Eds.), Developmental science: An advanced textbook (5th ed., pp. 223–281). Mahwah, NJ: Erlbaum.
Adolph, K. E., & Berger, S. E. (2006). Motor development. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Vol. 2. Cognition, perception, and language (6th ed., pp. 161–213). New York: Wiley.
Adolph, K. E., & Eppler, M. A. (2002). Flexibility and specificity in infant motor skill acquisition. In J. W. Fagen & H. Hayne (Eds.), Progress in infancy research (Vol. 2, pp. 121–167). Mahwah, NJ: Erlbaum.
Adolph, K. E., Eppler, M. A., & Gibson, E. J. (1993). Development of perception of affordances. In C. K. Rovee-Collier & L. P. Lipsitt (Eds.), Advances in infancy research (Vol. 8, pp. 51–98). Norwood, NJ: Ablex.
Adolph, K. E., Eppler, M. A., Marin, L., Weise, I. B., & Clearfield, M. W. (2000). Exploration in the service of prospective control. Infant Behavior and Development, 23, 441–460.
Adolph, K. E., & Joh, A. S. (in press). Multiple learning mechanisms in the development of action. In A. Woodward & A. Needham (Eds.), Learning and the infant mind. New York: Oxford University Press.
Adolph, K. E., Joh, A. S., Ishak, S., Lobo, S. A., & Berger, S. E. (2005, October). Specificity of infants’ knowledge for action. Paper presented at the Cognitive Development Society, San Diego, CA.
Adolph, K. E., Robinson, S. R., Young, J. W., & Gill-Alvarez, F. (in press). What is the shape of developmental change? Psychological Review.
Adolph, K. E., Vereijken, B., & Denny, M. A. (1998). Learning to crawl. Child Development, 69, 1299–1312.
Adolph, K. E., Vereijken, B., & Shrout, P. E. (2003). What changes in infant walking and why. Child Development, 74, 474–497.
Berger, S. E., & Adolph, K. E. (2003). Infants use handrails as tools in a locomotor task. Developmental Psychology, 39, 594–605.
Berger, S. E., Adolph, K. E., & Lobo, S. A. (2005). Out of the toolbox: Toddlers differentiate wobbly and wooden handrails. Child Development, 76, 1294–1307.
Berger, S. E., Theuring, C. F., & Adolph, K. E. (2007). How and when infants learn to climb stairs. Infant Behavior and Development, 30, 36–49.
Bernstein, N. (1996). On dexterity and its development. In M. L. Latash & M. T. Turvey (Eds.), Dexterity and its development (pp. 3–244). Mahwah, NJ: Erlbaum.
Bertenthal, B. I., Campos, J. J., & Barrett, K. C. (1984). Self-produced locomotion: An organizer of emotional, cognitive, and social development in infancy. In R. N. Emde & R. J. Harmon (Eds.), Continuities and discontinuities in development (pp. 175–210). New York: Plenum Press.
Bertenthal, B. I., & Clifton, R. K. (1998). Perception and action. In D. Kuhn & R. S. Siegler (Eds.), Handbook of child psychology: Vol. 2. Cognition, perception, and language (5th ed., pp. 51–102). New York: Wiley.
Bly, L. (1994). Motor skill acquisition in the first year. San Antonio, TX: Therapy Skill Builders.
Bril, B., & Breniere, Y. (1992). Postural requirements and progression velocity in young walkers. Journal of Motor Behavior, 24, 105–116.
Bril, B., & Ledebt, A. (1998). Head coordination as a means to assist sensory integration in learning to walk. Neuroscience and Biobehavioral Reviews, 22, 555–563.
Campos, J. J., Anderson, D. I., Barbu-Roth, M. A., Hubbard, E. M., Hertenstein, M. J., & Witherington, D. C. (2000). Travel broadens the mind. Infancy, 1, 149–219.
Campos, J. J., Hiatt, S., Ramsay, D., Henderson, C., & Svejda, M. (1978). The emergence of fear on the visual cliff. In M. Lewis & L. Rosenblum (Eds.), The development of affect (pp. 149–182). New York: Plenum.
Chan, M. Y., Biancaniello, R., Adolph, K. E., & Marin, L. (2000, July). Tracking infants’ locomotor experience: The telephone diary. Poster presented to the International Conference on Infant Studies, Brighton, England.
Chan, M. Y., Lu, Y., Marin, L., & Adolph, K. E. (1999). A baby’s day: Capturing crawling experience. In M. A. Grealy & J. A. Thompson (Eds.), Studies in perception and action (Vol. 5, pp. 245–249). Mahwah, NJ: Erlbaum.
Clark, A. (1997). Being there: Putting brain, body, and world together again. Cambridge, MA: MIT Press.
de Vries, J. I. P., Visser, G. H. A., & Prechtl, H. F. R. (1982). The emergence of fetal behaviour. I. Qualitative aspects. Early Human Development, 7, 301–322.
Eppler, M. A., Satterwhite, T., Wendt, J., & Bruce, K. (1997). Infants’ responses to a visual cliff and other ground surfaces. In M. A. Schmuckler & J. M. Kennedy (Eds.), Studies in perception and action (Vol. 4, pp. 219–222). Mahwah, NJ: Erlbaum.
Franchak, J. M., & Adolph, K. E. (2007, May). Perceiving changing affordances for action: Pregnant women walking through doorways. Paper presented at the Meeting of the Vision Sciences Society, Sarasota, FL.
Frankenburg, W. K., & Dodds, J. B. (1967). The Denver Developmental Screening Test. Journal of Pediatrics, 71, 181–191.
Freedland, R. L., & Bertenthal, B. I. (1994). Developmental changes in interlimb coordination: Transition to hands-and-knees crawling. Psychological Science, 5, 26–32.
Garciaguirre, J. S., & Adolph, K. E. (2006, June). Infants’ everyday locomotor experience: A walking and falling marathon. Paper presented to the International Conference on Infant Studies, Kyoto, Japan.
Garciaguirre, J. S., Adolph, K. E., & Shrout, P. E. (2007). Baby carriage: Infants walking with loads. Child Development, 78, 664–680.
Gentile, A. M. (2000). Skill acquisition: Action, movement, and neuromotor processes. In J. Carr & R. Shepard (Eds.), Movement science: Foundations for physical therapy in rehabilitation (2nd ed., pp. 111–187). New York: Aspen Press.
Gibson, E. J. (1982). The concept of affordances in development: The renascence of functionalism. In W. A. Collins (Ed.), The concept of development: The Minnesota symposia on child psychology (Vol. 15, pp. 55–81). Mahwah, NJ: Erlbaum.
Gibson, E. J. (1987). Introductory essay: What does infant perception tell us about theories of perception? Journal of Experimental Psychology: Human Perception & Performance, 13, 515–523.
Gibson, E. J. (1991). An odyssey in learning and perception. Cambridge, MA: MIT Press.
Gibson, E. J., & Pick, A. D. (2000). An ecological approach to perceptual learning and development. New York: Oxford University Press.
Gibson, E. J., & Walk, R. D. (1960). The “visual cliff.” Scientific American, 202, 64–71.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.
Harlow, H. F. (1949). The formation of learning sets. Psychological Review, 56, 51–65.
Harlow, H. F. (1959). Learning set and error factor theory. In S. Koch (Ed.), Psychology: A study of a science (pp. 492–533). New York: McGraw-Hill.
Harlow, H. F., & Kuenne, M. (1949). Learning to think. Scientific American, 181, 3–6.
Joh, A. S., Adolph, K. E., Narayanan, P. J., & Dietz, V. A. (2007). Gauging possibilities for action based on friction underfoot. Journal of Experimental Psychology: Human Perception and Performance, 33, 1145–1157.
Johnson, M. L., Veldhuis, J. D., & Lampl, M. (1996). Is growth saltatory? The usefulness and limitations of frequency distributions in analyzing pulsatile data. Endocrinology, 137, 5197–5204.
Kinsella-Shaw, J. M., Shaw, R., & Turvey, M. T. (1992). Perceiving “walk-on-able” slopes. Ecological Psychology, 4, 223–239.
Lampl, M. (1992). Further observations on diurnal variation in standing height. Annals of Human Biology, 19, 87–90.
Lampl, M. (1993). Evidence of saltatory growth in infancy. American Journal of Human Biology, 5, 641–652.
Lampl, M., Ashizawa, K., Kawabata, M., & Johnson, M. L. (1998). An example of variation and pattern in saltation and stasis growth dynamics. Annals of Human Biology, 25, 203–219.
Lampl, M., & Jeanty, P. (2003). Timing is everything: A reconsideration of fetal growth velocity patterns identifies the importance of individual and sex differences. American Journal of Human Biology, 15, 667–680.
Lampl, M., & Johnson, M. L. (1993). A case study in daily growth during adolescence: A single spurt or changes in the dynamics of saltatory growth? Annals of Human Biology, 20, 595–603.
Lampl, M., Johnson, M. L., & Frongillo, E. A. (2001). Mixed distribution analysis identifies saltation and stasis growth. Annals of Human Biology, 28, 403–411.
Lampl, M., Veldhuis, J. D., & Johnson, M. L. (1992). Saltation and stasis: A model of human growth. Science, 258, 801–803.
Leo, A. J., Chiu, J., & Adolph, K. E. (2000, July). Temporal and functional relationships of crawling, cruising, and walking. Poster presented at the International Conference on Infant Studies, Brighton, UK.
Lockman, J. J. (1984). The development of detour ability during infancy. Child Development, 55, 482–491.
Lockman, J. J., & Adams, C. D. (2001). Going around transparent and grid-like barriers: Detour ability as a perception-action skill. Developmental Science, 4, 463–471.
Mark, L. S., Baillet, J. A., Craver, K. D., Douglas, S. D., & Fox, T. (1990). What an actor must do in order to perceive the affordance for sitting. Ecological Psychology, 2, 325–366.
Mark, L. S., Jiang, Y., King, S. S., & Paasche, J. (1999). The impact of visual exploration on judgments of whether a gap is crossable. Journal of Experimental Psychology: Human Perception and Performance, 25, 287–295.
Mondschein, E. R., Adolph, K. E., & Tamis-LeMonda, C. S. (2000). Gender bias in mothers’ expectations about infant crawling. Journal of Experimental Child Psychology, 77, 304–316.
Moore, K. L., & Persaud, T. V. N. (1998). The developing human: Clinically oriented embryology (6th ed.). Philadelphia: W. B. Saunders.
Noonan, K. J., Farnum, C. E., Leiferman, E. M., Lampl, M., Markel, M. D., & Wilsman, N. J. (2004). Growing pains: Are they due to increased growth during recumbency as documented in a lamb model? Journal of Pediatric Orthopedics, 24, 726–731.
Ounsted, M., & Moar, V. A. (1986). Proportionality changes in the first year of life: The influence of weight for gestational age at birth. Acta Paediatrica Scandinavia, 75, 811–818.
Palmer, C. E. (1944). Studies of the center of gravity in the human body. Child Development, 15, 99–163.
Palmer, C. F. (1987, April). Between a rock and a hard place: Babies in tight spaces. Poster presented at the meeting of the Society for Research in Child Development, Baltimore, MD.
Plumert, J. M., Kearney, J. K., & Cremer, J. F. (2004). Children’s perception of gap affordances: Bicycling across traffic-filled intersections in an immersive virtual environment. Child Development, 75, 1243–1253.
Rader, N., Bausano, M., & Richards, J. E. (1980). On the nature of the visual-cliff-avoidance response in human infants. Child Development, 51, 61–68.
Reed, E. S. (1989). Changing theories of postural development. In M. H. Woollacott & A. Shumway-Cook (Eds.), Development of posture and gait across the lifespan (pp. 3–24). Columbia, SC: University of South Carolina Press.
Richards, J. E., & Rader, N. (1981). Crawling-onset age predicts visual cliff avoidance in infants. Journal of Experimental Psychology: Human Perception and Performance, 7, 382–387.
Richards, J. E., & Rader, N. (1983). Affective, behavioral, and avoidance responses on the visual cliff: Effects of crawling onset age, crawling experience, and testing age. Psychophysiology, 20, 633–642.
Robinson, S. R., Adolph, K. E., & Young, J. W. (2004, May). Continuity vs. discontinuity: How different time scales of behavioral measurement affect the pattern of developmental change. Poster presented at the meeting of the International Conference on Infant Studies, Chicago, IL.
Robinson, S. R., & Kleven, G. A. (2005). Learning to move before birth. In B. Hopkins & S. Johnson (Eds.), Advances in infancy research: Vol. 2. Prenatal development of postnatal functions (pp. 131–175). Westport, CT: Praeger.
Schmidt, R. A., & Lee, T. D. (1999). Motor control and learning: A behavioral emphasis (3rd ed.). Champaign, IL: Human Kinetics.
Stevenson, H. W. (1972). Children’s learning. New York: Appleton-Century-Crofts.
Thelen, E., & Fisher, D. M. (1983). The organization of spontaneous leg movements in newborn infants. Journal of Motor Behavior, 15, 353–377.
Titzer, R. (1995, March). The developmental dynamics of understanding transparency. Paper presented at the meeting of the Society for Research in Child Development, Indianapolis, IN.
Vereijken, B., & Adolph, K. E. (1999). Transitions in the development of locomotion. In G. J. P. Savelsbergh, H. L. J. van der Maas, & P. C. L. van Geert (Eds.), Non-linear analyses of developmental processes (pp. 137–149). Amsterdam: Elsevier.
Vereijken, B., & Waardenburg, M. (1996, April). Changing patterns of interlimb coordination from supported to independent walking. Poster presented at the meeting of the International Conference on Infant Studies, Providence, RI.
Walk, R. D., & Gibson, E. J. (1961). A comparative and analytical study of visual depth perception. Psychological Monographs, 75(15, Whole No. 519).
Walk, R. D., Gibson, E. J., & Tighe, T. J. (1957). Behavior of light- and dark-reared rats on a visual cliff. Science, 126, 80–81.
Warren, W. H. (1984). Perceiving affordances: Visual guidance of stair climbing. Journal of Experimental Psychology: Human Perception and Performance, 10, 683–703.
Warren, W. H., & Whang, S. (1987). Visual guidance of walking through apertures: Body-scaled information for affordances. Journal of Experimental Psychology: Human Perception and Performance, 13, 371–383.
Witherington, D. C., Campos, J. J., Anderson, D. I., Lejeune, L., & Seah, E. (2005). Avoidance of heights on the visual cliff in newly walking infants. Infancy, 7, 3.
10 Motor Knowledge and Action Understanding
A Developmental Perspective
Bennett I. Bertenthal and Matthew R. Longo
The prediction of where and how people are going to move has obvious relevance for social interaction. As adults, we are extremely adept at predicting at least some of these behaviors automatically in real time. If, for example, we observe someone reaching in the direction of a half-filled glass on a table, we can predict with relative certainty that the reaching action is directed toward the glass. Often, we can also predict whether the actor intends to drink from the glass or intends to remove the glass, depending on the state of the actor as well as the context surrounding the action.
How do we detect the goals, intentions, and states of others so rapidly, with little if any awareness of these implicit inferences? According to a growing number of social neuroscientists, there are specialized mechanisms in the brain for understanding actions and responding to them. Evidence from neuroimaging studies and neuropsychological studies of normal and brain-damaged patients offers considerable support for this claim (Decety & Sommerville, 2004; Frith & Frith, 2006; Grèzes, Frith, & Passingham, 2004; Pelphrey, Morris, & McCarthy, 2005; Saxe, Xiao, Kovacs, Perrett, & Kanwisher, 2004). The availability of specialized processes suggests that the brain may be intrinsically prepared for this information and, thus, that action understanding should be evident early in development.
Indeed, even neonates show evidence of responding to human behaviors, such as speech, gaze, and touch (Lacerda, von Hofsten, & Heimann, 2001; von Hofsten, 2003). It is not entirely clear, however, that these responses are contingent on perceived actions as opposed to movements. To avoid unnecessary confusion, it behooves us to begin by distinguishing between movements and actions. Human actions comprise a broad range of limb, head, and facial movements, but not all body movements are actions. We reserve this latter description for goal-directed movements: movements that are planned relative to an intrinsic or extrinsic goal prior to their execution. For example, a prototypical goal-directed action might involve extending an arm to grasp a ball resting on a table. By contrast, waving an arm and accidentally hitting a ball does not represent an action. Recent findings suggest that infants understand the goal-directed nature of actions by the second half of the first year and perhaps even earlier (e.g., Csibra, Gergely, Biro, Koos, & Brockbank, 1999; Kiraly, Jovanovic, Prinz, Aschersleben, & Gergely, 2003; Luo & Baillargeon, 2002; Woodward, 1998, 1999; Woodward & Sommerville, 2000).

Two Modes for Understanding Actions

What are the mechanisms that underlie action understanding? By action understanding, we mean the capacity to achieve an internal description or representation of a perceived action and to use it to organize and predict appropriate future behavior. Recent neurophysiologically motivated theories (Jeannerod, 1997, 2001; Rizzolatti, Fogassi, & Gallese, 2001) suggest that there are two mechanisms that might explain how action understanding occurs. The more conventional mechanism involves some form of visual analysis followed by categorization and inference.
This type of analysis can be thought of as progressing through stages of processing comparable to those proposed by Marr (1982). These processes are mediated via the ventral visual pathway of the brain and are independent of the motor system. For example, when we observe a hand grasping a glass, the visual system parses this scene into an object (i.e., the glass) and a moving hand that eventually contacts and grasps the glass. This visual input is recognized and associated with other information about the glass and the actor in order to understand the observed action.
Whereas this first mechanism applies to all visual information, the second mechanism is unique to the processing of actions (and, perhaps, object affordances) and has been referred to as a direct matching or observation-execution matching system (Rizzolatti et al., 2001). With this mechanism, visual representations of observed actions are mapped directly onto our motor representation of the same action; an action is understood when its observation leads to simulation (i.e., representing the responses of others by covertly generating similar subthreshold responses in oneself) by the motor system. Thus, when we observe a hand grasping a glass, the same neural circuit that plans or executes this goal-directed action becomes active in the observer’s motor areas (Blakemore & Decety, 2001; Iacoboni, Molnar-Szakacs, Gallese, Buccino, Mazziotta, & Rizzolatti, 2005; Rizzolatti et al., 2001). It is the “motor knowledge” of the observer that is used to understand the observed goal-directed action via covert imitation. For this reason, knowledge of the action will depend in part on the observer’s specific motor experience with the same action (Calvo-Merino, Glaser, Grèzes, Passingham, & Haggard, 2005; Longo, Kosobud, & Bertenthal, in press).
In contrast to the first visual mechanism, the flow of information between perception and action by direct matching enables more than an appreciation of the surface properties of the perceived actions. Simulation enables an appreciation of the means (i.e., how the body parts are arranged to move) by which the action is executed as well as an appreciation of the goal or the effects of the action. This implies that the observer is able to covertly imitate as well as predict the outcome of an observed action.
Simulation of Actions

Although this latter hypothesis for explaining how we understand actions dates back to the ideomotor theory of James (1890) and Greenwald (1970), direct evidence supporting this view emerged only recently with the discovery of mirror neurons in the ventral premotor cortex of the monkey's brain. These neurons discharge when the monkey performs a goal-directed action as well as when the monkey observes a human or conspecific perform the same or a similar action (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996). Thus, these neurons provide a common internal representation for executing and observing goal-directed action. More recently, mirror neurons were also observed in the inferior parietal lobule, which shares direct connections with the premotor cortex (Fogassi et al., 2005). This latter finding is important for explaining how visually perceived face and body movements are represented by mirror neurons in the premotor cortex. These movements are coded by the superior temporal sulcus, which does not project directly to the mirror neurons in the ventral premotor cortex. Instead, the superior temporal sulcus projects to the inferior parietal lobule, which is connected to the ventral premotor cortex (Rizzolatti & Craighero, 2004). Thus, mirror neurons are innervated by a fronto-parietal circuit in the motor system that also receives visual inputs from the superior temporal sulcus. Human neuroimaging and transcranial magnetic stimulation studies have shown activation of a homologous fronto-parietal circuit during both the observation as well as the imitation of actions (Brass, Zysset, & von Cramon, 2001; Buccino et al., 2001; Decety, Chaminade, Grèzes, & Meltzoff, 2002; Fadiga, Fogassi, Pavesi, & Rizzolatti, 1995; Grèzes, Armony, Rowe, & Passingham, 2003; Iacoboni, Woods et al., 1999; Koski, Iacoboni, Dubeau, Woods, & Mazziotta, 2003; for a review, see Rizzolatti & Craighero, 2004). This neurophysiological evidence is complemented by recent behavioral evidence showing that the observation of actions facilitates or primes responses involving similar actions.
For example, visuomotor priming is observed when grasping a bar in a horizontal or vertical orientation is preceded by a picture of a hand or the observation of an action congruent with the required response (Castiello, Lusher, Mari, Edwards, & Humphreys, 2002; Edwards, Humphreys, & Castiello, 2003; Vogt, Taylor, & Hopkins, 2003). Similarly, response facilitation is observed when the task involves responding to a tapping or lifting finger or an opening or closing hand that is preceded by a congruent (e.g., index finger responds to observation of tapping index finger) as opposed to incongruent stimulus (e.g., index finger responds to observation of tapping middle finger) (Bertenthal, Longo, & Kosobud, 2006; Brass, Bekkering, & Prinz, 2001; Brass, Bekkering, Wohlschlager, & Prinz, 2000; Heyes, Bird, Johnson, & Haggard, 2005; Longo et al., in press).
Prediction of the Effects of Actions

While the preceding evidence suggests that action observation is accompanied by covert imitation of the observed action, additional evidence suggests that action observation also leads to the prediction of the effects or outcome of the action. In a study by Kandel, Orliaguet, and Viviani (2000), participants were shown a point-light tracing of the first letter of a two-letter sequence handwritten in cursive, and they were able to predict the second letter from the observation of the preceding movements. Neuroimaging evidence revealed that a fronto-parietal circuit (associated with the human mirror system) was activated when predicting the next letter from a point-light tracing, but this same circuit was not activated when predicting the terminus of a point-light tracing of a spring-driven ball after it bounced (Chaminade, Meary, Orliaguet, & Decety, 2001). The finding that the fronto-parietal circuit was only activated when predicting the outcome of a human action and not when predicting the outcome of a mechanical event is consistent with other research suggesting that the mirror system is restricted to biological movements (e.g., Kilner, Paulignan, & Blakemore, 2003; Tai, Scherfler, Brooks, Sawamoto, & Castiello, 2004). This last finding implies that the direct matching system is sensitive to the perceived similarity between observed and executed responses, and thus, observers should show an improved ability to predict the effects or outcome of an action based on their own as opposed to others' movements because the match between the observed and executed actions should be greatest when the same person is responsible for both (e.g., Knoblich, 2002; Knoblich, Elsner, Aschersleben, & Metzinger, 2003).
A few recent studies manipulating the "authorship" of the movements find that predictability is indeed greater when predicting the outcomes of self-produced as opposed to other-produced movements (Beardsworth & Buckner, 1981; Flach, Knoblich, & Prinz, 2003; Knoblich & Flach, 2001; Repp & Knoblich, 2004). Why should covert imitation or simulation of movements contribute to predicting their effects? The answer is related to the inertial lags and neural conduction delays that accompany limb movements in the human body. As a consequence of these delays, it is insufficient for movements to be guided by sensory feedback, because such movements would be performed in a jerky and staccato fashion as opposed to being smooth and fluid (Wolpert, Doya, & Kawato, 2003). Thus, the execution of most movements requires prospective control or planning. Based on computational studies, it has been proposed that this planning or control involves an internal model, dubbed a forward model, which predicts the sensory consequences of a motor command (Jordan, 1995; Kawato, Furukawa, & Suzuki, 1987; Miall & Wolpert, 1996; Wolpert & Flanagan, 2001). Presumably, the simulation of a goal-directed action includes the activation of these forward models, enabling prediction as well as covert imitation of the behavior. As motor behaviors are practiced and learned, these forward models become better specified and enable more precise prediction of the ensuing motor commands. According to Wolpert et al. (2003), similar forward models could be used to predict social behaviors, such as facial expression, gaze direction, or posture.

Perception of the Structure of Human Movements

One additional source of evidence suggesting that an observation-execution matching system is functional in human observers derives from their showing greater sensitivity to biological motions that are consistent as opposed to inconsistent with the causal structure of an action. A common technique for studying the perception of human movements involves the depiction of these movements with point-light displays. These displays are created by filming a person in the dark with small lights attached to his or her major joints and head. (An example of six sequential frames from a point-light display is presented in Figure 10.1.) It is also possible to synthesize these nested pendular motions, which is the technique that we have used in much of our previous research (see Bertenthal, 1993, for a review). Johansson (1973) was the first to systematically study the perception of these displays.
He reported that adult observers perceive the human form and identify different actions (e.g., push-ups, jumping jacks, etc.) in displays lasting less than 200 ms, corresponding to about five frames of a film sequence. This finding is very impressive because these displays are devoid of all featural information, such as clothing, skin, or hair. It thus appears that recognition of actions can result exclusively from the extraction of a unique structure from motion-carried information.
Figure 10.1 Six sequentially sampled frames from a moving point-light display depicting a person walking.
In spite of these impressive findings, the recognition of upside-down biological motion displays as depicting a person or the person's direction of gait is significantly impaired (Bertenthal & Pinto, 1994; Pavlova & Sokolov, 2000; Sumi, 1984; Verfaillie, 1993). If recognition were based only on the perception of structure from motion, then the orientation of the human form should not matter. It thus appears that some additional processes involving the causal structure of human movement contribute to the perception of biological motions. For similar reasons, point-light trajectories obeying the kinematics of human movement are perceived as moving at a uniform speed, even though the variations in speed can exceed 200%. By contrast, observers are much more sensitive to speed differences in point-light trajectories that do not conform to the kinematics of human movements (Viviani, Baud-Bovy, & Redolfi, 1997; Viviani & Mounoud, 1990; Viviani & Stucchi, 1992). These latter findings thus suggest that curvilinear movements are not perceived in terms of the actual changes that occur in velocity, but rather in terms of the biological movements that appear smooth and uniform when executed by one or more articulators of the human body.

Although the preceding evidence is consistent with motor knowledge contributing to action understanding, it is difficult to rule out perceptual learning as the principal reason for observers showing differential sensitivity to familiar and unfamiliar biological motions. Indeed, this same confound is present when interpreting much of the previously reviewed evidence suggesting that motor knowledge contributes to the prediction of human actions.
This confound also adds to the challenge of testing whether motor knowledge contributes to infants’ understanding of others’ actions, which is why it is necessary to consider the contributions of visual attention and experience when studying the functionality of the observation-execution matching system during infancy.
Developmental Evidence for an Observation–Execution Matching System

For those unfamiliar with the preceding evidence for an observation–execution matching system, it will be useful to recap how a direct matching system contributes to action understanding. The traditional view is that others' actions are understood via the same perceptual and conceptual processes as are all other visual events. If, however, the perception of actions also activates the motor system (i.e., direct matching), then the specific motor knowledge associated with the perceived actions will contribute to understanding others' actions. The specific mechanism for this understanding is covert imitation or simulation of the observed action. Although the motor response is not overtly executed, the planning for specific movements (i.e., means) as well as the effects or perceptual consequences (i.e., the goal) are automatically activated in the motor cortex. This activation imparts to the observer embodied knowledge of the perceived action (i.e., motor knowledge) without the need to rely on visual experience or logical inferences based on the perceived information. The implication is that we can then understand others' actions, such as lifting a cup, playing the piano, or displaying a disposition, such as displeasure, based on the internalized motor programs available to us for performing the same actions.

In the remainder of this chapter, we will review a series of experiments designed to investigate whether motor knowledge contributes to action understanding by infants. Although it is certainly possible for infants to achieve this understanding from visual experience alone, there exists a developmental advantage for direct matching, because such a mechanism does not necessitate specific conceptual or symbolic knowledge of actions, which demand the development of higher-level cognitive processes.
Thus, the functioning of this system may offer an explanation for the precocious development of young infants' understanding of actions, and social development more generally (Tomasello, 1999). The evidence is divided into three sections. First, we will review a series of experiments showing that infants demonstrate perseverative search errors following observation of someone else's actions. This evidence will be used to support the claim that an observation-execution matching system is functional in infants and that action observation elicits covert imitation. Second, we will review recent
experiments showing that infants visually orient in response to deictic gestures, and that this response is specifically a function of motor knowledge enabling prediction of the effect of an observed action. Finally, we will review evidence showing that infants perceive biological motions depicting familiar but not unfamiliar actions, and that the development of this perceptual skill is at least correlated with the development of their motor skills.
Perseverative Errors in Searching for Hidden Objects

The Piagetian A-not-B error, observed in 8- to 12-month-old infants, is among the most consistently replicable findings in developmental psychology. In this task, infants first search correctly for an object they see hidden in one location (A) on one or more trials, but then continue to search at the A location after the object has been hidden in a new location (B). A number of researchers attribute this search error to the formation of a prepotent response. For example, Smith, Thelen, and colleagues (Smith, Thelen, Titzer, & McLin, 1999; Thelen, Schoner, Scheier, & Smith, 2001) claim that the error arises from the task dynamics of reaching, which causes the motor memory of one reach to persist and influence subsequent reaches. Diamond (1985) argues that one cause of the error is the inability to inhibit a previously rewarded motor response. Zelazo and colleagues (Marcovitch & Zelazo, 1999; Zelazo, Reznick, & Spinazzola, 1998) account for perseverative responses in young children in terms of the relative dominance of a response-based system "activated by motor experience" over a conscious representational system (Marcovitch & Zelazo, 1999, p. 1308). In each of these accounts, a history of reaches to the A location is a crucial aspect of the perseverative response.

If an observation-execution matching system is functional in young infants, then simply observing someone else reach to the same location repeatedly may be sufficient for eliciting this error. This prediction follows from the claim that observing an action will lead to covert imitation of that same action, and thus will be functionally similar to executing the action oneself. We (Longo & Bertenthal, 2006) tested this prediction by administering the standard A-not-B hiding task to a sample of 9-month-old infants. Twenty 9-month-old infants were tested with the canonical reaching task, and twenty
were tested in a condition in which they watched an experimenter reach, but did not reach themselves. Infants were seated on their mother's lap in front of a table covered with black felt and allowed to play with a toy (a rattle or a plastic Big-Bird) for several seconds. Four pretraining trials were administered using procedures similar to those used by Smith et al. (1999). On the first pretraining trial, the toy was placed on top of a covered well. On the second trial, the toy was placed in the well but with one end sticking out of the well. On the third trial, the toy was placed completely in the well but left uncovered. On the final trial, the toy was placed completely in the well and covered. The experimental trials used a two-well apparatus and consisted of three A trials and one B trial (see Figure 10.2). Infants in the reaching condition were allowed to search on all trials. Infants in the looking condition were only allowed to search on the B trial and observed the experimenter recover the object on the A trials. On each trial, the toy was waved and the infant's name was called to attract his or her attention. The experimenter removed the lid with one hand and placed the toy in the well with the other hand. The location (right or left) of the A trials and the experimenter's arm (right or left; coded as which arm the experimenter used to hide the toy) were counterbalanced between infants. Thus, the experimenter's reaches were ipsilateral half of the time (right-handed reach to the location on the right or left-handed reach to the location on the left) and contralateral half of the time (right-handed reach to the location on the left or left-handed reach to the location on the right).

Figure 10.2 Infant searching for toy in one of the two hiding wells (from Longo & Bertenthal, 2006).

For A trials in the reaching condition, the apparatus was slid forward to within the baby's reach following a 3 s delay. If, after 10 s, the infant had not retrieved the toy from the A location, the experimenter uncovered the well and encouraged the infant to retrieve the toy. For A trials in the looking condition, the experimenter did not slide the apparatus toward the infant and recovered the toy following a 3 s delay. The experimenter used the same arm to retrieve the toy as was used to hide the toy. On B trials, in both conditions, the experimenter hid the toy (using the same hand as on the A trials) and then the apparatus was moved to within the infant's reach following a 3 s delay. The dependent measure was whether the infant searched for the hidden toy at the correct B location or reverted to search at the A location where the toy had been previously found.

The results revealed that infants in the reaching condition were significantly more likely to make an error on B trials (15 of 20) than on any of the A trials, the canonical A-not-B error. Infants also made significantly more errors on looking B trials (12 of 20) than infants in the reaching condition made on A trials (see Figure 10.3). These results demonstrate that overt reaching to the A location by the infant is not necessary to elicit the A-not-B error. During training,
Figure 10.3 Percentage of infants searching incorrectly on first, second, and third reaching A trials, reaching B trial, and looking B trial (from Longo & Bertenthal, 2006).
infants in the looking condition had reached four times to a central location using the single-well apparatus, but had never reached to the A location. Still, they were found to "perseverate" on their very first reach using the two-well apparatus. The likelihood of making an error did not differ between the two B conditions.

In sum, these data suggest that looking A trials influenced performance on the B trials. Nonetheless, in order to establish that these responses are truly perseverative, as opposed to simply random, it is necessary to demonstrate errors on significantly more than 50% of the trials. Binomial tests revealed greater than chance perseveration on reaching, but not on looking B trials. Thus, these data are consistent with findings from previous studies showing that infants perseverate after reaching A trials, but the evidence of perseveration following looking A trials was somewhat equivocal.

In the second experiment, we sought to provide more definitive evidence for perseverative search in the looking condition. Marcovitch, Zelazo, and Schmuckler (2002) found that the likelihood of a perseverative search error increased as the number of A trials increased, at least between the range of one to six. If the A-not-B error is indeed a function of similar mechanisms inducing perseverative search in both the reaching and looking conditions, then we would expect search errors in the looking condition to be more robust as the number of A trials increases. Thus, in order to increase the likelihood of finding perseveration at greater than chance levels, the following experiment included six looking A trials instead of three. Thirty 9-month-old infants were tested following the procedures described for the first experiment, except that all infants were tested in the looking condition and there were six instead of three A trials.
The results from this experiment revealed that 70% (21 of 30) of the infants made the A-not-B error, significantly more than half of the sample. This finding suggests that observation of a reaching action is sufficient to elicit perseverative search. As such, these findings are consistent with infants covertly imitating the observed action of searching for the toy in the covered well.

Can these results be explained by other mechanisms? Some researchers suggest that the crucial factor leading to search errors at the B location is not a history of reaching to the A location, but rather a history of visually attending to or planning to reach to the A location (e.g., Diedrich, Highlands, Spahr, Thelen, & Smith, 2001; Munakata, 1997; Ruffman & Langman, 2002). In the current
experiments, greater attention to one location than the other will likely covary with the history of simulated reaching to that location, and thus it is difficult to disambiguate these two interpretations. Although this attentional confound is often a problem when testing the contribution of motor knowledge to performance, converging analyses assessing infants' own reaching behavior were helpful in showing that an attentional interpretation was not sufficient for explaining their search errors.

It is well documented in the literature that infants show an ipsilateral bias in their reaching. Bruner (1969), for example, referred to the apparent inability of young infants to reach across the body midline as the "mysterious midline barrier," arguing that contralateral reaches do not occur before 7 months of age. Contralateral reaching becomes more frequent with age both on reaching tasks during infancy (van Hof, van der Kamp, & Savelsbergh, 2002) and in "hand, eye, and ear" tasks in later childhood (Bekkering, Wohlschläger, & Gattis, 2000; Schofield, 1976; Wapner & Cirillo, 1968). Nevertheless, a clear preference for ipsilateral reaches is consistently observed in early development.

In Experiment 1, infants showed an ipsilateral bias in their reaching. On looking B trials, 90% (18 of 20) of the infants made ipsilateral reaches, which was significantly more than would be expected by chance. Similar ipsilateral biases were observed on the three reaching A trials (81.7%) and the reaching B trials (73.7%, 14 of 19 one-handed reaches). Intriguingly, infants' simulation of the experimenter's actions mirrored their motor bias, as infants in the looking condition were significantly more likely to reach to location A than to location B when the experimenter had reached ipsilaterally (8 of 10), rather than contralaterally (4 of 10) (see Figure 10.4).
This result suggests that infants' responses following observation of the experimenter's reaches were not random, at least for ipsilateral reaches. In Experiment 2, an ipsilateral bias in infants' reaching was again observed, with 85% (23 of 27) of one-handed reaches scored as ipsilateral. Furthermore, this ipsilateral bias was again mirrored by infants' representation of the experimenter's reaching. Perseveration was observed more often than predicted by chance when the experimenter had reached ipsilaterally on the A trials (13 of 15 infants made the error), but not when the experimenter had reached contralaterally (8 of 15 made the error). This difference between conditions was significant (see Figure 10.4).
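The chance comparisons reported above are binomial tests against a 50% baseline. A minimal sketch of such a test, using only the Python standard library, is given below. The counts are those reported in the text, but the exact test the authors used (e.g., one- vs. two-sided) is not specified here, so the one-sided version shown is an illustrative assumption rather than a reconstruction of their analysis.

```python
# Exact one-sided binomial test against chance (p = .5), stdlib only.
# Counts are those reported in the text (Longo & Bertenthal, 2006);
# whether the original analyses were one- or two-sided is an assumption here.
from math import comb

def binomial_p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): probability of k or more
    errors if infants were searching at random between two wells."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

conditions = {
    "Exp. 1, reaching B errors": (15, 20),
    "Exp. 1, looking B errors": (12, 20),
    "Exp. 2, looking B errors": (21, 30),
    "Exp. 2, errors after ipsilateral A reaches": (13, 15),
    "Exp. 2, errors after contralateral A reaches": (8, 15),
}

for label, (k, n) in conditions.items():
    p_val = binomial_p_at_least(k, n)
    verdict = "above chance" if p_val < 0.05 else "not above chance"
    print(f"{label}: {k}/{n}, p = {p_val:.3f} ({verdict})")
```

Run this way, the one-sided test reproduces the pattern described in the text: 15 of 20 and 21 of 30 exceed chance, 13 of 15 ipsilateral errors exceed chance, while 12 of 20 and 8 of 15 do not.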
Figure 10.4 Percentage of infants searching incorrectly following observation of ipsi- and contralateral reaches by the experimenter in the looking condition of Experiments 1 and 2 (from Longo & Bertenthal, 2006).
It is very possible that the infants' ipsilateral bias may have influenced their likelihood of covertly imitating the experimenter. A number of recent studies suggest that simulation by the mirror system is significantly stronger when observed actions are within the motor repertoire of the observer (e.g., Calvo-Merino et al., 2005; Longo et al., in press). If an observer does not possess the motor skill to precisely and reliably perform an action, then he or she cannot simulate it with the same level of specificity as a skilled performer. Since infants have difficulty reaching contralaterally, simulation of observed contralateral reaches should be weaker than that for ipsilateral reaches, or perhaps absent entirely. Thus, if an observation-execution matching mechanism is operative, then infants should perseverate more often following observation of ipsi- rather than contralateral reaches to the A location by the experimenter, as we found.

Although the preceding findings are not meant to discount the relative contributions of attention to response perseveration, at the very least the current evidence appears to challenge the sufficiency of an attentional explanation focused on spatial coding of the hidden object. In particular, it is not at all apparent how such an account would explain why infants showed greater perseveration after observing the experimenter reach ipsilaterally than contralaterally.
Other potential explanations involving, for example, object representations (e.g., Munakata, 1998), have similar difficulty accounting for this effect. By contrast, a direct matching interpretation accounts for this effect in terms of infants' own difficulties with contralateral reaching, which should lead to weaker or absent motor simulation following observed contralateral, compared with ipsilateral, reaches, and consequently less perseveration.

The final experiment in this series was designed to probe whether a simulative response to an observed action is initiated only when the action is performed by a human agent or also when it is performed by a robot or some other mechanical agent. As previously discussed, research with adults reveals that the execution of an action is often facilitated when observing that same action, and impaired when observing a different action (Brass et al., 2001; Bertenthal et al., 2006). It appears, however, that this conclusion applies only when observing actions performed by human agents. For example, Kilner et al. (2003) instructed participants to make vertical or horizontal arm movements while observing either a human or a robot making the same or the opposite arm movements. The results showed that observation of incongruent arm movements interfered significantly with the performance by the observer, but this effect was limited to observation of the human agent. When observing the robot, there was no evidence of an interference effect on the performance of the observer.

In order to test this same question with infants, we modified the testing situation so that the human experimenter would be hidden behind a curtain, but would be able to manipulate the covers and
Thus, t he i nfant w as only able to obs erve t he mechanical claws, a nd even t he experimenter’s ha nds t hat g ripped the mechanical claws were not visible (see Figure 10.5). A total of 30 infants were tested, and the procedure was similar to that used in the preceding experiment. After the training trials, there were six A trials followed by one B trial. From the infants’ perspective, the hiding and finding of the toy was identical to the previous two experiments, except t hat t he ex perimenter w as n ot v isible a nd t wo m echanical claws a ppeared i n h er p lace. Unlike t he r esults f rom t he p revious experiments, t he pers everative er ror w as made b y only 4 0% (12 of 30) of the infants, which was significantly less than occurred in the previous two experiments. Moreover, the likelihood of the error was
essentially the same for ipsilateral and contralateral searches by the claw. Thus, the substitution of a mechanical agent for a human agent reduced the frequency of the perseverative error.

Figure 10.5 Infant observing the toy being hidden by two mechanical claws.

Our interpretation for this finding is that infants' tendency to simulate observed actions is less likely when the action is not performed by a human. We realize, however, that the mechanical agent is less familiar than the human agent, and thus familiarity, per se, may be responsible for these differences in the likelihood of a perseverative search error. In future research, we plan to manipulate whether the mechanical claw is observed as an independent agent or as a tool used by the human agent. If perseverative performance is greater when the mechanical claw is perceived as a tool, the importance of familiarity for explaining perseverative performance will be diminished, because familiarity will remain constant in both the tool and agent conditions.

Although support for this prediction awaits an empirical test, a recent study by Hofer, Hauf, and Aschersleben (2005) suggests that infants distinguish between tools and mechanical agents. In this experiment, 9-month-old infants did not interpret an action performed by a mechanical claw as goal-directed but did interpret the action as goal-directed when the mechanical claw was perceived as a tool. Presumably the tool is interpreted as goal-directed at an earlier age than the mechanical agent because infants perceive it as an extension of the human arm, and thus are better able to simulate and understand its effects.

Until recently, this finding may have seemed at odds with the evidence for mirror neurons in the monkey's brain. When these neurons were first discovered, it was reported that they discharge to goal-directed actions performed by conspecifics or humans, but not to actions performed by tools, such as a pair of pliers (Rizzolatti et al., 2001). Recently, however, Ferrari, Rozzi, and Fogassi (2005) reported identifying a population of neurons in the monkey's ventral premotor cortex that specifically discharges to goal-directed actions executed by tools. Taken together, this evidence suggests that an action performed by a tool perceived as an extension of a human agent will be more likely to induce motor simulation than an action performed by the same tool perceived as a mechanical agent.

Visual Orienting in Response to Deictic Gestures

Joint attention to objects and events in the world is a necessary prerequisite for sharing experiences with others and negotiating shared meanings. As Baldwin (1995, p. 132) puts it: "joint attention simply means the simultaneous engagement of two or more individuals in mental focus on one and the same external thing." A critical component in establishing joint attention involves following the direction of someone else's gaze or pointing gesture (Deak, Flom, & Pick, 2000). Both of these behaviors require that the deictic gesture is interpreted not as the goal, itself, but rather as the means to the goal. When responding to a redirection of gaze or to the appearance of a pointing gesture, the observer does not fixate on the eyes or the hand but rather focuses on the referent of these gestures (Woodward & Guajardo, 2002). In this case the behavior is communicative and the goal is some distal object or event ("there's something over there").
Thus, it is necessary for the observer to predict the referent or the goal of the action from observing its execution by someone else. Until recently, the empirical evidence suggested that infants were unable to follow the direction of a gaze or a point until approximately 9 to 12 months of age (e.g., Corkum & Moore, 1998; Leung & Rheingold, 1981; Scaife & Bruner, 1975). If, however, these behaviors are mediated by an observation-execution matching system, then gaze-following should precede following a pointing gesture because control of eye movements and saccadic localization appear at birth or soon thereafter (von Hofsten, 2003), whereas the extension of the arm and index finger to form a pointing gesture does not appear until approximately 9 months of age (Butterworth, 2003). Indeed, it should be possible to show evidence of gaze-following long before 9 months of age.

This prediction has now been validated. Hood, Willen, and Driver (1998) adapted a method popularized by Posner (1978, 1980) for studying spatial orienting. In a prototypical study, adult subjects are instructed to detect visual targets, which may appear on either side of a fixation point. Their attention can be cued to one side or the other before the target appears (e.g., by a brief but uninformative flash on that side). The consistent finding from this paradigm is that target detection is more rapid on the cued side, because attention is oriented in that direction. It is relevant to note that the attentional cueing preceding the orienting response is covert and does not involve an overt eye movement. As such, this attentional cueing is consistent with a simulation of an eye movement that enables prediction before the visual target appears.

Hood et al. tested 4-month-old infants to determine if they would shift their visual attention in the direction toward which an adult's eyes turn. The direction of perceived gaze was manipulated in a digitized adult face. After infants oriented to blinking eyes focused straight ahead, the eyes shifted to the right or to the left. A key innovation of this paradigm was that the central face disappeared after the eyes were averted to avoid difficulties with infants disengaging their fixation of the face. An attentional shift was measured by the latency and direction of infants' orienting to peripheral probes presented after the face disappeared.
Infants oriented faster and made fewer errors when presented with a probe congruent with the direction of gaze than when presented with an incongruent probe. These findings suggest that young infants interpret the direction of gaze as a cue to shift attention in a specified direction. It is significant to note that the attentional cue was not in the location of the probe, but simply pointed to that location. Thus, faster responding to the spatially congruent cue required that infants understand at some level the meaning of the change in gaze direction in order to predict the future location of the probe. More recent research reveals that gaze following was restricted to a face in which infants observed a dynamic shift in gaze direction (Farroni, Johnson, Brockbank, & Simion, 2000). When the averted gaze was presented statically, there was no evidence of infants following the gaze shift. Intriguingly, a new report by Farroni and colleagues (Farroni, Massaccesi, Pividori, & Johnson, 2004) suggests that these findings are replicable with newborn infants.

With Katharina Rohlfing, we sought to extend this paradigm to study whether infants younger than 9 months of age would also reorient their attention in response to the direction that another person is pointing with their hand. Interestingly, Amano, Kezuka, and Yamamoto (2004) conducted an observational study showing that 4-month-old infants responded differently to the pointing done by an experimenter and their mothers. In order to redirect attention, infants more often needed the combination of eye gaze and pointing while interacting with the experimenter than while interacting with their mothers. When interacting with their mothers, infants were able to follow the pointing gesture alone while the mothers maintained eye contact. One interpretation for these findings is that infants are more familiar with their mothers' gestures and thus are more likely to correctly interpret them. A different interpretation is that infants are more likely to follow the pointing gesture of their mothers because their mother's face is more familiar and thus it is easier for them to disengage from the face.

By adapting the same paradigm used by Hood et al. (1998) to study gaze-following, we were able to eliminate some of the possible confounds present in previous studies of pointing because infants did not have to disengage from a central stimulus of a face. Infants between 4.5 and 6.5 months old were tested. They were shown a series of computerized stimuli on a projection screen while sitting on their parents' laps. Each trial consisted of the following sequence of events (see Figure 10.6):

1. The hand appeared on the screen with fingers oriented upward. The fingers waved up and down and were accompanied by a voice saying "look baby, look!" to recruit the baby's attention. This segment was played until the infant fixated the hand.
2. Once the finger waving ended, the hand was seen transforming into a canonical pointing gesture that moved a short distance in the direction of the pointing finger. This segment lasted 1000 ms.
3. After the hand disappeared, a digitized picture of a toy appeared randomly on the left or right side of the screen. On half of the trials, this probe was congruent with the direction of the point and on the other half of the trials it was incongruent with the pointing direction. The probe remained visible for 3 s and was accompanied by a voice saying "wow!" Two different toys (a clown and an Ernie puppet) were presented in a randomized order (Figure 10.6).

Figure 10.6 Stimulus sequence for each trial. 1. Fingers wave up and down. 2. Index finger points toward left or right side of screen. 3. Probe appears on left or right side. (This sequence corresponds to an incongruent trial.)
Based on the videotapes of the infants' behavior, we measured the response time to shift attention in the direction of the peripheral probe. The probe that appeared in the direction cued by the pointing finger is referred to as the congruent probe, and the probe that appeared in the opposite direction is referred to as the incongruent probe. A total of 20 infants completed an average of 26 trials (SD = 6.2, range: 10–32). As can be seen in Figure 10.7 (Dynamic condition), infants oriented toward the congruent probe significantly faster than they did toward the incongruent probe. These results suggest that infants as young as 4.5 months of age respond to a dynamic pointing gesture by shifting their visual attention to a shared referent.

Figure 10.7 Mean response times for infants to orient toward the congruent and incongruent probes. In the dynamic condition, infants were shown a pointing finger that moved a short distance in the same direction that the finger was pointing. In the static condition, infants were shown a pointing finger that didn't move.

The second experiment tested whether the movement of the pointing finger was necessary to elicit the shift in attention or whether a static pointing gesture would be sufficient. A new sample of 14 infants between 4.5 and 6.5 months of age was tested with the same procedure, except that the pointing finger remained stationary when it appeared on the screen for 1000 ms. Infants completed an average of 22 trials (SD = 7.7, range: 10–32). Unlike the responses to the dynamic pointing finger, infants showed no difference in responding to the congruent and incongruent probe (see Figure 10.7, Static condition). It thus appears that observation of an action and not just the final state is necessary for young infants to follow a point, analogous to the findings of infants following eye gaze (Farroni et al., 2000). Two possible interpretations for these results are that the movement associated with the gesture increased the salience of the stimulus, or increased the likelihood of visual tracking in the direction of the moving finger. Either of these two possibilities would bias infants to shift their attention in the direction of the pointing gesture. If the results were exclusively a function of following the principal direction of the moving finger, then we would expect a reversal in the reaction time results when the finger moved backwards rather than forwards.

The next experiment tested this interpretation. A total of 25 infants between 4.5 and 6.5 months of age were tested in two conditions. In the forward condition, the direction of the pointing finger and the movement of the finger were compatible, whereas in the backward condition, the direction of the pointing finger and the movement of the finger were incompatible. Half of the trials in each condition involved a congruent probe and half of the trials involved an incongruent probe. Infants completed an average of 25 trials (SD = 5.68, range: 14–31). In the forward condition, infants showed the same advantage for responding to the congruent vs. incongruent probe that they showed in the first experiment (see Figure 10.8). In the backward condition, infants were expected to show the opposite pattern of responses if they were responding only to the direction of movement and not to the direction of the pointing finger. As can be observed in Figure 10.8, this pattern of results was not obtained—infants responded just as fast to the congruent as to the incongruent probe. These results thus suggest that infants as young as 4.5 months of age do not respond to a shared referent simply by following the direction of movement irrespective of the direction of the point.

Figure 10.8 Mean response times for infants to orient toward the congruent and incongruent probes. In the forward condition the finger moved in the same direction it was pointing. In the backward condition the finger moved in the opposite direction that it was pointing.

A third possibility for why infants were able to follow the direction of a dynamic point is that the perceived point is mapped onto infants' own motor representations for pointing. This hypothesis is supported by at least two sources of evidence. In the case of pointing, the goal of the observed action is to reorient the attention of another person so that an object becomes the shared focus of attention (Woodward & Guajardo, 2002). Research with adults reveals that action simulation facilitates the ability of observers to predict the effect of an action (e.g., Chaminade et al., 2001; Knoblich & Flach, 2001; Louis-Dam, Orliaguet, & Coello, 1999; Orliaguet, Kandel, & Boe, 1997). In addition, recent research suggests that slightly older infants, 6 to 12 months of age, also understand the goals or effects of an action (e.g., Gergely, Nádasdy, Csibra, & Biró, 2003; Luo & Baillargeon, 2006; Woodward, 1998).

If visual attention is the principal factor responsible for the preceding results, then it should not matter whether the action is carried out by a human or a mechanical agent as long as both agents are equally salient. If, however, the preceding results are a function of action simulation, then the distinction between a human and a mechanical agent should make a difference. Recall that simulation depends on the perceived similarity between the observed action and the specific motor responses available to the observer. In the last experiment, we put this hypothesis to the test by repeating the previous experiment with a stick moving to the left or to the right in place of a human hand. The stick initially appeared to be pointing toward the infant and waved up and down in a manner similar to the fingers waving up and down. After the infant fixated the stick, it was rotated so that it pointed toward the left or the right side of the screen, and then moved a short distance either in the direction it was pointing or in the opposite direction. This movement lasted 1000 ms. Similar to the previous experiments, the stick then disappeared and was replaced by a toy probe that appeared on the left or right side of the screen. A new sample of 18 infants between 4.5 and 6.5 months of age was tested.
Infants completed an average of 22 trials (SD = 5.3, range: 12–30). In both the forward and backward conditions, infants did not show a significant response time advantage to either the congruent direction of pointing or the congruent direction of movement (i.e., forward movement: congruent condition or backward movement: incongruent condition) (see Figure 10.9). It is possible that this result is attributable to the decreased familiarity or salience of the stick, because, on average, response times were higher. However, this factor, by itself, is unlikely sufficient to explain the null results, because the critical finding is not the absolute difference in response times, but rather the relative difference in response times. Accordingly, we conclude that infants, like adults, are more likely to predict the effect of an action when it is performed by a human effector, such as a hand, as opposed to a mechanical effector with little resemblance to the morphology or movements of the human action.

Figure 10.9 Stick experiment. Mean response times for infants to orient toward the congruent and incongruent probes. In the forward condition the stick moved in the same direction it was pointing. In the backward condition the stick moved in the opposite direction that it was pointing.

In sum, these results on following the direction of a point suggest that infants as young as 4.5 months of age are capable of shifting their attention in response to this action as long as the action is performed by a human agent. This finding thus appears consistent with infants predicting the goal of the deictic gesture, but there is a problem with attributing this prediction to simulating the observed action. A number of studies agree that canonical pointing emerges on average at 11 months of age, although some babies as young as 8.5 months of age have been observed to point (Butterworth & Morissette, 1996). If action simulation requires that the observed action is already in the motor repertoire of the observer, then the interpretation of simulation as the basis for predicting the referent cannot be correct.

There is, however, another way to interpret the basis for simulation which is more compatible with the current findings. Infants as young as 4.5 months of age are not able to differentiate their hand and fingers so that only the index finger is extended in the direction of the arm, but it is entirely possible that the extension of the arm and hand is sometimes performed for the same purpose as a pointing gesture at an older age. Consistent with this hypothesis, Leung and Rheingold (1981) directly compared 10.5- to 16.5-month-old infants' arm extensions with open or closed hands and arm extensions with index finger extended toward objects located at a distance from where they were sitting. The authors report that at the younger ages the majority of responses were reaches rather than pointing gestures to the distal objects. Although reaches are typically associated with an instrumental response, they were interpreted in this context as serving a social communicative function because they accompanied looking at and vocalizing to the mother.

The preceding evidence thus suggests that two actions executed by infants can share the same goal even though the means for achieving that goal differ. Interestingly, most of the evidence for an observation-execution matching system is based on shared goals and not shared means to achieve these goals. Consider, for example, the previously discussed neurophysiological findings on mirror neurons. It was shown that these neurons discharge when observing or executing the same goal-directed action regardless of whether or not the specific movements matched (Rizzolatti et al., 1996). Likewise, behavioral research with human adults shows that response priming following observation of an action depends primarily on the observation of the goal and not the specific means to the goal. For example, Longo et al. (in press) tested human adults in a choice reaction time task involving imitation of an index or middle finger tapping downwards. The results showed equivalent levels of response priming following the observation of a biomechanically possible or impossible finger tapping movement. In this example, the movements were different, but the goal of tapping downwards was the same in both conditions.
If the movements associated with the observation of a goal-directed action need not be identical in order to simulate an observed action, then it may be sufficient that the two representations share some similar features. Indeed, this is the basis for the theory of event coding as presented by Hommel, Müsseler, Aschersleben, and Prinz (2001). It is well established that infants are capable of predictive reaching for moving objects by 4.5 months of age (Bertenthal & von Hofsten, 1998; Rose & Bertenthal, 1995). The motor representation for predictive reaching may be sufficient to make contact with the goal of specifying a distal referent for young infants to index and predict the goal of the point. Clearly, more research is needed to fully evaluate this hypothesis, but the possibility that a common code underlies the observation and execution of a manual gesture for specifying a distal referent by 4.5 months of age is certainly consistent with the available evidence.

Perception of Point-Light Displays of Biological Motion

The first step in understanding human actions is to perceptually organize the constituent movements in a manner consistent with the causal structure of the action. We have relied on moving point-light displays depicting biological motions to study this question, because observers are then forced to perceptually organize the stimuli in terms of the movements of the limbs without any contextual cues specified by featural information. In spite of the apparent ambiguity in these displays, adult observers are quite adept at extracting a coherent and unique structure from the moving point-lights (Bertenthal & Pinto, 1994; Cutting, Moore, & Morrison, 1988; Johansson, 1973; Proffitt, Bertenthal, & Roberts, 1984).

This conclusion is true even when the stimulus displays are masked by a large number of additional point-lights that share the same absolute motion vectors with the point-lights comprising the biological motion display. In one experiment (Bertenthal & Pinto, 1994) observers were instructed to judge whether the biological motion display depicted a person walking to the right or to the left. The target was comprised of 11 point-lights that moved in a manner consistent with the spatiotemporal patterning of a person walking, but was masked by an additional 66 moving point-lights that preserved the absolute motions and temporal phase relations of the stimulus display (see Figure 10.10). In spite of the similarity between the point-lights comprising the target and those comprising the distracters, adult observers displayed very high recognition rates for judging the correct direction of the gait. This judgment could not be attributed to the perception of individual point-lights, because recognition performance declined to chance levels when the stimuli were rotated 180°. If performance was based on the movements of individual point-lights, then the orientation of the display should not have mattered. Apparently, observers were detecting an orientation-specific spatiotemporal structure of the moving point-lights because it matched their internal representation, which was limited by ecological constraints to a person walking upright.
Figure 10.10 Left panel depicts point-light walker display appearing to walk to the right. (Outline of human form is drawn for illustrative purposes only.) Right panel depicts point-light walker display masked by moving point-lights preserving the absolute motions and temporal phase relations of the target stimulus (from Bertenthal & Pinto, 1994).
This orientation specificity appears to be the norm with regard to the perception of biological motions (e.g., Pavlova & Sokolov, 2000; Sumi, 1984). One intriguing interpretation for this repeated finding is that motor experience contributes to the perception of biological motions. Of course, this finding is also consistent with visual experience contributing to the perception of biological motions in a point-light display. Although there is still insufficient evidence to reach any firm conclusions, some recent experiments by Shiffrar and colleagues (Jacobs, Pinto, & Shiffrar, 2005; Loula, Prasad, Harber, & Shiffrar, 2005; Shiffrar & Pinto, 2002; Stevens, Fonlupt, Shiffrar, & Decety, 2000) have suggested that motor experience provides a unique contribution to the perception of biological motions by adults. In the remainder of this section, we will explore whether the same conclusion holds for infants' perception of biological motions.

In a series of studies begun in the 1980s, my colleagues and I (Bertenthal, Proffitt, & Cutting, 1984; Bertenthal, Proffitt, & Kramer, 1987; Bertenthal, Proffitt, Kramer, & Spetner, 1987; Bertenthal, Proffitt, Spetner, & Thomas, 1985; Pinto & Bertenthal, 1992) showed that infants are sensitive to the spatial and temporal structure in biological motion displays. For example, 3- and 5-month-old infants are able to discriminate a canonical point-light walker display from one in which the spatial arrangement of the point-lights is scrambled or the temporal phase relations of the point-lights are perturbed (Bertenthal, Proffitt et al., 1987a; Bertenthal, Proffitt, Spetner et al., 1987b). Similar to adults, infants show an orientation-specific response by 5 months of age (Bertenthal, 1993; Pinto & Bertenthal, 1993; Pinto, Shrager, & Bertenthal, 1994). At 3 months of age, infants discriminate a canonical from a perturbed point-light walker display when the displays are presented upright or upside down (Pinto et al., 1994). By 5 months of age, infants only discriminate these displays when they are presented upright (Pinto et al., 1994). Our interpretation for this developmental shift is that infants are responding to local configural differences at 3 months of age, but they are responding to global configural differences at 5 months of age (Bertenthal, 1993). In essence, the local configural differences can be detected independent of orientation.

Converging evidence for this interpretation comes from two experiments showing that infants do not discriminate point-light walker displays requiring a global percept until 5 months of age. In the first of these experiments (Pinto, 1997), 3- and 5-month-old infants were tested for discrimination of two point-light walker displays with a habituation paradigm. In this paradigm, infants are presented with one of the two stimulus displays for a series of trials until their visual attention to the stimulus declines significantly, and then they are presented with a novel stimulus display for two trials. Increased visual attention to the novel display is interpreted as discrimination.
Infants were presented with a point-light walker display translating across the screen during the habituation phase of the experiment. Following habituation, infants were shown a translating point-light walker display in which the point-lights corresponding to the upper portion of the body were spatially shifted relative to the point-lights corresponding to the lower portion of the body. According to adult observers, this spatial displacement resulted in the perception of two point-light walkers in which one appeared to have its legs occluded and the other appeared to have its torso, arms, and head occluded. If infants did not perceive a point-light walker display as a global percept, then they would be less likely to detect this perturbation because both the spatially aligned and shifted displays would be perceived as a number of subgroupings of point-lights. If, however, infants did perceive the habituation stimulus display as a unitary object, then the spatially shifted translating walker would be discriminated from the preceding translating point-light walker. The results revealed that 3-month-old infants did not discriminate these two displays, but 5-month-old infants did.

In the second study, Amy Booth, Jeannine Pinto, and I conducted two experiments testing 3- and 5-month-old infants' sensitivity to the symmetrical patterning of human gait (Booth, Pinto, & Bertenthal, 2001). In this case, sensitivity to the patterning of the limbs implies that discrimination between displays could not occur on the basis of the perceived structure of any individual limb. If infants were primarily sensitive to this patterning, then we predicted that they would not discriminate between a point-light display depicting a person walking and a person running because both displays share the same symmetrical gait pattern (even though they differ on many additional dimensions). By contrast, infants should discriminate between two displays in which the symmetrical phase relations of the limbs were perturbed.

A habituation paradigm was again used to test infants' discrimination of the point-light displays. In Experiment 1 infants were presented with a point-light display depicting a person running and a second display depicting a person walking (see Figure 10.11). Unlike previous experiments in which the stimulus displays were synthesized with a computational algorithm, the stimulus displays in this study were created with a motion analysis system that tracked and stored the three-dimensional coordinates of discrete markers on the major joints of a person walking or running on a treadmill. Infants' discrimination performance revealed that 3-month-old infants discriminated the walker and the runner, whereas 5-month-old infants did not (see Figure 10.12a). These results are consistent with the possibility that 3-month-old infants were responsive to the differences in the speed and joint angles of the two displays, but that 5-month-old infants were more sensitive to the symmetrical patterning of both displays and therefore were less attentive to lower-level differences.

Figure 10.11 Four frames of the walker (in gray) superimposed over the runner (in black). Point-lights are connected for ease of comparison (from Booth et al., 2001).

Figure 10.12 (a) Top panel: Walker vs. Runner. Mean looking times on the last three habituation trials and on the two test trials as a function of age. The stimuli included a point-light walker and a point-light runner, each of which served as the habituation stimulus for half of the infants. (b) Bottom panel: Walker vs. Phase-Shifted Runner. Mean looking times on the last three habituation trials and on the two test trials as a function of age. The stimuli included a point-light walker and a phase-shifted point-light runner, each of which served as the habituation stimulus for half of the infants (from Booth et al., 2001).

In order to confirm this interpretation, a second experiment was conducted to assess whether 3- and 5-month-old infants were sensitive to differences in the symmetrical patterning of gait. Infants were tested for discrimination of a canonical point-light walker and a point-light runner in which the point-lights corresponding to the right leg and the left arm were temporally phase shifted by 90°. The effect of this manipulation was to create a display in which one pair of diagonally opposite limbs reversed direction at 90° and 270° of the gait cycle, whereas the other pair of limbs reversed direction at 0° and 180° of the gait cycle (see Figure 10.13). The results from this experiment revealed that both 3- and 5-month-old infants discriminated the two point-light displays (see Figure 10.12b). Presumably, the younger infants discriminated these displays for the same reason that they discriminated the two displays in the first experiment, although strictly speaking we cannot rule out the possibility that they also detected the change in the symmetrical patterning of the displays. By contrast, the older infants had not discriminated the two displays in the first experiment, and thus their discrimination performance can only be explained in terms of detecting the changes in the gait pattern.
Figure 10.13 Four frames of the walker (in gray) superimposed over the phase-shifted runner (in black). Point-lights are connected for ease of comparison (from Booth et al., 2001).
Taken together, the results from these last two experiments are illuminating for a number of reasons. First, they confirm that by 5 months of age, infants respond perceptually to the global structure of a moving point-light display. Second, the results from the second study suggest that infants by 5 months of age are sensitive to a fairly subtle higher-level property of the displays—the symmetrical patterning of the gait. One interpretation for these findings is that the visual system becomes more spatially integrative with development and also becomes more sensitive to the temporal properties of moving point-light displays. A second interpretation is that infants' sensitivity to a point-light walker display as a unitary object, or as a hierarchical nesting of pendular motions with dynamic symmetry, is specifically related to this stimulus depicting a familiar event. We have previously argued that the orientation specificity of our findings suggests that visual experience is an important factor in the perceptual organization of these displays (Bertenthal, 1993). The third and final interpretation is that a globally coherent and hierarchically nested moving object with bilateral symmetry corresponds to an internal representation that is relevant not only to the perception of human gait, but to the production of this behavior as well.

Previous research reveals that infants capable of stepping on a split treadmill are biased to maintain a 180° phase relation between their two legs even when both sides of the treadmill are moving at different speeds (Thelen, Ulrich, & Niles, 1987). In a longitudinal investigation of this phenomenon, Thelen and Ulrich (1991) report that infants' performance shows rapid improvement between 3 and 6 months of age (see Figure 10.14).
Interestingly, this is the same period of development during which infants show increased perceptual sensitivity to the global coherence of biological motions, especially as defined by bilateral symmetry or 0° and 180° phase modes. This correspondence in age lends further credence to the suggestion of a shared representation between the perception and production of biological motions. As previously discussed by Bertenthal and Pinto (1993), the perception and production of human movements share similar processing constraints relating to the phase relations of the limb movements; thus the development of one skill should facilitate the development of the other, and vice versa. The evidence for a shared representation is also supported by the results of a neuroimaging study conducted by Grèzes, Fonlupt, et al. (2001) showing that perception of point-light walker displays by
Motor Knowledge and Action Understanding
Figure 10.14 Mean number of alternating steps by trial and age, pooled across all infants. Trials 1 and 9 are baseline trials (from Thelen & Ulrich, 1991).
adults activates an area in the occipital-temporal junction as well as an area in the intraparietal sulcus. Whereas the former cortical area is associated with the perception of objects, the latter area is part of the neural circuit involved in the planning and execution of actions. Converging evidence supporting this finding has also been reported by Saygin, Wilson, Hagler, Bates, and Sereno (2004). It is thus possible that the perception of these point-light displays by infants also activates the motor system, which confers via simulation an appreciation of the differences between a canonical and an unnatural gait pattern. One final source of evidence to support this conjecture comes from an experiment testing infants' discrimination of a canonical point-light cat and a phase-shifted point-light cat (Bertenthal, 1993). Similar to the findings on discrimination of inverted point-light displays, 3-month-old infants discriminated these two point-light cat displays, but neither 5- nor 8-month-old infants discriminated these displays. Presumably, 3-month-old infants discriminated these displays based on local differences that were not specific to the identity of the stimulus. Although it is reasonable to suggest that older infants did not
discriminate these displays at a global level because they lacked sufficient visual experience, correlative evidence is inconsistent with this hypothesis (Pinto et al., 1996). When infants were divided into two groups based on whether one or more cats or dogs lived in their home, the results revealed absolutely no difference in discrimination performance as a function of whether or not infants had daily visual experience with a cat or dog. Thus, it appears that infants' insensitivity to the spatiotemporal perturbations in these displays may not have been attributable to their limited visual experience, but rather to their limited motor knowledge of quadrupedal gait. Interestingly, infants begin crawling on hands and knees between 7 and 9 months of age, which suggests that crawling experience, as opposed to visual experience, may have been a better predictor of their discrimination performance.

Conclusions

The findings summarized in this chapter provide some of the first evidence to suggest that an observation-execution matching system is functional during infancy and contributes to action understanding through: (1) online simulation of observed actions; (2) prediction of the effects of observed actions; and (3) perceptual organization of the component movements comprising an action. In order to avoid any confusion, we want to emphasize that suggesting a possible contribution of the motor system to the perception and understanding of actions is not meant to exclude the very important contributions of more traditional mechanisms (e.g., visual attention and visual experience) for understanding actions.
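The first two contributions, online simulation and prediction of the effects of observed actions, are often formalized in the motor control literature as forward models (Miall & Wolpert, 1996). What follows is only a minimal sketch under assumed toy linear dynamics, with hypothetical function names, of how such a model lets an observer predict an action's outcome without executing any movement.

```python
def forward_model(position, velocity, command, dt=0.1):
    """Predict the next (position, velocity) produced by a motor command.

    Toy linear dynamics stand in for the body: the command acts as an
    acceleration over one time step. No movement is executed.
    """
    velocity = velocity + command * dt
    position = position + velocity * dt
    return position, velocity

def covertly_simulate(commands, position=0.0, velocity=0.0):
    """Run a command sequence through the forward model to predict
    its outcome before (or instead of) executing it."""
    for command in commands:
        position, velocity = forward_model(position, velocity, command)
    return position

# Predict the endpoint of a short reach-like command sequence.
endpoint = covertly_simulate([1.0, 1.0, 0.0, -1.0, -1.0])
print(f"predicted endpoint: {endpoint:.2f}")
```

The point of the sketch is only the architecture: the same machinery that would drive one's own movement can be run "offline" over a command sequence to anticipate its sensory consequences, which is the sense of covert simulation used in this chapter.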
Furthermore, the evidence presented is by no means definitive, but it is buttressed by recent neurophysiological, neuroimaging, and behavioral findings suggesting a shared representation for the observation and execution of actions by primates and human adults (e.g., Bertenthal et al., 2006; Decety & Grèzes, 1999; Rizzolatti et al., 2001; Rizzolatti & Craighero, 2004). The specialized circuitry and automatic activation following observation of an action suggest that the neural mechanisms mediating these functions may be part of the intrinsic organization of the brain. Indeed, this hypothesis is supported by evidence showing neonatal imitation (Meltzoff & Moore, 1994). The best known example of imitation in young infants is the evidence for oro-facial gestures
(mouth opening and tongue protrusion) by infants who have never seen their own face (Meltzoff & Moore, 1977). Unlike true imitation (cf. Tomasello & Call, 1997), only actions already in the motor repertoire can be facilitated. Still, visual information about the perceived action must somehow be mapped onto the infants' own motor representations (Meltzoff & Moore, 1994). In essence, this is the function of an observation-execution matching system. Corollary evidence on the prenatal and early postnatal development of these oro-facial gestures is consistent with this conjecture. It is well established that fetuses perform mouth opening and closing and tongue protrusion while in utero (Prechtl, 1986). Thus, these gestures are already part of the neonate's behavioral repertoire at birth. The evidence also suggests that neonates are more likely to match the modeled gesture after it has been presented for some period of time (~40 s), rather than immediately (Anisfeld, 1991). This finding is consistent with a motor priming explanation, in which activation would be expected to build up gradually as the gesture is modeled, as opposed to an explanation claiming the availability of higher-level processes from birth (cf. Meltzoff & Moore, 1994). Finally, the empirical evidence suggests that the likelihood of automatic imitation increases until around 2 months of age, and then declines and virtually disappears by 5 months of age (Fontaine, 1984; Maratos, 1982). It is during this same window of time, approximately 2 to 6 months of age, that neonatal reflexes are gradually inhibited (McGraw, 1943), suggesting that similar cortical inhibitory processes may serve to suppress neonatal imitation. Although the spontaneous elicitation of these overt facial gestures becomes gradually inhibited with age, the gestures do not disappear entirely.
Instead, they become subject to volitional control such that the infant determines when and how they are elicited: imitation is no longer automatic, and the observation of a facial gesture will not automatically lead to its execution by the infant. Thus, rather than reflecting a precocial social ability of the infant, as suggested by Meltzoff and Moore (1994), neonatal imitation may reflect a striking inability of the infant to inhibit activation of the motor system by direct matching mechanisms. (See Nakagawa, Sukigara, and Benga (2003) for some preliminary evidence supporting this interpretation.) Similar compulsive imitation is observed in adults after lesions of areas of the frontal lobe involved in inhibitory control (Lhermitte, Pillon, & Serdaru, 1986), and even in healthy adults when attention is diverted (Stengel, 1947).
Although overt imitation of facial gestures ceases with the development of inhibition, covert imitation continues and provides specific knowledge about these gestures when they are observed in others. We suggest that this same developmental process is played out at different ages for many other important behaviors (e.g., direction of gaze, visually directed reaching and grasping, vocalizations of sounds). As these behaviors are practiced, the infant develops greater control of their execution as well as knowledge of their effects or outcomes. The development of these motor schemas enables infants to covertly simulate and predict the effects of similar actions performed by others. This reliance on the developing control of self-produced actions explains why action understanding continues to develop throughout the lifespan.

Finally, the findings reviewed in this chapter are relevant to the current debate in the literature regarding the development of action understanding. The early development of representing actions as goal-directed has been studied from two different theoretical perspectives: (1) action understanding is reciprocally coupled to the capacity to produce goal-directed actions (Hofer, Hauf, & Aschersleben, 2005; Longo & Bertenthal, 2006; Sommerville, Woodward, & Needham, 2005); or (2) recognizing, interpreting, and predicting goal-directed actions depends on an innately based, abstract, and domain-specific representational system, specialized for identifying intentional agents or for representing and interpreting actions as goal-directed (e.g., Baron-Cohen, 1994; Csibra & Gergely, 1998; Gergely et al., 1995; Premack, 1990). The first perspective is consonant with the views discussed in this chapter.
The second perspective suggests that infants are innately sensitive to abstract behavioral cues (such as self-propulsion, direction of movement, or eye gaze) that indicate agency, intentionality, or goal-directedness, irrespective of previous experience with the types of agents or actions that exhibit these cues. Infants are presumed to be sensitive to unfamiliar actions of humans, or to unfamiliar agents with no human features, from very early in development, as long as the actions are consistent with one or more of the proposed abstract cues. Although the findings presented here cannot resolve this controversy, at the very least they cast doubt on the assertion that understanding goal-directed actions is based on an abstract representation that is independent of whether the agents are human or mechanical. In both the object search and follow-the-point studies, infants showed
different levels of responding to human and mechanical agents. Moreover, the evidence reviewed on infants' sensitivities to biological motions suggests that the perceived structure of a point-light display is significantly greater when the display depicts a human action as opposed to a familiar or unfamiliar quadrupedal action. As we and our colleagues continue to investigate the evidence for common coding of observed and executed actions by infants, we hope to develop a more finely nuanced theory that will reveal what specific knowledge about actions is innate and what knowledge develops with age and experience.

References

Amano, S., Kezuka, E., & Yamamoto, A. (2004). Infant shifting attention from an adult's face to an adult's hand: A precursor of joint attention. Infant Behavior and Development, 27, 64–80.
Anisfeld, M. (1991). Neonatal imitation. Developmental Review, 11, 60–97.
Baldwin, D. A. (1995). Understanding the link between joint attention and language. In C. Moore & P. Dunham (Eds.), Joint attention: Its origins and role in development (pp. 131–158). Hillsdale, NJ: Erlbaum.
Baron-Cohen, S. (1994). How to build a baby that can read minds: Cognitive mechanisms in mindreading. Cahiers de Psychologie Cognitive/Current Psychology of Cognition, 13, 1–40.
Beardsworth, T., & Buckner, T. (1981). The ability to recognize oneself from a video recording of one's movements without seeing one's body. Bulletin of the Psychonomic Society, 18, 19–22.
Bekkering, H., Wohlschlager, A., & Gattis, M. (2000). Imitation of gestures in children is goal-directed. Quarterly Journal of Experimental Psychology, 53A, 153–164.
Bertenthal, B. I. (1993). Perception of biomechanical motions by infants: Intrinsic image and knowledge-based constraints. In C. E. Granrud (Ed.), Carnegie symposium on cognition: Visual perception and cognition (pp. 175–214). Hillsdale, NJ: Erlbaum.
Bertenthal, B. I., Longo, M. R., & Kosobud, A. (2006).
Imitative response tendencies following observation of intransitive actions. Journal of Experimental Psychology: Human Perception and Performance, 32, 210–225.
Bertenthal, B. I., & Pinto, J. (1993). Complementary processes in the perception and production of human movements. In L. B. Smith & E. Thelen (Eds.), A dynamic systems approach to development: Applications (pp. 209–239). Cambridge, MA: MIT Press.
Bertenthal, B. I., & Pinto, J. (1994). Global processing of biological motions. Psychological Science, 5, 221–225.
Bertenthal, B. I., Proffitt, D. R., & Cutting, J. E. (1984). Infant sensitivity to figural coherence in biomechanical motions. Journal of Experimental Child Psychology, 37, 213–230.
Bertenthal, B. I., Proffitt, D. R., & Kramer, S. J. (1987). Perception of biomechanical motions by infants: Implementation of various processing constraints. Journal of Experimental Psychology: Human Perception and Performance, 13, 577–585.
Bertenthal, B. I., Proffitt, D. R., Kramer, S. J., & Spetner, N. B. (1987). Infants' encoding of kinetic displays varying in relative coherence. Developmental Psychology, 23, 171–178.
Bertenthal, B. I., Proffitt, D. R., Spetner, N. B., & Thomas, M. A. (1985). The development of infant sensitivity to biomechanical motions. Child Development, 56, 264–298.
Bertenthal, B. I., & von Hofsten, C. (1998). Eye, head and trunk control: The foundation for manual development. Neuroscience and Biobehavioral Reviews, 22, 515–520.
Blakemore, S. J., & Decety, J. (2001). From the perception of action to the understanding of intention. Nature Reviews Neuroscience, 2, 561–567.
Booth, A., Pinto, J., & Bertenthal, B. I. (2002). Perception of the symmetrical patterning of human gait by infants. Developmental Psychology, 38, 554–563.
Brass, M., Bekkering, H., & Prinz, W. (2001). Movement observation affects movement execution in a simple response task. Acta Psychologica, 106, 3–22.
Brass, M., Bekkering, H., Wohlschlager, A., & Prinz, W. (2000). Compatibility between observed and executed finger movements: Comparing symbolic, spatial, and imitative cues. Brain and Cognition, 44, 124–143.
Brass, M., Zysset, S., & von Cramon, D. Y. (2001). The inhibition of imitative response tendencies. NeuroImage, 14, 1416–1423.
Bruner, J. S. (1969). Eye, hand, and mind. In D. Elkind & J. H.
Flavell (Eds.), Studies in cognitive development: Essays in honor of Jean Piaget (pp. 223–235). Oxford: Oxford University Press.
Buccino, G., Binkofski, F., Fink, G. R., Fadiga, L., Fogassi, L., Gallese, V., et al. (2001). Action observation activates premotor and parietal areas in a somatotopic manner: An fMRI study. European Journal of Neuroscience, 13, 400–404.
Butterworth, G. (2003). Pointing is the royal road to language for babies. In S. Kita (Ed.), Pointing: Where language, culture, and cognition meet (pp. 61–83). Mahwah, NJ: Erlbaum.
Butterworth, G., & Morisette, P. (1996). Onset of pointing and the acquisition of language in infancy. Journal of Reproductive and Infant Psychology, 14, 219–231.
Calvo-Merino, B., Glaser, D. E., Grezes, J., Passingham, R. E., & Haggard, P. (2005). Action observation and acquired motor skills: An fMRI study with expert dancers. Cerebral Cortex, 15, 1243–1249.
Castiello, U., Lusher, D., Mari, M., Edwards, M., & Humphreys, G. W. (2002). Observing a human or a robotic hand grasping an object: Differential motor priming effects. In W. Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance (Vol. 12, pp. 315–333). Oxford: Oxford University Press.
Chaminade, T., Meary, D., Orliaguet, J.-P., & Decety, J. (2001). Is perceptual anticipation a motor simulation? A PET study. NeuroReport, 12, 3669–3674.
Corkum, V., & Moore, C. (1998). The origins of visual attention in infants. Developmental Psychology, 34, 28–38.
Csibra, G., & Gergely, G. (1998). The teleological origins of mentalistic action explanations: A developmental hypothesis. Developmental Science, 1, 255–259.
Csibra, G., Gergely, G., Biro, S., Koos, O., & Brockbank, M. (1999). Goal attribution without agency cues: The perception of “pure reason” in infancy. Cognition, 72, 237–267.
Cutting, J. E., Moore, C., & Morrison, R. (1988). Masking the motions of human gait. Perception and Psychophysics, 44, 339–347.
Deak, G. O., Flom, R. A., & Pick, A. D. (2000). Effects of gesture and target on 12- and 18-month-olds' joint visual attention to objects in front of or behind them. Developmental Psychology, 36, 511–523.
Decety, J., Chaminade, T., Grezes, J., & Meltzoff, A. N. (2002). A PET exploration of the neural mechanisms involved in reciprocal imitation. NeuroImage, 15, 265–272.
Decety, J., & Grèzes, J. (1999). Neural mechanisms subserving the perception of human actions. Trends in Cognitive Sciences, 3, 172–178.
Decety, J., & Sommerville, J. A. (2004). Shared representations between self and other: A social cognitive neuroscience view. Trends in Cognitive Sciences, 7, 527–533.
Diamond, A. (1985). The development of the ability to use recall to guide action, as indicated by infants' performance on A-not-B. Child Development, 56, 868–883.
Diedrich, F. J., Highlands, T. M., Spahr, K. A., Thelen, E., & Smith, L. B. (2001). The role of target distinctiveness in infant perseverative reaching. Journal of Experimental Child Psychology, 78, 263–290.
Edwards, M. G., Humphreys, G. W., & Castiello, U. (2003). Motor facilitation following action observation: A behavioral study in prehensile action. Brain and Cognition, 53, 495–502.
Fadiga, L., Fogassi, L., Pavesi, G., & Rizzolatti, G. (1995). Motor facilitation during action observation: A magnetic stimulation study. Journal of Neurophysiology, 73, 2608–2611.
Farroni, T., Johnson, M. H., Brockbank, M., & Simion, F. (2000). Infants' use of gaze direction to cue attention: The importance of perceived motion. Visual Cognition, 7, 705–718.
Farroni, T., Massaccesi, S., Pividori, D., & Johnson, M. H. (2004). Gaze following in newborns. Infancy, 5, 39–60.
Ferrari, F., Rozzi, S., & Fogassi, L. (2005). Mirror neurons responding to observation of actions made with tools in monkeys' ventral premotor cortex. Journal of Cognitive Neuroscience, 17, 221–226.
Flach, R., Knoblich, G., & Prinz, W. (2003). Off-line authorship effects in action perception. Brain and Cognition, 53, 503–513.
Fogassi, L., Ferrari, P. F., Gesierich, B., Rozzi, S., Chersi, F., & Rizzolatti, G. (2005). Parietal lobe: From action organization to intention understanding. Science, 308, 662–667.
Fontaine, R. (1984). Imitative skills between birth and six months. Infant Behavior and Development, 7, 323–333.
Frith, C., & Frith, U. (2006). How we predict what other people are going to do. Brain Research, 1079, 36–46.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593–609.
Gergely, G., Nádasdy, Z., Csibra, G., & Bíró, S. (1995). Taking the intentional stance at 12 months of age. Cognition, 56, 165–193.
Greenwald, A. G. (1970). Sensory feedback mechanisms in performance control: With special reference to the ideo-motor mechanism. Psychological Review, 77, 73–99.
Grèzes, J., Armony, J. L., Rowe, J., & Passingham, R. E. (2003). Activations related to “mirror” and “canonical” neurones in the human brain: An fMRI study. NeuroImage, 18, 928–937.
Grèzes, J., Fonlupt, P., Bertenthal, B. I., Delon-Martin, C., Segebarth, C., & Decety, J. (2001).
Does perception of biological motion rely on specific brain regions? NeuroImage, 13, 775–785.
Grèzes, J., Frith, C., & Passingham, R. E. (2004). Inferring false beliefs from the actions of oneself and others: An fMRI study. NeuroImage, 21, 744–750.
Heyes, C., Bird, G., Johnson, H., & Haggard, P. (2005). Experience modulates automatic imitation. Cognitive Brain Research, 22, 233–240.
Hofer, T., Hauf, P., & Aschersleben, G. (2005). Infants' perception of goal-directed actions performed by a mechanical claw device. Infant Behavior and Development, 28, 466–480.
Hommel, B., Musseler, J., Aschersleben, G., & Prinz, W. (2001). The theory of event coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences, 24, 849–937.
Hood, B. M., Willen, J. D., & Driver, J. (1998). Adult's eyes trigger shifts of visual attention in human infants. Psychological Science, 9, 131–134.
Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005). Grasping the intentions of others with one's own mirror neuron system. PLoS Biology, 3, 529–535.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
Jacobs, A., Pinto, J., & Shiffrar, M. (2004). Experience, context, and the visual perception of human movement. Journal of Experimental Psychology: Human Perception and Performance, 30, 822–835.
James, W. (1890). The principles of psychology (Vol. 1). New York: Dover.
Jeannerod, M. (1997). The cognitive neuroscience of action. Cambridge, MA: Blackwell.
Jeannerod, M. (2001). Neural simulation of action: A unifying mechanism for motor cognition. NeuroImage, 14, S103–S109.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics, 14, 201–211.
Jordan, M. I. (1996). Computational aspects of motor control and motor learning. In H. Heuer & S. Keele (Eds.), Handbook of perception and action, Vol. 2: Motor skills (pp. 71–120). New York: Academic Press.
Kandel, S., Orliaguet, J. P., & Viviani, P. (2000). Perceptual anticipation in handwriting: The role of implicit motor competence. Perception and Psychophysics, 62, 706–716.
Kawato, M., Furukawa, K., & Suzuki, R. (1987). A hierarchical neural network model for the control and learning of voluntary movements. Biological Cybernetics, 56, 1–17.
Kilner, J. M., Paulignan, Y., & Blakemore, S.-J. (2003). An interference effect of observed biological movement on action. Current Biology, 13, 522–525.
Kiraly, I., Jovanovic, B., Prinz, W., Aschersleben, G., & Gergely, G. (2003).
The early origins of goal attribution in infancy. Consciousness and Cognition, 12, 752–769.
Knoblich, G. (2002). Self-recognition: Body and action. Trends in Cognitive Sciences, 6, 447–449.
Knoblich, G., Elsner, B., Aschersleben, G., & Metzinger, T. (2003). Grounding the self in action. Consciousness and Cognition, 12, 487–494.
Knoblich, G., & Flach, R. (2001). Predicting the effects of actions: Interactions of perception and action. Psychological Science, 12, 467–472.
Koski, L., Iacoboni, M., Dubeau, M.-C., Woods, R. P., & Mazziotta, J. C. (2003). Modulation of cortical activity during different imitative behaviors. Journal of Neurophysiology, 89, 460–471.
Lacerda, F., von Hofsten, C., & Heimann, M. (Eds.). (2001). Emerging cognitive abilities in early infancy. Hillsdale, NJ: Erlbaum.
Leung, E. H., & Rheingold, H. L. (1981). Development of pointing as a social gesture. Developmental Psychology, 17, 215–220.
Lhermitte, F., Pillon, B., & Serdaru, M. (1986). Human autonomy and the frontal lobes. Part I: Imitation and utilization behavior: A neuropsychological study of 75 patients. Annals of Neurology, 19, 326–334.
Longo, M. R., & Bertenthal, B. I. (2006). Common coding of observation and execution of action in 9-month-old infants. Infancy, 10, 43–59.
Longo, M. R., Kosobud, A., & Bertenthal, B. I. (in press). Automatic imitation of biomechanically impossible actions: Effects of priming movements vs. goals. Journal of Experimental Psychology: Human Perception and Performance.
Louis-Dam, A., Orliaguet, J.-P., & Coello, Y. (1999). Perceptual anticipation in grasping movement: When does it become possible? In M. A. Grealy & J. A. Thomson (Eds.), Studies in perception and action V (pp. 135–139). Mahwah, NJ: Erlbaum.
Loula, F., Prasad, S., Harber, K., & Shiffrar, M. (2005). Recognizing people from their movement. Journal of Experimental Psychology: Human Perception and Performance, 31, 210–220.
Luo, Y., & Baillargeon, R. (2005). Can a self-propelled box have a goal? Psychological Science, 16, 601–608.
Maratos, O. (1982). Trends in the development of imitation in early infancy. In T. G. Bever (Ed.), Regressions in mental development: Basic phenomena and theories (pp. 81–102). Hillsdale, NJ: Erlbaum.
Marcovitch, S., & Zelazo, P. D. (1999). The A-not-B error: Results from a logistic meta-analysis. Child Development, 70, 1297–1313.
Marcovitch, S., Zelazo, P. D., & Schmuckler, M. A. (2002). The effect of the number of A trials on performance on the A-not-B task. Infancy, 3, 519–529.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of information.
New York: W. H. Freeman.
McGraw, M. B. (1943). Neuromuscular maturation of the human infant. New York: Columbia University Press.
Meltzoff, A. N., & Moore, M. K. (1977). Imitation of facial and manual gestures by human neonates. Science, 198, 75–78.
Meltzoff, A. N., & Moore, M. K. (1994). Imitation, memory, and the representation of persons. Infant Behavior and Development, 17, 83–99.
Miall, R. C., & Wolpert, D. M. (1996). Forward models for physiological motor control. Neural Networks, 9, 1265–1279.
Munakata, Y. (1997). Perseverative reaching in infancy: The roles of hidden toys and motor history in the AB task. Infant Behavior and Development, 20, 405–415.
Munakata, Y. (1998). Infant perseveration and implications for object permanence theories: A PDP model of the AB task. Developmental Science, 1, 161–184.
Nakagawa, A., Sukigara, M., & Benga, O. (2003). The temporal relationship between reduction of early imitative responses and the development of attention mechanisms. BMC Neuroscience, 4, http://www.biomedcentral.com/1471-2202/4/33.
Orliaguet, J.-P., Kandel, S., & Boe, L. J. (1997). Visual perception of motor anticipation in cursive handwriting: Influence of spatial and movement information on the prediction of forthcoming letters. Perception, 26, 905–912.
Pavlova, M., & Sokolov, M. (2000). Orientation specificity in biological motion perception. Perception and Psychophysics, 62, 889–899.
Pelphrey, K. A., Morris, J. P., & McCarthy, G. (2005). Neural basis of eye gaze processing deficits in autism. Brain, 128, 1038–1048.
Pinto, J. (1997). Developmental changes of infants' perceptions of point-light displays of human gait. Dissertation Abstracts International: Section B: The Sciences and Engineering, 57(8-B), 5365.
Pinto, J., & Bertenthal, B. I. (1992). Effects of phase relations on the perception of biomechanical motions. Investigative Ophthalmology and Visual Science, 33 (Suppl.), 1139.
Pinto, J., & Bertenthal, B. I. (1993). Infants' perception of unity in point-light displays of human gait. Investigative Ophthalmology and Visual Science, 34 (Suppl.), 1084.
Pinto, J., Bertenthal, B. I., & Booth, A. (1996). Developmental changes in infants' responses to biological motion displays. Investigative Ophthalmology and Visual Science, 37 (Suppl.), S63.
Pinto, J., Shrager, J., & Bertenthal, B. I. (1993). Developmental changes in infants' perceptual processing of biomechanical motions. In Proceedings of the Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Posner, M. I. (1980).
Orienting of attention. Quarterly Journal of Experimental Psychology, 32A, 3–25.
Prechtl, H. F. R. (1986). New perspectives in early human development. European Journal of Obstetrics, Gynecology and Reproductive Biology, 21, 347–355.
Premack, D. (1990). The infant's theory of self-propelled objects. Cognition, 36, 1–16.
Proffitt, D. R., Bertenthal, B. I., & Roberts, R. J. (1984). The role of occlusion in reducing multistability in moving point-light displays. Perception and Psychophysics, 35, 315–323.
Repp, B., & Knoblich, G. (2004). Perceiving action identity: How pianists recognize their own performances. Psychological Science, 15, 604–609.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience, 2, 661–670.
Rose, J. L., & Bertenthal, B. I. (1995). A longitudinal study of the visual control of posture in infancy. In R. J. Bootsma & Y. Guiard (Eds.), Studies in perception and action (pp. 251–253). Mahwah, NJ: Erlbaum.
Ruffman, T., & Langman, L. (2002). Infants' reaching in a multi-well A not B task. Infant Behavior and Development, 25, 237–246.
Saxe, R., Xiao, D.-K., Kovacs, G., Perrett, D. I., & Kanwisher, N. (2004). A region of right posterior superior temporal sulcus responds to observed intentional actions. Neuropsychologia, 42, 1435–1446.
Saygin, A. P., Wilson, S. M., Hagler, D. J., Bates, E., & Sereno, M. I. (2004). Point-light biological motion perception activates human premotor cortex. Journal of Neuroscience, 24, 6181–6188.
Scaife, M., & Bruner, J. (1975). The capacity for joint attention in the infant. Nature, 253, 265–266.
Schofield, W. N. (1976). Hand movements which cross the body midline: Findings relating age differences to handedness. Perceptual and Motor Skills, 42, 643–646.
Shiffrar, M., & Pinto, J. (2002). The visual analysis of bodily motion. In W. Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance (Vol. 19, pp. 381–399). Oxford: Oxford University Press.
Smith, L. B., Thelen, E., Titzer, R., & McLin, D. (1999).
Knowing in the context of acting: The task dynamics of the A-not-B error. Psychological Review, 106, 235–260.
Sommerville, J. A., Woodward, A. L., & Needham, A. (2005). Action experience alters 3-month-old infants' perception of others' actions. Cognition, 96, B1–B11.
Stengel, E. (1947). A clinical and psychological study of echo-reactions. Journal of Mental Science, 93, 598–612.
Stevens, J. A., Fonlupt, P., Shiffrar, M., & Decety, J. (2000). New aspects of motion perception: Selective neural encoding of apparent human movements. NeuroReport, 11, 109–115.
Sumi, S. (1984). Upside-down presentation of the Johansson moving light-spot pattern. Perception, 13, 283–286.
Tai, Y. F., Scherfler, C., Brooks, D. J., Sawamoto, N., & Castiello, U. (2004). The human premotor cortex is ‘mirror’ only for biological actions. Current Biology, 14, 117–120.
Thelen, E., Schoner, G., Scheier, C., & Smith, L. B. (2001). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24, 1–86.
Thelen, E., & Ulrich, B. (1991). Hidden skills. Monographs of the Society for Research in Child Development, 56(1, Serial No. 223).
Thelen, E., Ulrich, B. D., & Niles, D. (1987). Bilateral coordination in human infants: Stepping on a split-belt treadmill. Journal of Experimental Psychology: Human Perception and Performance, 13, 405–410.
Tomasello, M. (1999). The cultural origins of human cognition. Cambridge, MA: Harvard University Press.
Tomasello, M., & Call, J. (1997). Primate cognition. Oxford: Oxford University Press.
van Hof, P., van der Kamp, J., & Savelsbergh, G. (2002). The relation of unimanual and bimanual reaching to crossing the midline. Child Development, 73, 1353–1362.
Verfaillie, K. (1993). Orientation-dependent priming effects in the perception of biological motion. Journal of Experimental Psychology: Human Perception and Performance, 19, 992–1013.
Viviani, P., Baud-Bovy, G., & Redolfi, M. (1997). Perceiving and tracking kinesthetic stimuli: Further evidence of motor-perceptual interactions. Journal of Experimental Psychology: Human Perception and Performance, 23, 1232–1252.
Viviani, P., & Mounoud, P. (1990). Perceptual-motor compatibility in pursuit tracking of two-dimensional movements. Journal of Motor Behavior, 22, 407–443.
Viviani, P., & Stucchi, N. (1992). Biological movements look uniform: Evidence of motor-perceptual interactions. Journal of Experimental Psychology: Human Perception and Performance, 18, 603–623.
Vogt, S., Taylor, P., & Hopkins, B. (2003). Visuomotor priming by pictures of hand postures: Perspective matters.
Neuropsychologia, 41, 941–951. von Hofsten, C. (2003). On the development of perception and action. In J. Valsiner & K. J. Connolly (Eds.), Handbook of developmental psychology (pp. 114–171). London: Sage. Wapner, S., & Ci rillo, L . (1968). Imitation of a m odel’s ha nd movements: Age c hanges i n t ransposition o f le ft-right relations. Child De velopment, 39, 887–894. Wolpert, D. M., Doya, K., & Kawato, M. (2003). A unifying computational framework f or m otor c ontrol a nd s ocial i nteraction. Philosophical Transactions of the Royal Society of London B, 358, 593–602. Wolpert, D. M., & Fl anagan, J. R . (2001). Motor prediction. Current Biology, 11, R729–732.
368
Embodiment, Ego-Space, and Action
Woodward, A . L . (1998). I nfants s electively encode t he goal object of a n actor’s reach. Cognition, 69, 1–34. Woodward, A. L. (1999). Infants’ ability to distinguish between purposeful and nonpurposeful behaviors. Infant Behavior and Development, 22, 145–160. Woodward, A . L ., & Gu ajardo, J. J. ( 2002). I nfants’ u nderstanding of t he point gesture as an object-directed action. Cognitive Development, 17, 1061–1084. Woodward, A. L., & S ommerville, J. A . (2000). Twelve-month-old infants interpret action in context. Psychological Science, 11, 73–77. Zelazo, P. D., Reznick, J. S ., & Spinazzola, J. (1998). Representational flexibility and response control in a multistep multilocation search task. Developmental Psychology, 34, 203–214.
11 How Mental Models Encode Embodied Linguistic Perspectives
Brian MacWhinney
Humans demonstrate a remarkable ability to take other people's perspectives. When we watch movies, we find ourselves identifying with the actors, sensing their joys, hopes, fears, and sorrows. As viewers, we can be moved to exhilaration as we watch our heroes overcome obstacles; or we can be moved to tears when they suffer losses and defeats. This process of identification does not always have to be linked to intense emotional involvement. At a soccer match, we can follow the movements of a player moving in to shoot for a goal. We can identify with the player's position, stance, and maneuvers against the challenges offered by the defenders. We can track the actions, as the player drives toward the goal and kicks the ball into the net. This ability to take the perspective of another person is very general. Just as we follow the movements of dancers, actors, and athletes, we can also follow the thoughts and emotions expressed by others in language. In this paper, we will explore the ways in which language builds upon our basic system for projecting the body image to support a rich system of perspective tracking and mental model construction.
Projection

It is useful to think of projection as relying on the interaction of four systems, each of which is fundamental to psychological functioning. These four systems involve body image, localization, empathy, and perspective tracking.
Body Image Matching

In order to assume the perspective of another actor, one must first be able to construct a full image for one's own body. This image expresses not only the positions of our organs and limbs, but also their movements and configurations. For most of us, the notion of a body schema is something natural and inescapable. However, there are various neural disorders and injuries (Ramachandran, 2000; Ramachandran & Hubbard, 2001) that can lead to the disruption of the body image. The construction of the body image is based on processing in a wide variety of sites including the cerebellum (Middleton & Strick, 1998), medial prefrontal cortex (Macrae, Heatherton, & Kelley, 2004), primary motor cortex (Kakei, Hoffman, & Strick, 1999), and insula (Adolphs, 2003).

In order to achieve perspective taking, we need a system that can project our body image to other agents. The first step in this process must involve body part mapping. We have to identify specific organs or body parts on others that we then map onto parallel parts in our own body image. Meltzoff (1995) and others have traced the roots of body part matching back to infancy, when the baby can already demonstrate matching by imitating actions such as tongue protrusion or head motions. We have learned that, when a subject in an fMRI experiment tracks the movements of a particular limb or organ, there is corresponding activation of the same neural pathways the subject would use to produce those actions. For example, when we imagine performing bicep curls, there are discharges to the biceps (Jeannerod, 1997). When a trained marksman imagines shooting a gun, the discharges to the muscles mimic those found in real target practice. When we imagine eating, there is an increase in salivation. Neuroimaging studies by Parsons et al. (1995), Martin, Wiggs, Ungerleider, and Haxby (1996), and Cohen et al. (1996) have shown that, when subjects are asked to engage in mental imagery, they use modality-specific sensorimotor cortical systems. For example, in the study by Martin et al., the naming of tool words specifically activated the areas of the left premotor cortex that control hand movements. Similarly, when we observe pain in a particular finger in another person, transcranial magnetic stimulation shows activation of the precise motor area that controls the muscle for that same finger (Avenanti, Bueti, Galati, & Aglioti, 2005). The general conclusion from this research is that, once we have achieved body part matching, we can perceive the actions of others by projecting them onto the mechanisms we use for creating and perceiving these actions ourselves. In effect, we run a cognitive simulation (Bailey, Chang, Feldman, & Narayanan, 1998) and we match the products of this cognitive simulation to our perception of the actions of others. This ability to compute a cognitive simulation is at the heart of both projection and imagery.

From recent single-cell work with monkeys, we know that perceptual-motor matching is a very general aspect of the primate brain. Apart from the mirror neurons (Rizzolatti, Fadiga, Gallese, & Fogassi, 1996) that were originally identified in the monkey counterpart to Broca's area, there are also perceptual-motor matching systems in other areas of both frontal and parietal cortex. The facility with which we are able to project perspective onto others suggests that the body matching system is probably quite extensive, but the exact extent of this system is not yet known.

Localization

In order to assume the perspective of another actor, one must also position that actor correctly in mental space. To do this we rely on the same system that allows us to locate our own bodies in space. This system must be robust against eye movements and turns of the head (Ballard, Hayhoe, Pook, & Rao, 1997); it should allow us to maintain facing and update our position vis-à-vis landmarks and other spatial configurations (Klatzky & Wu; Proffitt, this volume). Once we have succeeded in projecting our body image to the other, we can next begin to position this projected image in the distal physical location (Loomis & Philbeck, this volume). One difficult part of this projection is the requirement that we rotate our body image 180º if the other person is facing us.
Empathy

In order to assume the emotional and social perspective of another, we must first be able to have access to our own emotions. We can then project or transfer these feelings to others. Of course, there is no doubt that humans have a well-organized system of emotions that maps into physiological expression through the face, body, and autonomic nervous system. Once we have achieved projection of body image and spatial location, we can begin to identify with the emotions of the other person. Studies of patients with lesions (Adolphs & Spezio, 2006) have pointed to a role for the amygdala in the processing of emotional expression. Studies with normals using fMRI (Meltzoff & Decety, 2003) have shown activation in the amygdala and temporal areas (Pelphrey et al., 2003; Pelphrey, Morris, & McCarthy, 2005; Pelphrey, Viola, & McCarthy, 2004) for the processing of social expressions and actions.
Perspective Tracking

To obtain smooth projection, we must be able to constantly update, shift, and track the other actor on each of these dimensions (body image, spatial location, empathy). To do this, we have to treat the other body as obeying the same kinematic principles that govern our own body (Knoblich, this volume). At the same time, we must be able to distinguish the projected or imagined image from our own body image. This means that the tracking system must allow us to treat the projected image as "fictive" or "simulated" and our own body image as real. Maintaining this type of clear separation between simulation and reality would seem to require special neuronal adaptations. These adaptations include mirror neurons as one component, but they also require additional support from frontal areas for attention and regulation. This additional support is required to allow us to shift perspective between multiple external agents. For this system to function smoothly, we must store a coherent fictive representation of the other person. In order to switch smoothly between perspectives, it is crucial that the projected agent be easily accessed and constructed as a clearly separate fictive representation.

Researchers such as Donald (1991) and Givón (2002) have proposed that, two million years ago, Homo erectus relied on a system of gestural and mimetic communication to further social goals. Adaptations that supported this tracking of visual perspectives would have formed an evolutionary substrate for the later elaboration of perspective tracking in language (MacWhinney, 2005). There is still a close temporal and conceptual linkage between gesture and language (McNeill, 1992), suggesting that both systems continue to rely on a core underlying set of abilities to track perspective.

Subjective evidence from drama, movies, and literature suggests that identification is a dynamic and integrated process. This process produces a flowing match between our own motoric and emotional functions and the motoric and emotional functions of others. However, neurological evidence regarding the dynamic functioning of this system is still missing. We know that there is neuronal support for each of the four projection systems. However, we cannot yet follow perspective tracking in real time. In large part, this gap in our knowledge is a function of the limitations of measurements taken through methods such as fMRI and ERP.
The Perspective Hypothesis

Fortunately, there is evidence of a very different type that can help us understand the nature of ongoing projection and perspective taking. This is the evidence that is encoded in the structure of grammatical systems. Close analysis shows how grammar reflects the cognitive operations underlying perspective tracking. I refer to this general view of mental projection as the perspective hypothesis. We can decompose the overall perspective hypothesis into six more specific assumptions or claims:

1. Language functions by promoting the sharing of mental models between the speaker and the listener.
2. To build up complex mental models, referents and actions must be connected dynamically through temporal, spatial, and causal linkages.
3. The links in mental models are structured by perspective tracking.
4. Perspective tracking is supported by specific neuronal projection systems that keep these cognitive simulations separate from direct perception and action.
5. The primary function of grammatical devices is to mark perspective tracking, as it emerges in conversation and narrative.
6. Perspective tracking, as realized in language, facilitates the linking of mental systems and the cultural transmission of linked structures.
For psychologists and cognitive linguists, the first two assumptions are largely uncontroversial. Since the beginning of the Cognitive Revolution, researchers have assumed that comprehension involves the construction of linkages in complex mental models. In regard to these first two claims, the perspective hypothesis is simply building on traditional, well-supported assumptions.

The traditional form of assumption 3 is that discourse linkages involve coreference between nodes in semantic structure. For example, we could have a discourse composed of two simple propositions in a sentence such as: the boy kicked the ball, and the ball rolled into the gutter. In the mental model or semantic memory constructed from this sentence, the ball of the first clause is linked through coreference to the ball of the second clause. In classic models of sentence interpretation (Budiu & Anderson, 2004; Kintsch & Van Dijk, 1978), the main mechanism for discourse linkage was coreference. Even models that decomposed semantic structure (Miller & Johnson-Laird, 1976) still maintained a reliance on coreference as the method for linking propositions in mental models.

The perspective hypothesis introduces a fundamental shift in our understanding of the construction of mental models. In this new view, which is encoded as assumption 3, propositions are embodied representations constructed from a specific perspective, which is initially the perspective of the sentential subject. We can refer to the combined interaction of perspective taking and perspective shifting as perspective tracking. Although coreference plays a secondary role in licensing linkage between propositions, mental models use perspective tracking as their primary integrating mechanism.
In the case of our simple example sentence, this means that the perspective of the boy in the first sentence is shifted to that of the ball in the second clause and we then track the motion of the ball as it rolls into the gutter. In other words, we construct mental models by taking and shifting perspectives. Within this larger process of perspective tracking, deixis, or verbal pointing, plays a role of bringing new referents to our attention or locating referents in either working memory or long-term memory. However, the structuring of propositions into mental models depends primarily on taking the perspective of the entities discussed in a discourse and tracking this flow of perspective between the various discourse participants.

Assumption 4 is grounded on the growing evidence from cognitive neuroscience regarding specific neuronal systems that manage the projection of body image, spatial location, and emotion. Without making this assumption, it would be very difficult to imagine how one could believe that language processing relies intimately and continually on neuronal support for perspective tracking.

Assumption 5 is the centerpiece of the perspective hypothesis. When this assumption is linked to assumption 3, it takes on a particularly strong form. The combination of these two assumptions represents a novel position in cognitive science. Researchers who emphasize the formal determination of linguistic structure (Chomsky, 1975) have repeatedly rejected links between grammatical structure and "pragmatic" factors such as perspective taking. Although cognitive linguistics provides a role for perspective in the theory of subjectivisation (Stein & Wright, 1995), there is little acceptance of assumption 3. Despite the importance of embodied cognition to cognitive linguistics, there are no current models in this tradition that rely on perspective tracking as the major force integrating mental models.

Within experimental psychology, recent work has focused on demonstrating the embodied nature of mental models. For example, Stanfield and Zwaan (2001) found that, when given sentences such as John pounded the nail into the floor, subjects are faster to name pictures of nails pointing downward than nails pointing sideways. This indicates that they construct interpretations with a nail pointing downward.
Results of this type, summarized in the chapters in Pecher and Zwaan (2005), provide clear and important evidence for the embodied nature of mental models, but they tell us little about perspective tracking. To examine the course of perspective tracking experimentally, we will need to use online measures of interpretation, such as cross-modal naming, which will allow us to probe ongoing changes in mental models as they are being constructed.

Assumption 6 focuses on the consequences of perspectival mental models for promoting conceptual integration and cultural transmission. Perspectival mental models provide a general rubric for knitting together all of cognition. Consider a sentence such as Last night, my sister's friend reminded me I had dropped my keys under the table behind the garage. Here, we see how a single utterance integrates
information about time (last night), social relations (sister's friend), mental acts (remind), space (under, behind), objects (keys, table, garage), and events (drop). The sentence produces an integrated tracking across the perspectives of the sister, the friend, the speaker, and the various locations. Although this information may be initially activated in different regions of the brain, language achieves an integration of this information across all of these domains.

The primary focus of this paper is on assumption 5. We will conduct this exploration in three parts. We will begin with a psycholinguistic account of how perspectives are shifted through sentence structures. Second, we will conduct a linguistic examination of a wide range of grammatical constructions to understand how they mark perspective tracking. Finally, we will consider the consequences of this analysis for theories of cognition and development in accord with assumption 6.

Perspective Tracking

Modern psycholinguistic research has tended to focus on the process of sentence comprehension or interpretation, rather than sentence production. It is much easier to achieve experimental control over comprehension than over production. Because our models of sentence interpretation are more detailed, it is easiest to explain the function of perspective tracking first for comprehension and to extend this account later to perspective marking in production. On the comprehension side, there are five important principles that have a direct bearing on perspective tracking.

1. Incremental Interpretation. The first principle, which has been widely supported in recent research, is the principle of incrementalism. According to this principle, the listener attempts to go from words to mental models as soon as material is unambiguously recognized. In processing terms, incrementalism is equivalent to the notion of cascading (McClelland, 1979) in which processes feed into each other as soon as they reach a point where data can be passed.
2. Load Reduction. In some cases, processing may involve placing phrases into working memory without yet committing to their grammatical role or without attaching them to other phrases. However, if the listener can attach material to a fully interpreted mental model, then this will reduce processing load.
3. Starting Points. This principle of load reduction through attachment (Gibson, 1998) interacts with a third principle that governs the centrality of the starting point (MacWhinney, 1977) or the Advantage of First Mention (Gernsbacher, 1990). When we begin a sentence, we use the first nominal phrase as the basis for structure building (Gernsbacher, 1990). As we move through the sentence from left to right incrementally, we add to this structure through attachments. When a phrase or word cannot be attached, it increases the load. So, we are motivated to attach phrases as soon as possible to reduce these costs (O'Grady, 2005).
4. Role Slots. The process of phrasal attachment is driven by role slot assignments. In different models, this principle has very different names, varying from thematic role assignment to theta-binding. The basic idea (MacWhinney, 1987a) is that predicates (verbs, adjectives, prepositions) expect to attach to nominal arguments that fill various thematic or case roles, such as agent, object, or recipient. For languages that rely on SVO, SOV, or VSO orders, the first noun is placed tentatively into the role of the perspective. This perspective then actively searches for a verb that will allow it to assume a dynamic perspective. For example, in the sentence the runner fell, we begin with the perspective of the runner as the starting point. We then move on incrementally to the verb fall. At that point, the linkage of runner to the role slot for a perspective for fell allows us to build a mental model in which the runner engages in the action of falling.
5. Competition and Cues. Role slot filling is a competitive process (MacWhinney, 1987a). In comprehension, several nominal phrases or "slot fillers" may compete for assignment to a given slot and role. Only one of the fillers can win this competition, and the losers must then compete for other slots. The outcome of the competition is determined by the presence of grammatical, lexical, and semantic cues that favor one or the other competitor. The process of cue summation obeys basic Bayesian principles.
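The competitive filling of a role slot can be sketched in a few lines of code. This is only an illustrative toy: the cue names and validity weights below are invented for the example, since the text commits only to the claim that cue combination is broadly Bayesian, not to any particular numbers.

```python
# Toy sketch of role-slot competition via cue summation. Each candidate
# filler carries a set of cues; support is the product of odds-like
# validity terms, so consistent cues raise support and conflicting ones
# lower it. The weights are hypothetical.

def slot_support(cues, validities):
    """Combine the validities of the cues that a candidate filler carries."""
    support = 1.0
    for cue in cues:
        support *= validities[cue]
    return support

def winner(candidates, validities):
    """Return the candidate with the most cue support for the slot."""
    return max(candidates, key=lambda name: slot_support(candidates[name], validities))

# Hypothetical validities for the English agent/perspective slot.
validities = {"preverbal": 4.0, "animate": 2.0, "agreement": 3.0,
              "postverbal": 0.5, "inanimate": 0.8}

# In "the runner passed the rock", both nominals compete for the agent slot.
candidates = {"runner": ["preverbal", "animate", "agreement"],
              "rock": ["postverbal", "inanimate"]}

print(winner(candidates, validities))  # runner wins the competition
```

Under this scheme, the losing filler would then compete for the remaining slots, as the fifth principle requires.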
At this point, it would perhaps be useful to think about how these five principles lay the groundwork for the Perspective Hypothesis. In a sense, all of these principles are integrated by the basic need during comprehension to construct a mental model. The principles of incremental interpretation and load reduction are direct responses to the fact that there is a cost associated with maintaining unattached verbal chunks in short-term memory. In order to reduce this cost, we adopt the initial hypothesis that the first nominal is the perspective. Because the construction of mental models is perspectival (assumption 3 above), we are able to take this starting point as the foundation for a larger mental model. The status of the starting point as the perspective is further supported if it can compete successfully for the perspective slot on the verb. In some marked constructions and word orders, it may lose out in this competition. But to understand how this happens, we need to look more closely at the dynamics of perspective tracking through grammatical constructions. Some of the forces that work to either maintain or shift this initial perspective include:

1. Shift. If the verb takes multiple thematic roles, then it may shift perspective from the starting point to secondary perspectives. This can occur for transitives, passives, clefts, and a variety of other constructions. In most cases, this shift is not complete, and the perspective of the starting point is at least partially maintained.
2. Modification. Individual referents may be further modified or elaborated by attached phrases and clauses. This occurs in relativization, complex NP formation, appositives, and other constructions.
3. Maintenance. Perspective can be maintained across clauses by devices (or cues) such as anaphoric pronouns, gerunds, resumptives, and conjunctions.
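The interplay of starting points, perspective shifts, and predication can be made concrete with a minimal data-structure sketch, here applied to the earlier example the boy kicked the ball, and the ball rolled into the gutter. The representation is invented purely for illustration; nothing in the chapter commits to this particular encoding.

```python
# Minimal sketch of perspective tracking during mental-model construction.
# The model records referents, their predications, the current perspective,
# and a trace of perspective shifts. A hypothetical, illustrative encoding.

class MentalModel:
    def __init__(self):
        self.referents = {}        # referent -> list of predications
        self.perspective = None    # current vantage point
        self.history = []          # trace of perspective shifts

    def take_perspective(self, referent):
        """Starting points and subsequent perspective shifts both land here."""
        self.referents.setdefault(referent, [])
        self.perspective = referent
        self.history.append(referent)

    def predicate(self, action, patient=None):
        """Attach an action to whichever referent currently holds perspective."""
        self.referents[self.perspective].append((action, patient))
        if patient is not None:
            self.referents.setdefault(patient, [])

model = MentalModel()
model.take_perspective("boy")           # starting point of clause 1
model.predicate("kick", "ball")         # the boy kicked the ball
model.take_perspective("ball")          # shift, licensed by coreference
model.predicate("roll-into", "gutter")  # track the ball into the gutter

print(model.history)  # ['boy', 'ball']
```

The trace makes explicit the claim of assumption 3: the two clauses are integrated not by coreference alone, but by the shift of perspective from the boy to the ball.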
Structure vs. Function

Having now listed the general assumptions of the perspective hypothesis, we are in a position to explore the application of this hypothesis to a wide variety of syntactic constructions. However, before beginning that exploration, we need to consider a competing approach that accounts for some, but not all, of the phenomena to be discussed. This approach, developed within generative linguistics over the last 50 years (Chomsky, 1957), attempts to account for syntactic patterns in terms of structural relationships. Of the various structural relationships explored during this half century, perhaps the most prominent is the relation called c-command. We will therefore begin our exploration by comparing two different accounts of coreference—one based on the constraints imposed by c-command and another based on the constraints imposed by perspective tracking. This analysis is not intended to reject a possible role for c-command in linguistic description. Rather, the goal here is to illustrate the impact of perspective tracking on grammatical constructions.
Coreference

We will begin our explorations with a consideration of a few selected aspects of the grammar of coreference. Consider sentences 1 and 2. Coreference between he and Bill is blocked in 1, but possible in 2.

1. *He_i says Bill_i came early.
2. Bill_i says he_i came early.
The coreferential reading of these sentences is marked by the presence of subscripts on the nominals. Without these subscripts and the coreference they require, the pronoun he in 1 can refer to someone mentioned outside of the sentence, such as Tom. What is specifically blocked in 1 is coreference between he and Bill as indicated by their subscripts.

The perspective hypothesis accounts for the ungrammaticality of 1 by invoking the principle of referential commitment. When we hear he in 1, we need to take it as the starting point for the construction of a mental model for the clause. This means that we must make a referential commitment. We cannot wait until later to identify he. In 2, on the other hand, Bill is already available for coreference and so it is easy to link he to Bill.

The theory of government and binding (Chomsky, 1982; Grodzinsky & Reinhart, 1993; Reinhart, 1981) seeks to explain this phenomenon (and many others) in terms of structural relations in a phrase-marker tree. The backbone of this account is a relation known as c-command. Each element in the tree is said to "c-command" its siblings and their descendants. Principle C of the binding theory requires that lexical NPs such as Bill or the man be "free"; that is, not coindexed with a c-commanding element. This principle excludes a coreferential reading for 1, in which Bill is coindexed with the c-commanding pronoun, but not for 2, in which the intended antecedent c-commands the pronoun rather than vice versa. Grammatical subjects are always in a syntactic position that allows them to c-command other elements in the same clause. This means that there is a very close parallel between the patterns expressed in c-command and the principle of referential commitment from the perspective hypothesis.
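Since c-command is a purely structural relation, it can be stated as a short tree-walking procedure. The sketch below uses deliberately simplified trees for sentences 1 and 2 (real analyses would have richer node labels); it is an illustration of the definition, not a claim about any particular grammar formalism.

```python
# Sketch of c-command and Principle C over toy phrase-markers, encoded as
# nested tuples (label, child, child, ...). The trees are simplified
# stand-ins for the syntactic analyses of sentences 1 and 2.

def nodes(tree):
    """Yield every subtree, including the root."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def c_commands(tree, a, b):
    """True if some node labeled `a` c-commands a node labeled `b`,
    i.e. `b` occurs inside a sibling of `a`."""
    for node in nodes(tree):
        children = list(node[1:])
        for child in children:
            if child[0] == a:
                for sib in children:
                    if sib is not child and any(n[0] == b for n in nodes(sib)):
                        return True
    return False

def principle_c_ok(tree, pronoun, name):
    """Principle C: a lexical NP must not be c-commanded by a coindexed pronoun."""
    return not c_commands(tree, pronoun, name)

# 1. *He_i says Bill_i came early -> pronoun c-commands the name: blocked
s1 = ("S", ("he",), ("VP", ("says",), ("S", ("Bill",), ("VP", ("came early",)))))
# 2. Bill_i says he_i came early -> name c-commands the pronoun: allowed
s2 = ("S", ("Bill",), ("VP", ("says",), ("S", ("he",), ("VP", ("came early",)))))

print(principle_c_ok(s1, "he", "Bill"), principle_c_ok(s2, "he", "Bill"))
# False True
```

The same procedure can be run on the later example trees to check which coindexings Principle C permits, which is useful when comparing its predictions with those of referential commitment.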
Because initial nominals serve as the bases for structure building and perspective propagation, it is initially difficult to distinguish the structural account from the functionalist account. However, if we look closely, we will see that there are a variety of phenomena that can be understood better in terms of mental model construction than in terms of c-command.

Noncentrality

One interesting contrast is the relative increase in acceptability of coreference that occurs as one moves from central to peripheral arguments. C-command blocks coreference in 3 and 4. This prediction of c-command is correct for 3. However, for many speakers, 4 is possible, although c-command disallows it.

3. *He_i said Bill_i was crazy.
4. *John told him_i that Bill_i was crazy.
5. John said to him_i that Bill_i was crazy.
6. John told his_i mother that Bill_i was crazy.
Although c-command makes the wrong predictions for 4, it correctly allows for coreference in 5 and 6, since the pronouns here do not c-command Bill. The perspective hypothesis views this contrast in a very different way. As a further corollary of assumption 3 and its principle of referential commitment, we have the following:

Principle of Noncentrality: If an element is not central to the building of a mental model, then it need not be referentially committed and is therefore open for backward anaphora. The less centrally involved the element, the more it is open for backward anaphora.
The most central argument is the subject (he in 3), followed by the object (him in 4), the oblique or prepositional object (him in 5), and finally a possessor in a complex NP (his in 6). As these roles become less and less central to the process of structure building, they become more and more open to backward anaphora.

Delaying Commitment

C-command provides no account for the grammaticality of sentences such as 7–9. This is because pronouns in subordinate clauses are too low in the structure to have NPs in other clauses as siblings.

7. After Lester_i drank the third vodka, he_i dropped his cup.
8. After he_i drank the third vodka, Lester_i dropped his cup.
9. *He_i drank the third vodka, and Lester_i dropped his cup.
Although both 8 and 9 have a pronoun that precedes the target noun Lester, the coreferential reading of 8 is easier to get than the coreferential reading of 9. Although c-command provides no account for this contrast, it can be explained by reference to the perspective hypothesis. To do this, we can rely on the following corollary derived from assumption 5.
In t he c ase of 8 , t he c rucial g rammatical c ue i s t he subordinating conjunction after, which signals the beginning of a background clause. I n t he co nstruction o f a m ental m odel, back ground ma terial is placed “on hold” for later attachment to foreground material. However, the storage of the backgrounded initial clause of 8 does not incur a la rge p rocessing cost , s ince i ts co mponent p ieces a re f ully structured. Moreover, as long as it remains in the background, the pronoun can be i nvolved in backward coreference. Because there is no cue i n 9 t o protect t he pronoun, it must be co mmitted referentially and backward anaphora is blocked. As a further illustration of the effect of grammatical cues in the delaying of commitment, consider the contrast between 10 and 11. 10 . *Shei jumped inside the car, when Debra i saw a large man lurking in the shadows. 1 1. Shei w as j umping i nside t he c ar, w hen Deb ra i s aw a l arge ma n lurking in the shadows.
Here the presence of progressive aspect in 11 places the information in the main clause "on hold" in the mental model, because this information is being judged as relevant to the interpretation of the subsequent clause. In a series of experiments, Harris and Bates (2002) show that progressive marking leads to greater acceptance of backward anaphora. Thus, the progressive functions like the subordinating conjunction as a cue to the delaying of commitment. A similar effect is illustrated by 12 and 13 from Reinhart (1983).

12. In Carter's_i hometown, he_i is still considered a genius.
13. In Carter's_i hometown, he_i is considered a genius.
Here, it is easier to get a coreferential reading for 12 than for 13. This is because still serves as a cue that forces perspective promotion in the preposed prepositional phrase. The opposite side of this coin involves the way in which indefinite reference blocks forward anaphora (i.e., the establishment of coreference between a full nominal and a following pronoun). It is somewhat easier to achieve forward anaphora in 14 than in 15.

14. While Ruth argued with the mani, hei cooked dinner.
15. *While Ruth argued with a mani, hei cooked dinner.
In the case of 14, once we shift perspec tive f rom Ruth to the man, we now have a definite person in mind and it makes good sense to continue that perspective with he. In 15, on the other hand, we have no clear commitment to the identity of a man and using this unclear referent as the binder of he seems strange.
Reflexivization

The c-command relation is also used to account for patterns of grammaticality in the use of reflexive pronouns, such as herself or myself. The most common use of these pronouns is to mark coreference to a "clausemate," which is often the subject of the current clause, as in 16 and 17.

16. *Maryi pinched heri.
17. Maryi pinched herselfi.
The perspective of the reflexive is a rather remarkable one, since it forces the actor to look back on herself as both the cause of the action and the recipient of the action at the same time. When both referents are central arguments, reflexivization is mandatory. Sentences like 16 are impossible if the two nominals are coreferential. However, if one of the nominals is central to the process of sentence building, and if the other material in the sentence serves to shift perspective away from the starting point, then a clausemate coreferent can use a nonreflexive pronoun. Sentences 18 and 19 illustrate this.
How Mental Models Encode Embodied Linguistic Perspectives
383
18. Phili hid the book behind himi/himselfi.
19. Phili ignored the oil on himi/himselfi*.
In 18, nonreflexive coreference and reflexive coreference are both possible. In 19 only anaphoric coreference is possible. This is because the act of hiding tends to maintain the causal perspective of "Phil" more than the act of ignoring. When Phil hides the book, it is still very much in his mind and so its position vis-à-vis his body still triggers self-reference. However, when Phil ignores the oil, it is no longer in his mind. At this point, the observation of the oil is dependent on an outside viewer and no longer subject to reflexivity. Nouns such as story or picture can also trigger perspective shifting within clauses. In 20 and 21, reflexives are required if there is coreference, because there is no intervening material that shifts perspective away from either John or Mary. In 22, however, the nonreflexive is possible, since the story sets up a new perspective from which John is viewed as an actor in the story and not a listener to the story. In 23, the action of telling involves Max so deeply in the story itself that the full perspective shift is impossible and the nonreflexive cannot be used.

20. John talked to Maryi about *heri/herselfi.
21. Johni talked to Mary about *himi/himselfi.
22. Johni heard a story about himi/himselfi.
23. Maxi told a story about *himi/himselfi.
The presence of intervening perspectives facilitates the use of short distance pronouns that would otherwise be blocked by reflexives. Consider these examples:

24. Johni saw a snake near himi/himselfi.
25. Jessiei stole a photo of heri/herselfi out of the archives.
The material that detracts from the reflexive perspective may also follow the pronoun, as in these examples from Tenny and Speas (2002).

26. Johni signaled behind himi/himselfi to the pedestrians.
27. Billi pointed next to himi/himselfi at the mildew on the roses.
28. Luciei talked about the operation on heri/herselfi that Dr. Edward performed.
In these sentences, use of the nonreflexive prepares the listener for a shift of perspective following the pronoun. Without that additional material, the nonreflexive would be strange. Finally, perspective shift can also be induced by evaluative adjectives such as beloved or silly, as in these examples from Tenny and Speas (2002):

29. Jessiei stole a photo of *heri/herselfi out of the archives.
30. Jessiei stole a silly photo of heri/herselfi out of the archives.
In all of these cases, creation of an additional perspective can serve to shift attention away from the core reflexive relation, licensing use of a nonreflexive pronoun.

Ambiguity

Syntactic ambiguities and garden paths are typically described in terms of the construction of alternative structural trees. However, we can also view ambiguities as arising from the competition (MacDonald, Pearlmutter, & Seidenberg, 1994; MacWhinney, 1987b) between alternative perspectives. Moreover, if we look closely at the processing of these ambiguities, there is evidence for perspective tracking effects that go beyond simple structural competition. Consider the examples in sentences 31 to 34.

31. Visiting relatives can be a nuisance.
32. Crying babies can be a nuisance.
33. Teasing babies is unfair.
34. If they arrive in the middle of a workday, visiting relatives can be a nuisance.
Looking at each of these ambiguities in terms of the principle of incremental interpretation, we can see how alternative perspectives organize alternative interpretations. In each case, there is a competition between the overtly expressed noun following the participle and an unexpressed subject. In each case, the participle is looking to fill the subject/perspective role. In example 31, it is plausible that relatives could be visiting. With relatives filling the role of the perspective, the interpretation is that "if relatives visit you, they can be a nuisance." At the same time, we are also able to imagine that some unspecified person serves as the omitted perspective of visit. In this case, the relatives fill the role of the object, yielding the interpretation that "it can be a nuisance to pay a visit to one's relatives." In example 32, on the other hand, the verb is intransitive. If we were to associate the perspective with an unexpressed subject, then we would have no role for babies. So, here, only one interpretation is possible, and it involves the babies as the initial perspective. That initial perspective is eventually shifted at the word nuisance, since we have to take the perspective of the person being annoyed to understand how the babies become a nuisance. In 33, the babies are unlikely to be doing the teasing and they serve as good objects, so the unexpressed subject wins the role of perspective. In 34, the foregrounding of they in the first clause prepares the way for perspective continuation to the second clause, which promotes relatives as the subject of visiting. Although we can still find an ambiguity in 34, we are less likely to notice it than in 31. As we trace through these various competitions, we see that the demands for incremental construction of a perspectival mental model work to shape the extent to which we can maintain alternative ambiguous perspectives. Perspectival ambiguity can also arise from the competition between alternative phrasal attachments. In example 35, the initial perspective resides with Brendan. Although the verb fly would prefer to have a preverbal noun serve as its perspective, the implausibility of seeing a canyon fly through the air tends to force us away from this syntactically preferred option.

35. Brendan saw the Grand Canyon flying to New York.
36. Brendan saw the dogs running to the beach.
37. The women discussed the dogs on the beach.
38. The women discussed the dogs chasing the cats.
However, the shift to the perspective of the dogs is easier in 36, although again we can maintain the perspective of Brendan if we wish. In cases of prepositional phrase attachment competitions, such as 37, we can maintain the perspective of the starting point or shift to the direct object. If we identify with the women, then we have to use the beach as the location of their discussion. If we shift perspective to the dogs, then we can imagine the women looking out their kitchen window and talking about the dogs as they run around on the beach. In 38, on the other hand, we have a harder time imagining that the women, instead of the dogs, are chasing the cats. Sentences such as
37 and 38 have motivated a variety of formal accounts of sentence processing within the framework of the Garden Path account (Frazier, 1987). For these sentences, the perspective hypothesis provides an account that focuses on conceptual competitions, rather than recovery from nonconceptual parsing decisions. It is possible to shift perspective abruptly between clauses by treating the verb of the first clause as intransitive and the following noun as a new subject. A shift of this type is most likely to occur with a verb like jog that is biased toward an intransitive reading, although it can also function as a transitive. Examples 39 to 41 illustrate this effect:

39. Although John frequently jogs, a mile is a long distance for him.
40. Although John frequently jogs a mile, the marathon is too much for him.
41. Although John frequently smokes, a mile is a short distance for him.
Detailed self-paced reading and eye-movement studies of sentences like 39, with the comma removed, show that subjects often slow down just after reading a mile. This slowdown has been taken as evidence for the garden-path theory of sentence processing (Mitchell, 1994). However, it can also be interpreted as reflecting what happens during the time spent in shifting to a new perspective when the cues preparing the processor for the shift are weak. Examples of this type show that perspectival shifting is an integral part of online, incremental sentence processing (Marslen-Wilson & Tyler, 1980). This description of the processing of these ambiguities has relied on the six assumptions stated earlier. In particular, these perspective shifts are triggered by alternative activations of role fillers and phrasal attachments, as specified in greater detail in descriptive accounts such as MacWhinney (1987a) or O'Grady (2005) or in computationally explicit models such as Kempen and Hoenkamp (1987) or Hausser (1999). These competitions play out as we add incrementally to the ongoing perspectival mental models we are creating. Syntactic processes play out their role through the competitive operation of role filling and attachment, but the actual shift of perspective occurs within the mental model that is being constructed in a simulated perspectival space.
Scope

Ambiguities in quantifier scope provide a very different way of understanding the operation of perspective tracking. To understand what is at issue here, consider this example:

42. Every boy climbed a tree.
For sentences that begin with quantified nominals like every boy, the construction of an initial perspective is more complex than in the case of sentences that begin with a simple nominal like my dog or Bill. For simple nominals, we only have to create a single unified imagined agent in perspectival space. For quantified nominals, we have to create a perspective that allows for multiple agents. Moreover, right from the beginning, we have to take into account the nature of the quantifier. We can think of quantified perspectives in terms of Venn diagrams (Johnson-Laird, 1983). For the phrase every boy, we set up a Venn diagram that includes several nodes characterized as boys. We do not need to actually count these nodes in our imagination. Instead, we use an automated procedure for perspective activation that sets up enough imagined nodes to satisfy us that there are several boys. We then link every to climb by duplicating the acts of climbing across the multiple perspectives. We do not have to actually make each boy engage in climbing in our mind. Instead, we can rely on an automated procedure that makes one boy climb and then assumes that the others will "do the same." The ambiguity in sentence 42 arises when we come to a tree. At this point, we have the option of imagining climbing either a single tree or multiple trees (O'Grady, 2006). We can think of this ambiguity as involving a single unified multiple perspective with many boys and one tree versus a divided multiple perspective with many boys and many trees. If we imagine that the initial perspective constitutes a unified group, we are relatively less likely to imagine multiple trees. In fact, the nature of the verb determines the extent to which we keep a unified or divided multiple perspective. Consider these examples from O'Grady (2006):

43. Everyone gathered at a restaurant.
44. Everyone surrounded a dog.
Here, the activity of gathering or surrounding requires the individuals in the initial multiple perspective to act in concert. When they act this way, the components of the multiple perspective are more likely to focus on a single object, rather than multiple objects. Thus, the shift to a unified single representation for the object is determined by the embodied representation of the subject's perspective as it combines with the activity of the verb. Scope ambiguities also display interesting interactions with grammatical constructions that shift the order of sentence elements. In example 45, we can imagine either that the students are all reading the same books or that each student is reading a different set of three books. This contrast is much like the contrast in 42 between a multiple perspective that remains unified when processing the object and a multiple perspective that divides when processing the object.

45. Two students read three books.
46. Three books are read by two students.
In 46, on the other hand, it is difficult to imagine more than one set of three books. As a result, 46 does not permit the ambiguity that we find in 45. This effect illustrates the operation of assumption 3 regarding the conceptual centrality of the starting point. According to assumption 3, starting points constitute the foundation stone for the construction of the rest of the edifice of the sentence. Because the remaining edifice rests on this foundation stone, we need to make as full a commitment as we possibly can to the referential clarity of the starting point. We can express this particular corollary of assumption 3 in these terms:

Principle of Referential Commitment: Nominals that are being used as starting points should be linked to unambiguous referents in mental models. If full definite coreference cannot be achieved, then nominal starting points are assumed to be uniquely identifiable new referents.
In the case of 46, this means that, once three books are postulated in mental space, they cannot be multiplied into two sets of three books as in 45. This same principle is involved in the construction of interpretations for sentences like 47 to 50:

47. The devoted environmentalist tracked every mountain goat.
48. A devoted environmentalist tracked every mountain goat.
49. The boy ate every peanut.
50. A boy ate every peanut.
The contrast in these sentences is between starting points that are fully referential, as in 47 and 49, and those that are indefinite, as in 48 and 50. When the starting point is fully referential, it cannot later be divided by backwards multiplication of perspectives. As a result, 47 and 49 are not ambiguous. However, in 48 we can imagine each mountain goat being tracked by a different environmentalist. To do this, we return to the perspective of the starting point and multiply that perspective. This multiplication of perspectives is possible for 48, because indefinite perspectives are not as fully committed referentially as definite perspectives. In 50, we can imagine a similar ambiguity, although the idea of many boys each eating only one peanut is perhaps a bit silly. Perspective tracking theory also explains why 51 and 52 are acceptable, whereas 53 is questionable. In 51 the perspective of every farmer is distributed so that each of the farmers ends up owning a well-fed donkey. In this perspective, there are many donkeys being fed. This means that we can continue in 52 by asking whether or not all of these donkeys will grow. Sentence 53, on the other hand, forces us to break this distributive scoping and to think suddenly in terms of a single donkey, which violates the mental model set up in the main clause.

51. Every farmer who owns a donkey feeds it.
52. Every farmer who owns a donkey feeds it, but will they grow?
53. Every farmer who owns a donkey feeds it, but will it grow?
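The contrast between a unified and a divided multiple perspective can be made concrete by enumerating the two mental models for sentence 42 as sets of boy-tree pairs. The following is an illustrative sketch only, not part of the original analysis; the particular boy and tree labels are invented for the example:

```python
# Toy sketch of the two mental models for "Every boy climbed a tree" (42).
# The unified reading keeps a single tree across all boy perspectives;
# the divided reading multiplies trees along with the boy perspectives.

boys = ["boy1", "boy2", "boy3"]  # invented stand-ins for "every boy"

# Unified multiple perspective: many boys, one tree.
unified = [(boy, "tree1") for boy in boys]

# Divided multiple perspective: many boys, many trees.
divided = [(boy, f"tree{i + 1}") for i, boy in enumerate(boys)]

print(len({tree for _, tree in unified}))  # 1 tree in the unified model
print(len({tree for _, tree in divided}))  # 3 trees in the divided model
```

On this sketch, the Principle of Referential Commitment corresponds to fixing the set of trees once it serves as a starting point, as in 46, so that the divided model can no longer be constructed backwards.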
Relativization

Restrictive relative clauses provide further evidence of the impact of perspective shifting on sentence processing difficulty. Processing these structures can require us to compute multiple shifts of perspective. Consider these four types of restrictive relative clauses:

54. SS: The dog that chased the cat kicked the horse. (0 switches)
55. OS: The dog chased the cat that kicked the horse. (1- switch)
56. OO: The dog chased the cat the horse kicked. (1+ switch)
57. SO: The dog the cat chased kicked the horse. (2 switches)
In the SS type, the perspective of the main clause is also the perspective of the relative clause. This means that there are no true perspective switches in the SS relative type. In the OS type, perspective flows from the main clause subject (dog) to the main clause object (cat) in accord with the general principle of partial shift of perspective to the object. At the word that, perspective then flows further to the cat as the subject of the relative clause. This perspective shift is made less abrupt by the fact that cat had already received secondary focus before the shift was made. In the OO type, perspective also switches once. However, in this case, it switches more abruptly to the subject of the relative clause. In the SO relative clause type, there is a double perspective shift. Perspective begins with the main clause subject (dog). When the next noun (cat) is encountered, perspective shifts once. However, at the second verb (kicked), perspective has to shift back to the initial perspective (dog) to complete the construction of the interpretation. Sentences with multiple center embeddings have even more switches. Consider an example like 58, which has four difficult perspective switches (dog -> cat -> boy -> cat -> dog).

58. The dog the cat the boy liked chased snarled.
59. My mother's brother's wife's sister's doctor's friend had a heart attack.
Sentences that have as much perspective shifting as 58, without additional lexical or pragmatic support, are incomprehensible, at least at first hearing. But note that the mere stacking of nouns by itself is not enough to trigger perspective shift overload. Consider example 59. In that example, we do not really succeed in taking each perspective and switching to the next. Instead, we just allow ourselves to skip over each perspective and land on the last one mentioned. In the end, we just know that someone's friend had a heart attack and fail to track the relation of that friend to the mother's brother. In the terms of a processing model, we can say that we continue to push words onto a stack and end up losing track, in terms of our mental model, of the items that were pushed down to the bottom of the stack. In the case of 59, we still end up with an interpretable sentence. In the case of 58, the result makes little sense. Researchers have often tried to account for these processing difficulties in structural terms. However, fMRI work (Booth et al., 2001;
Just, Carpenter, Keller, Eddy, & Thulborn, 1996) has contrasted the processing of object relative sentences like 57 with the processing of subject relatives like 54. These studies have shown that 57 produces greater activation in a wide variety of left-hemisphere areas. Models such as that proposed by Grodzinsky and Amunts (2006) have attempted to link structural complexity to processing in Broca's area. But the fMRI results show that complexity leads to activation across a far wider area. This wider profile of activation is consistent with the idea that the complexity involved is not structural, but rather involves the fuller construction of switched perspectives in a full mental model. Perhaps no sentence has figured more heavily in discussions of sentence processing than example 60.

60. The horse raced past the barn fell.
The competition model account of processing in reduced relative sentences of this type emphasizes the dual morphological function of the verbal suffix -ed. This suffix can mark either the past tense or the past participle. When -ed marks the past tense, the verb raced is a simple intransitive with horse as its subject or perspective. However, when -ed marks the participle, then the verb raced allows for a nonexpressed subject and takes the horse as the object, much as in the shift of sentence 31 above with the visiting relatives. Because the resting activation of the past tense interpretation of the suffix is higher than that of the participle, listeners may not pick up the participle interpretation until they realize that the sentence will not parse with the past tense interpretation. In this case, we sense a garden path because we only activate the weak perspective configuration when the strong configuration fails. However, similar configurations will behave very differently. For example, in 61, we sense no garden path because there is no noun following kept that would allow the transitive reading and we therefore have to rely on the reduced relative reading. In 62, there is no ambiguity, because the irregular participle cannot be confused with the past tense.

61. The bird kept in the cage sang.
62. The car driven past the barn honked.
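The perspective-switch counts that this section assigns to sentence types 54 to 58 can be tallied mechanically. The sketch below is purely illustrative and not from the original chapter: the perspective sequences are read off the prose descriptions above, and the counting function is our own device for making them explicit:

```python
# Toy tally of perspective switches in relative clause processing.
# Each sentence type is represented by the sequence of perspectives the
# listener adopts, as described in the text; a switch is any transition
# between two distinct adjacent perspectives.

def count_switches(perspectives):
    """Count transitions between distinct adjacent perspectives."""
    return sum(1 for a, b in zip(perspectives, perspectives[1:]) if a != b)

sequences = {
    "SS (54)": ["dog", "dog"],                       # main and relative clause share a perspective
    "OS (55)": ["dog", "cat"],                       # one, less abrupt, switch
    "OO (56)": ["dog", "cat"],                       # one, more abrupt, switch
    "SO (57)": ["dog", "cat", "dog"],                # switch out and back: two switches
    "58":      ["dog", "cat", "boy", "cat", "dog"],  # center embedding: four switches
}

for label, seq in sequences.items():
    print(label, count_switches(seq))
```

Note that a bare count misses the chapter's 1- versus 1+ distinction between OS and OO, which turns on how abruptly the single switch is made rather than on how many switches occur.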
Clitic Assimilation

As a further, detailed example of the impact of perspective taking on grammar, let us consider the process of clitic assimilation. In English, the infinitive to often fuses with a preceding modal verb to produce contractions such as wanna from want to in cases such as 64. However, this assimilation is blocked in environments like the one in 65, making 66 unacceptable.

63. Who do you want to see?
64. Who do you wanna see?
65. Who do you want to go?
66. *Who do you wanna go?
The perspective hypothesis views the reduced infinitive in 64 as a cue that marks perspective maintenance. In 65, this reduction is impossible, because there is a forced processing shift from who to you and then back to who(m). Infinitive reduction also marks perspective maintenance in examples 67 to 69.

67. I get ta go. (Privilege)
68. I got ta go. (Privilege, past tense)
69. I gotta go. (Obligation)
In 67 and 68, the privilege of going is due presumably to the intercession of an outside party. The perspective of this outside party interrupts perspective maintenance. In 69, on the other hand, the obligation is internal to the speaker and perspective is maintained across the reduced infinitive.
Implicit Causality

Much of our analysis here has focused on perspective marking by highly grammaticalized forms like pronouns, participles, gerundives, relativizers, and infinitives. However, perspective marking extends far beyond grammatical forms, appearing widely inside adjectives, verbs, nouns, and prepositions. Individual lexical items can characterize complex social roles and mental acts. Items like libel, Internet, or solidarity encode social scenarios organized about the perspective of social actors. Let us take the noun libel as an example. When
we speak of some communication as being "libelous," we are taking the perspective of an "accused" person who declares to some general audience that the (purported) libeler has asserted that the accused has engaged in some illegal or immoral activity. Moreover, the accused wishes to convince the general audience that the libeler's claims are false and designed to make the audience think poorly of the accused in ways that influence his or her ability to function in public life. This single word conveys a complex set of interacting and shifting social perspectives. To evaluate whether or not a statement is libelous, we have to assume the perspective of the accused, the purported libeler, and the audience to evaluate the various claims and possible counterclaims. All of this requires continual integration and shifting of social roles and mental acts. Verbs like promise, forgive, admire, and persuade also encode multiple relations of expectation, benefit, evaluation, and prediction between social actors. To evaluate the uses of these verbs requires flexible perspective taking and coordination. Within this larger group of mental state verbs, one dimension of contrast is known as "implicit causality." Sentence 70 illustrates the implicit causality configuration of the experiencer-stimulus verb admire. The causal configuration is revealed in the second clause where the subject (she) is the cause of the admiration. In sentence 71, with the stimulus-experiencer verb apologize, causality remains with the subject of the first clause (John).

70. John admired Maryi, because shei was calm under stress.
71. Johni apologized to Mary, because hei had cracked under stress.
According to the perspective hypothesis, shifts in causality should lead to shifts in perspective. To track these shifts experimentally, McDonald and MacWhinney (1995) asked subjects to listen to sentences like 70 and 71, while making a cross-modal probe recognition judgment. Probe targets included old nouns (John, Mary), new nouns (Frank, Jill), old verbs (admire, apologize), and new verbs (criticize, resemble). The probes were placed at various points before and after the pronoun (he and she). The task was to judge whether the probe was old or new. McDonald and MacWhinney found that stimulus-experiencer verbs like apologize in 71 tend to preserve the reaction time advantage for the first noun (John) as a probe throughout the sentence. In terms of the perspective hypothesis, this means that
perspective is not shifted away from the starting point in these sentences. However, experiencer-stimulus verbs like admired in 70 tend to force a shift in perspective away from the starting point (John) to the stimulus (Mary) right at the pronoun. This leads to a period of time around the pronoun during which Mary has relatively faster probe recognition times. However, by the end of the sentence in 70, the advantage of the first noun reappears. The fact that these shifts are being processed immediately on-line is evidence in support of the perspective hypothesis. In addition to encoding implicit causality, verbs can also encode information regarding implicit source of knowledge.

72. Minnie told Dorothy that she knew Superman.
73. Minnie asked Dorothy if she knew Superman.
74. Minnie reminded Dorothy that she knew Superman.
75. Minnie told Dorothy that she made Superman cry.
In 72, we assume that Minnie has access to knowledge about herself which she provides to Dorothy. In 73, on the other hand, we assume that Dorothy must be the source of the information, since Minnie would certainly have access to her own knowledge. Although both 74 and 75 could be read ambiguously, the most probable reading in each case is one that maintains the perspective of the starting point. Adults are able to maintain the viewpoint of the initial subject even in the complement clause. At the same time, they use facts about the verb ask to shift perspective in 73. However, between the ages of 5 and 8, children (Franks & Connell, 1996) are more likely to shift to the perspective of Dorothy in all of these sentences. This tendency to shift perspective arises from a general preference for local attachment evident at this age. It is possible that children are not able to coordinate the dual perspectives of the main and subordinate clause efficiently for these structures during this age range (Huttenlocher & Presson, 1973).
Perspectival Overlays

In the preceding sections, we have focused on the ways in which grammatical devices provide cues that help listeners track shifts of perspective. These shifts have involved the actions and motions of agents, as they operate on other objects. This system of causal action
marking is at the core of perspective tracking and it is the system that is marked most overtly through grammatical constructions and forms. However, there are at least five other systems of perspective shifting that function as linguistic overlays on this basic system of perspective tracking. These include the systems for marking perspective in space, time, empathy, evidentiality, and metaphor. Although these systems are of great importance for mental model construction, they have relatively little impact on grammatical structure, relying instead on marking through individual lexical items.
Space

Spatial localization is fundamental to perspective tracking. The linguistic marking of space relies on prepositions, motion verbs, and names for landmarks. Sometimes the marking of location can involve perspectival ambiguity. Consider this classic illustration from Cantrall (1974):

76. The adults in the picture are facing away from us, with the children behind them.
In this example, we see a competition between alternative reference points. If we take the perspective of the adults as our reference point, then the children are located between the adults and the viewer of the picture. If we take the perspective of the viewers of the picture as the reference point, the children are located on the other side of the adults, farther away from us. Ambiguities of this type are reminiscent of the shifts in reference point in sentences like 35, where we can either imagine ourselves flying to New York or else see the Grand Canyon flying to New York. However, on a structural level, 76 is not ambiguous, whereas 35 is. In other words, the ambiguities we find in spatial perspective taking are not reflected in the grammar. However, they are clearly reflected in our mental models. This is why we can consider perspective taking in space as an overlay on grammar. Perspectival competitions can arise even from what seems to be a single reference point. For example, if we are lying down on our backs in a hospital bed, we might refer to the area beyond our feet as in front of me, even though the area beyond the feet is usually
referred to as under me. To do this, we may even imagine raising our head a bit to correct the reference field, so that at least our head is still upright. Because spatial reference is so prone to ambiguity of this type, we have developed many linguistic devices for reducing such ambiguities. One way of reducing ambiguity is to use a third position as the reference point. For example, if we describe a position as being 50 yards behind the school, it is not clear whether we are taking our own position as the reference point for behind or whether we are using the facing of the school as the reference point. To avoid this problem we can describe the position as 50 yards toward the mountain from the school. In this case, we are taking the perspective of the mountain, rather than that of the speaker or the school. We then construct a temporary Cartesian grid based on the mountain and perform allocentric projection to the school. Then we compute a distance of 50 yards from the school in the direction of the mountain. Languages such as Guugu Yimithirr (Haviland, 1993) and Mayan take this solution yet one step farther by setting up permanent maplike coordinates against which all locations can be pinpointed.

Time

Perspective taking in time is closely analogous to perspective taking in space. Like space, time has an extent through which we track events in terms of their relation to reference moments. Just as spatial objects have positions and extents, events have locations in time and durations. Just as we tend to view events as occurring in front of us, rather than behind us, we also tend to view time as moving forwards from past to future. As a result, it is easier to process sentences like 77 with an iconic temporal order than ones like 78 with a reversed order. However, sentences like 79, which require no foreshadowing of an upcoming event, are the most natural of all.

77. After we ate our dinner, we went to the movie.
78. Before we went to the movie, we ate our dinner.
79. We ate our dinner and then we went to the movie.
How Mental Models Encode Embodied Linguistic Perspectives

Temporal reference in narrative assumes a strict iconic relation between the flow of the discourse and the flow of time. Processing of sequences that violate temporal iconicity by placing the consequent before the antecedent is relatively more difficult (Zwaan, 1996). However, in practice, it is difficult to describe events in a fully linear fashion and we need to mark flashbacks and other diversions through tense, aspect, and temporal adverbials.

Empathy

The third system of perspectival overlays involves marking for empathy. Here, Tenny and Speas (2002) have conducted a useful survey of devices used by various languages. Among the most marked of these devices are evaluative adjectives such as beloved or damned. Consider these examples:

80. John was looking for Sarah's beloved cat.
81. John was looking for Sarah's damned cat.
In 80, the cat is beloved from Sarah’s point of view. In 81, however, the cat is damned from either John’s point of view or the speaker’s point of view. Language is laden with evaluative perspectives of this type. However, much of the evaluation we convey in everyday interaction is encoded equally well through intonation and gesture—areas that lie outside the scope of our current exploration.
Evidentiality

A fourth area of perspectival overlay involves the marking of evidence sources and types. For example, we may know some things because we saw them directly with our own eyes, whereas we know other things because we have heard them from trusted sources. Often, we simply make assertions without providing information regarding evidence. In other cases, we may ask questions, indicate doubt, express belief, and so on. An example of the role of evidential perspective is the contrast between statements and questions. Consider the contrast between 82 and 83:

82. The bicyclist appears to have escaped injury.
83. Did the bicyclist appear to have escaped injury?
84. The reporter said that the bicyclist appeared to have escaped injury.
85. The reporter asked if the bicyclist appeared to have escaped injury.
In 82, the statement is evaluated on the basis of evidence available to the speaker. In 83, on the other hand, it is evaluated on the basis of evidence available to the listener. Examples 84 and 85 display a similar asymmetry.

In English, the marking of finer dimensions of evidentiality is conveyed by particles and adverbs such as well, sure, still, and just. In other languages, these same forms can appear as markings on the verb. Some languages pay close attention to fine distinctions in the source and nature of this evidence. Japanese displays a particularly interesting restriction on evidence reflected in 86 and 87:

86. You are sick.
87. You seem sick.
In Japanese, one cannot say 86, because it is presumptuous to imagine that one has access to the inner states of another person. In fact, it might even be a bit inappropriate to produce 87. This constraint, known as "speaker's territory of knowledge" (Kamio, 1995), involves aspects of both evidentiality and empathy.

The tracking of evidential perspectives occurs primarily on the level of the fuller discourse or narrative. On this level, we use particular constructions to help our listeners locate objects in their own mental models. In effect, we construct mental models of the mental models of our listeners and use these to determine the marking of evidentiality. As Givon (2005) puts it, speakers select grammatical constructions on the basis of their "mental models of the interlocutor's current deontic and epistemic states."

Metaphor

A fifth area of perspectival overlay on language involves the use of metaphor. The production and comprehension of metaphors, similes, and analogies have been a lively topic now for nearly three decades in both the linguistic and psychological literature. Examples 88 and 89 illustrate some of the typical patterns studied in this vast descriptive and experimental literature.

88. The road runs down to the river.
89. Headline: Congress stumbles in debate on tax reform.
90. Stocks took a plunge at the end of the day.
91. His marriage was like a glacier.
Everyday conversational speech makes very little use of metaphor, but other genres, ranging from financial reporting to pop psychology, rely heavily on extensions of the type illustrated in 89 and 90 to liven up otherwise boring prose. Lakoff and Johnson (1980) have shown how a small set of core metaphors, based on embodied cognition, dominates our construction of mental models. Fauconnier and Turner (1996) have further described the blending of multiple perspectives that occurs in mental models. Perhaps the most remarkable of these blends arise in drama and poetry, when we track the perspectives of actors in plays within plays within stories, as in Macbeth and The Story of the Tailor in The Arabian Nights.

The processing of perspective in metaphors and blends involves conceptual overlays on language, much like the processing of perspective in space, time, empathy, or evidentiality. Like these other systems, the processing of metaphor makes no direct contact with grammar, functioning instead through semantic extension of the meanings of individual words such as stumble or glacier. However, it would be a mistake to ignore the ramifications of these types of processing for our theories of the neuronal encoding of perspectives in mental models. The perspectival complexity of Macbeth or The Arabian Nights marks only the beginning of the complex mental models that must be encoded in verbal form. For truly dazzling levels of complexity, we can turn to Grigori Perelman's solution of the Poincaré conjecture or the chains of reasoning in cases argued at the Supreme Court of the United States.
Production and Integration

Having examined the marking of perspective through grammar, we are now ready to consider how language integrates across these various levels or layers of perspective taking. The best way to explore this issue is to consider snippets of actual narratives and conversations. Consider this passage from an Associated Press release in 2001.

92. A cyclone hammered the Bangladesh coast Monday with the force of "hundreds of demons," leveling entire villages of mud and thatch huts, flooding crops, and killing at least six people.
This passage begins from the perspective of the cyclone. It then uses a metaphorical image to allow the storm to act as an agent that wields a hammer. The past tense suffix on the verb hammered places this action into the past. The object of the hammering is the Bangladeshi coast. The coast is brought on stage, but there is no shift of perspective away from the cyclone and its hammering. Immediately after coast we have the introduction through Monday of a temporal overlay. Then, the metaphor of hammering is linked to the force of hundreds of demons. The perspective of the cyclone continues with leveling, flooding, and killing. In each case, our attention moves briefly to the objects without really shifting away from the cyclone.

In this sentence, there is a rich combination of images from many modalities. The cyclone is a visual image; the hammer is a visual image linked to motor movements of the arm and perceptions of noise and percussion. The image of the Bangladesh coast brings to mind the position of Bangladesh on a map of the Bay of Bengal, along with a trace of the delta formed by the Brahmaputra. The image of Monday forces us to refer to our recent calendrical memory to locate this event in a spatial analog to time. The images of demons bring to mind scenes from Indian art, with black demon faces flying through the air, and stories from the Vedas. We do not count out the demons in detail, but roughly imagine an array of many demons, perhaps extending over a wide physical space along the coast to accommodate their great number. When we link entire to villages, we have to engage in a mental activity of leveling that is completive and leaves no huts standing. Similarly, when we flood the crops, we have to imagine a whole scene with plants under water, and when we envision the killing, we actually envision six dead bodies, although we are not exactly sure how they died.
Together, the construction of the image for just this sentence relies on diverse cognitive systems for motor action, space, time, enumeration, quantifier scope, visual imagery, semantic memory, metaphor, geography, geometry, and biology.
Some might like to think of these systems as cognitive modules (Pinker, 1997). It is certainly true that these diverse cognitions are supported by widely separated brain regions with highly differentiated functions, but it is misleading to think of language as popping together beads produced by encapsulated modules, as suggested by Carruthers (2002). Instead, language allows diverse areas to work in concert and to achieve communication by writing to the "blackboard" of the sentence currently under construction. In this sense, language does indeed promote integration across these nonmodular cognitive systems in a way that can promote cognitive growth (Spelke, 2002). In fact, it makes sense to think of language as providing a springboard for recent human cultural evolution, by allowing us to construct integrated mental models (MacWhinney, 2005; Mithen, 1996) that led to the further elaboration of consciousness (Dennett, 1991), social structure, and religious imagination (Campbell, 1949).

These reflections regarding information integration through perspective taking have been grounded so far on the single example sentence 92. This same passage continues with the material in 93.

93. Three men and two children were crushed under collapsed buildings or hit by flying pieces of tin roofs in the southern port of Chittagong. One man died in Teknaf, about 110 miles down the coast, when he was blown off his roof while trying to secure it. The storm roared in from the Bay of Bengal with wind gusts of 125 mph, forcing a half-million people to flee their huts and huddle in concrete shelters. Many power and telephone lines were down, so a full account of casualties and damage was not available.
In this continuation, we shift from the initial perspective of the cyclone in 92 to the perspective of people being crushed. This involves a passive perspective, as marked by the -ed suffix. The actual causal agents follow later. The first agent is totally missing, since the fact that it was the cyclone that caused the buildings to collapse is not expressed. For the passive verb hit, the agent is flying pieces of tin roofs. Here, we begin with the notion of flying even before we know what might be flying. We soon realize that what is flying are pieces of tin roofs, and we try to imagine how the cyclone pulled these pieces off their roofs. As we read through a passage quickly, we may decide not to perform the extra mental work needed to fill out this further detail of the mental model. Even before we shift to the perspective of the flying pieces, we must see the victims crushed under buildings or being hit. We do not know which victims were crushed and which were hit by flying pieces of roof, so we just imagine some of each in both positions without trying to do any actual count.

Next we shift from Chittagong to Teknaf. The shift in space is accompanied by the introduction of the new perspective of one man. For this man, we first imagine him dead, then we see how he is passively blown off his roof by the cyclone as the unmentioned agent, and then we must put him back on his roof and imagine him trying to secure the roof. This order of events is the opposite of the actual order, and building up a mental model in opposite order can be difficult. An alternative version of this sentence would read as in 94.

94. In Teknaf, 110 miles down the coast, a man was trying to secure his roof when the cyclone blew him off to his death.
Next we shift perspective back to the storm as it now acts not just on six people, but on half a million. Finally, we move to the viewpoint of the reporter, who explains that it was not possible to give a full report of the casualties because telephone lines were down.

This analysis of a news story selected at random illustrates the extent to which language weaves together diverse cognitions into the common grid of the sentence. Sentences achieve this integration by mixing together adjectives, lexical metaphors, quantifiers, descriptive verbs, numerals, temporals, prepositional phrases, passives, omitted subjects, participles, adverbs, conjunctions, and a wide variety of other grammatical devices. Some of these devices mark referents, some mark perspective shifts, and others add spatial, temporal, and evaluative overlays to the basic causal grid.

If we move beyond examples like this, to examination of spontaneous conversation, the landscape changes markedly. Instead of weaving together places, times, and actors, conversations weave together diverse viewpoints, understandings, and goals. Conversations are heavily dedicated to the maintenance of interpersonal relations and the establishment of mutual knowledge. Against this background, perspective tracking still plays an important role, but the perspective being tracked is one that is under continual negotiation between the conversational participants. A fuller examination of these issues is currently underway in the context of analyses of conversational interactions in classrooms (Greeno & MacWhinney, 2006).
Cultural Transmission

Vygotsky (1934) believed that children's conversational interactions played a fundamental role in their cognitive development. He viewed these interactions as setting a model for mental structures that children would then internalize as "inner speech." He characterized inner speech in terms of processes such as topic-comment structure and ellipsis, but provided no additional linguistic or cognitive detail. The perspective hypothesis can be viewed as an elaboration of the initial Vygotskyan program. The idea here is that, by tracking perspective in conversations and narratives, children construct mental models that encode perspectival patterns in long-term memory.

Because these models are extracted from adult input, the perspective tracking and causal reasoning they contain will reflect the standards of the adult community. For example, if fairy tales begin with phrases such as once upon a time, far, far away, then children will also learn to construct mental models for fairy tales in a cognitive space that is far away or imaginary in space and time. If the efforts of the valiant prince lead eventually to happiness and marriage, then children will build mental models in which these expectations are linked. If the efforts of a determined little locomotive allow it to pull a train up a hill, children will also imagine that they can achieve goals through determination. In effect, children will use perspectivally constructed models extracted from conversation and narrative as a method for learning the systems and values of their culture. In this process, the links developed through perspective taking on all the levels we have discussed are crucial. Without language and the perspective tracking it allows, this level of cultural transmission would not be possible.
Developmentalists have extended the Vygotskyan vision by linking the construction of mental models to play (Eckler & Weininger, 1989), games (Ratner & Bruner, 1978), narration (Bruner, 1992; Nelson, 1998), apprenticeship (Lave, 1991), learning contexts (Rogoff, 2003), and conversational sequencing (Bateson, 1975). What is common to all of these accounts is the idea that children are exposed to interactions in which they track the logical flow of ideas perspectivally. From this process, they extract internalized mental models that are specific to their cultures and social groups (Spradley, 1972), which they can then transmit to others (Blackmore, 2000).
Conclusion

In this paper we have examined the ways in which the perspective hypothesis can offer new explanations for a variety of patterns in grammar and sentence processing. In this new formulation, the links in mental models are viewed as inherently perspectival and grounded on simulated, embodied cognition. This cognitive system relies on a wide range of neuronal structures for body image matching, spatial projection, empathy, and perspective tracking. Language uses this underlying system to achieve still further cognitive integration. When speakers produce sentences, they use grammatical devices to integrate diverse perspectives and shifts. When listeners process these sentences, they use grammatical markers, constructions, and lexical forms to decode these various shifted and overlaid perspectives. Because perspective taking and shifting are fundamental to communication, language provides a wide array of grammatical devices for specifically marking perspective and perspective shift. Language allows us to integrate information from the domains of direct experience, space, time, plans, causality, evidentiality, evaluation, empathy, and mental acts. Across each of these dimensions, we assume and shift between perspectives in order to construct a fully human, unified conscious awareness.

Acknowledgments

Thanks to William O'Grady, James Greeno, Roberta Klatzky, and Marnie Arkenberg for their comments on this paper. This work was supported by NSF Award SBE-0354420.
References

Adolphs, R. (2003). Cognitive neuroscience of human social behavior. Nature Reviews Neuroscience, 4, 165–178.
Adolphs, R., & Spezio, M. (2006). Role of the amygdala in processing visual social stimuli. Progress in Brain Research, 156, 363–378.
Avenanti, A., Bueti, D., Galati, G., & Aglioti, S. (2005). Transcranial magnetic stimulation highlights the sensorimotor side of empathy for pain. Nature Neuroscience, 8, 955–960.
Bailey, D., Chang, N., Feldman, J., & Narayanan, S. (1998). Extending embodied lexical development. Proceedings of the 20th Annual Meeting of the Cognitive Science Society, 20, 64–69.
Ballard, D. H., Hayhoe, M. M., Pook, P. K., & Rao, R. P. (1997). Deictic codes for the embodiment of cognition. Behavioral and Brain Sciences, 20, 723–767.
Bateson, M. (1975). Mother–infant exchanges: The epigenesis of conversational interaction. In D. Aaronson & R. Rieber (Eds.), Developmental psycholinguistics and communication disorders (pp. 112–140). New York: New York Academy of Sciences.
Blackmore, S. (2000). The power of memes. Scientific American, October, 64–73.
Booth, J. R., MacWhinney, B., Thulborn, K. R., Sacco, K., Voyvodic, J. T., & Feldman, H. M. (2001). Developmental and lesion effects during brain activation for sentence comprehension and mental rotation. Developmental Neuropsychology, 18, 139–169.
Bruner, J. (1992). Acts of meaning. Cambridge, MA: Harvard University Press.
Budiu, R., & Anderson, J. (2004). Interpretation-based processing: A unified theory of semantic sentence comprehension. Cognitive Science, 28, 1–44.
Campbell, J. (1949). The hero with a thousand faces. Princeton, NJ: Princeton University Press.
Cantrall, W. (1974). Viewpoint, reflexives and the nature of noun phrases. The Hague: Mouton.
Carruthers, P. (2002). The cognitive functions of language. Behavioral and Brain Sciences, 33, 657–674.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1975). Reflections on language. New York: Random House.
Chomsky, N. (1982). Some concepts and consequences of the theory of government and binding. Cambridge, MA: MIT Press.
Cohen, M. S., Kosslyn, S. M., Breiter, H. C., DiGirolamo, G. J., Thompson, W. L., Anderson, A. K., et al. (1996). Changes in cortical activity during mental rotation: A mapping study using functional MRI. Brain, 119, 89–100.
Dennett, D. (1991). Consciousness explained. New York: Penguin Press.
Donald, M. (1991). Origins of the modern mind. Cambridge, MA: Harvard University Press.
Eckler, J., & Weininger, O. (1989). Structural parallels between pretend play and narratives. Developmental Psychology, 25, 736–743.
Fauconnier, G., & Turner, M. (1996). Blending as a central process of grammar. In A. Goldberg (Ed.), Conceptual structure, discourse, and language (pp. 113–130). Stanford, CA: CSLI.
Franks, S. L., & Connell, P. J. (1996). Knowledge of binding in normal and SLI children. Journal of Child Language, 23, 431–464.
Frazier, L. (1987). Sentence processing: A tutorial review. In M. Coltheart (Ed.), Attention and performance (Vol. 12, pp. 601–681). London: Erlbaum.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, NJ: Erlbaum.
Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68, 1–76.
Givon, T. (2002). The visual information-processing system as an evolutionary precursor of human language. In T. Givon & B. Malle (Eds.), The evolution of language out of pre-language (pp. 3–51). Amsterdam: Benjamins.
Givon, T. (2005). Context as other minds: The pragmatics of sociality, cognition, and communication. Philadelphia: Benjamins.
Greeno, J., & MacWhinney, B. (2006). Perspective shifting in classroom interactions. Paper presented at the AERA Meeting.
Grodzinsky, Y., & Amunts, K. (2006). Broca's region. Oxford: Oxford University Press.
Grodzinsky, Y., & Reinhart, T. (1993). The innateness of binding and coreference. Linguistic Inquiry, 24, 187–222.
Harris, C. L., & Bates, E. A. (2002). Clausal backgrounding and pronominal coreference: A functionalist alternative to c-command. Language and Cognitive Processes, 17, 237–269.
Hausser, R. (1999). Foundations of computational linguistics: Man-machine communication in natural language. Berlin: Springer.
Haviland, J. B. (1993). Anchoring, iconicity, and orientation in Guugu Yimithirr pointing gestures. Journal of Linguistic Anthropology, 3, 3–45.
Huttenlocher, J., & Presson, C. (1973). Mental rotation and the perspective problem. Cognitive Psychology, 4, 277–299.
Jeannerod, M. (1997). The cognitive neuroscience of action. Cambridge, MA: Blackwell.
Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Cambridge, MA: Harvard University Press.
Just, M. A., Carpenter, P. A., Keller, T. A., Eddy, W. F., & Thulborn, K. R. (1996). Brain activation modulated by sentence comprehension. Science, 274, 114–116.
Kakei, S., Hoffman, D. S., & Strick, P. L. (1999). Muscle and movement representations in the primary motor cortex. Science, 285, 2136–2139.
Kamio, A. (1995). Territory of information in English and Japanese and psychological utterances. Journal of Pragmatics, 24, 235–264.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation. Cognitive Science, 11, 201–258.
Kintsch, W., & Van Dijk, T. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–394.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
Lave, J. (1991). Situated learning: Legitimate peripheral participation. New York: Cambridge University Press.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). Lexical nature of syntactic ambiguity resolution. Psychological Review, 101(4), 676–703.
Macrae, C., Heatherton, T., & Kelley, W. (2004). A self less ordinary: The medial prefrontal cortex and you. In M. Gazzaniga (Ed.), The cognitive neurosciences (Vol. 3, pp. 1067–1076). Cambridge: MIT Press.
MacWhinney, B. (1977). Starting points. Language, 53, 152–168.
MacWhinney, B. (1987a). The competition model. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 249–308). Hillsdale, NJ: Erlbaum.
MacWhinney, B. (1987b). Toward a psycholinguistically plausible parser. In S. Thomason (Ed.), Proceedings of the Eastern States Conference on Linguistics. Columbus, OH: Ohio State University.
MacWhinney, B. (2005). Language evolution and human development. In B. Ellis & D. Bjorklund (Eds.), Origins of the social mind (pp. 383–410). New York: Guilford.
Marslen-Wilson, W. D., & Tyler, L. K. T. (1980). The temporal structure of spoken language understanding. Cognition, 8, 1–71.
Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, J. V. (1996). Neural correlates of category-specific knowledge. Nature, 379, 649–652.
McClelland, J. L. (1979). On the time-relations of mental processes: An examination of systems of processes in cascade. Psychological Review, 86, 287–330.
McDonald, J. L., & MacWhinney, B. J. (1995). The time course of anaphor resolution: Effects of implicit verb causality and gender. Journal of Memory and Language, 34, 543–566.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
Meltzoff, A. N. (1995). Understanding the intentions of others: Re-enactment of intended acts by 18-month-old children. Developmental Psychology, 31, 838–850.
Meltzoff, A. N., & Decety, J. (2003). What imitation tells us about social cognition: A rapprochement between developmental psychology and cognitive neuroscience. Philosophical Transactions of the Royal Society of London B, 358, 491–500.
Middleton, F. A., & Strick, P. L. (1998). Cerebellar output: Motor and cognitive channels. Trends in Cognitive Sciences, 2, 348–354.
Miller, G., & Johnson-Laird, P. (1976). Language and perception. Cambridge, MA: Harvard University Press.
Mitchell, D. C. (1994). Sentence parsing. In M. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 375–405). San Diego, CA: Academic Press.
Mithen, S. (1996). The prehistory of the mind: The cognitive origins of art, religion, and science. London: Thames & Hudson.
Nelson, K. (1998). Language in cognitive development: The emergence of the mediated mind. New York: Cambridge University Press.
O'Grady, W. (2005). Syntactic carpentry. Mahwah, NJ: Erlbaum.
O'Grady, W. (2006). The syntax of quantification in SLA: An emergentist approach. In M. O'Brien, C. Shea, & J. Archibald (Eds.), Proceedings of the 8th Generative Approaches to Second Language Acquisition Conference (GASLA 2006) (pp. 98–113). Somerville, MA: Cascadilla Press.
Parsons, L. M., Fox, P. T., Downs, J. H., Glass, T., Hirsch, T. B., Martin, C. C., et al. (1995). Use of implicit motor imagery for visual shape discrimination as revealed by PET. Nature, 375, 54–58.
Pecher, D., & Zwaan, R. (Eds.). (2005). Grounding cognition. Cambridge: Cambridge University Press.
Pelphrey, K. A., Mitchell, T. V., McKeown, M. J., Goldstein, J., Allison, T., & McCarthy, G. (2003). Brain activity evoked by the perception of human walking: Controlling for meaningful coherent motion. Journal of Neuroscience, 23, 6819–6825.
Pelphrey, K. A., Morris, J. P., & McCarthy, G. (2005). Neural basis of eye gaze processing deficits in autism. Brain, 128, 1038–1048.
Pelphrey, K. A., Viola, R. J., & McCarthy, G. (2004). When strangers pass. Psychological Science, 15, 598–603.
Pinker, S. (1997). How the mind works. New York: Norton.
Ramachandran, V. S. (2000). Phantom limbs and neural plasticity. Neurological Review, 57, 317–320.
Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia: A window into perception, thought and language. Journal of Consciousness Studies, 8, 3–34.
Ratner, N., & Bruner, J. (1978). Games, social exchange and the acquisition of language. Journal of Child Language, 5, 391–401.
Reinhart, T. (1981). Definite NP anaphora and c-command domains. Linguistic Inquiry, 12, 605–635.
Reinhart, T. (1983). Anaphora and semantic interpretation. Chicago: University of Chicago Press.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141.
Rogoff, B. (2003). The cultural nature of human development. Oxford: Oxford University Press.
Spelke, E. (2002). Developing knowledge of space: Core systems and new combinations. In S. Kosslyn & A. Galaburda (Eds.), Languages of the brain (pp. 239–258). Cambridge, MA: Harvard University Press.
Spradley, J. (Ed.). (1972). Culture and cognition: Rules, maps, and plans. New York: Chandler.
Stanfield, R. A., & Zwaan, R. A. (2001). The effect of implied orientation derived from verbal context on picture recognition. Psychological Science, 12, 153–156.
Stein, D., & Wright, S. (Eds.). (1995). Subjectivity and subjectivisation. Cambridge: Cambridge University Press.
Tenny, C., & Speas, P. (2002). Configurational properties of point of view roles. In A. DiSciullo (Ed.), Asymmetry in grammar. Amsterdam: Benjamins.
Vygotsky, L. (1934). Thought and language. Cambridge: MIT Press.
Zwaan, R. A. (1996). Processing narrative time shifts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1196–1207.
Author Index
A Abravanel, E., 86 Adams, F., 209 Adelman, P. K., 96 Adelson, E. H., 116 Adolph, K. E., 277, 278, 279, 281, 283, 285, 286, 288, 289, 290, 293, 294, 295, 297, 298, 299, 300, 301, 302, 306, 307, 308, 310, 311, 312 Adolphs, R., 370, 372 Aguirre, G. K., 32 Ahlström, V., 120 Albright, T. D., 208, 230 Aldridge, J. W., 226, 229 Alexander, G. E., 225, 229 Allen, G. L., 17 Allison, T., 133 Allport, D. A., 218, 231 Alyan, S., 32 Amano, S., 341 Ambady, N., 134 Andersen, R. A., 208, 219, 220, 247 Anderson, A. K., 136 Anderson, J. R., 45 Andre, J., 4, 6, 7, 8, 26 Anisfeld, M., 357 Arbib, M. A., 231 Arkenberg, M., 404 Ash, M. G., 120 Ashby, W. R., 209 Ashmead, D. H., 8, 9 Astafiev, S. V., 256, 267 Atkinson, A. P., 93, 94, 95 Avenanti, A., 371 Averbeck, B. B., 229 Avraamides, M. N., 14, 15, 153, 161, 172 B Bailey, D., 371 Baldwin, D. A., 339
Ballard, D. H., 210, 371 Bangert, M., 54 Baron-Cohen, S., 358 Barresi, J., 47, 48, 83 Barrett, T. E., 162 Barsalou, L. W., 71, 82 Basso, M. A., 221, 222 Bastian, A., 223 Bateson, M., 403 Beardsworth, T., 59, 327 Bechara, A., 226 Beer, R. D., 209 Bekkering, H., 181, 335 Berger, S. E., 304, 305, 312 Bergson, H., 209 Berkowitz, L., 96 Berns, G. S., 225 Bernstein, N., 276, 277 Bertenthal, B. I., 120, 281, 291, 294, 296, 311, 326, 328, 329, 337, 347, 348, 349, 350, 354, 355, 356 Berti, A., 185, 248, 262 Best, J. B., 207 Beurze, S. M., 256, 267 Beusmans, J. M. H., 29 Bhalla, M., 189 Bichot, N. P., 222 Bingham, G. P., 7 Binkofski, F., 253, 258 Blackmore, S., 403 Blake, R., 124 Blakemore, S. J., 46, 48, 58, 84, 102, 325 Blanchard, R. J., 80 Block, N., 207 Bly, L., 281 Bock, O., 265 Bonatti, L., 83 Bonda, E., 128 Böök, A., 2, 14 Booth, A., 351, 352, 353
Booth, J. R., 390 Boraud, T., 226 Bosbach, S., 69 Botvinick, M., 132 Boynton, G. M., 217, 218, 232 Brain, W. R., 2 Brass, M., 326, 337 Bremmer, F., 262 Bridgeman, B., 30 Bril, B., 285 Bristow, D., 267 Broadbent, D. E., 218 Brooks, R., 209 Brothers, L., 136 Brown, J. W., 225 Bruner, J. S., 335, 403 Buccino, G., 326 Budiu, R., 374 Bullock, D., 214, 229 Bülthoff, I., 128, 130 Buneo, C. A., 219, 220 Burbaud, P., 219 Burnod, Y., 214 Burt, P., 121 Butterworth, G., 340, 346 Buxbaum, L., 84 C Calton, J. L., 217, 220, 251 Calvo-Merino, B., 54, 55, 126, 325, 336 Caminiti, R., 220 Campbell, J., 401 Campos, J. J., 286, 292, 293, 307 Cantrall, W., 395 Carello, C. D., 221 Carey, D. P., 31, 188 Carlson, V. R., 4 Carruthers, P., 401 Casile, A., 56, 57, 59, 126 Castiello, U., 187, 218, 222, 223, 231, 326 Cavina-Pratesi, C., 253, 259 Chaminade, T., 327, 345 Chan, M. Y., 311, 312 Choamsky, N., 206 Chomsky, N., 375, 378, 379 Chong, R. K., 226 Chouchourelou, A., 134, 136 Churchland, P. S., 209 Cisek, P., 204, 212, 213, 215, 218, 221, 223, 225, 227, 229, 230, 231, 232, 233 Clark, A., 48, 204, 209, 227, 276 Clarke, T. J., 119, 134
Clower, W. T., 230 Coe, B., 221 Cohen, L. R., 92, 120, 132, 265 Cohen, M. S., 370 Colby, C. L., 208, 216, 217, 220, 232, 247, 248, 251, 262 Cole, J. D., 69, 70 Connolly, J. D., 252, 256, 267 Constantinidis, C., 219 Cooke, D. F., 248 Corkum, V., 339 Corlett, J. T., 7 Cowey, A., 185 Crammond, D. J., 221 Creem, S. H., 31, 188, 197 Creem-Regehr, S. H., 2, 7, 189 Cross, E. S., 57 Csibra, G., 324, 358 Cuijpers, R. H., 5 Culham, J. C., 209, 219, 247, 251, 252, 253, 255, 258 Cutting, J. E., 7, 59, 120, 180, 184, 348 D Da Silva, J. A., 4 Darwin, C., 81, 94 Dassonville, P., 31 Dawson, G., 98 de Jong, B. M., 257, 267, 268 de Vries, J. I. P., 277 Deak, G. O., 339 Decety, J., 46, 52, 125, 323, 326, 356 Dechent, P., 256, 267 Dennett, D., 401 DeRenzi, E., 85 Desjardins, J. K., 80 Desmurget, M., 214, 221 DeSouza, J. F. X., 31, 252, 265 DeSperati, C., 50 Dewey, J., 209 di Pellegrino, G., 226, 248 Diamond, A., 331 Diamond, R., 91 Diedrich, F. J., 334 DiFranco, D., 253 Dimberg, U., 97 Dittrich, W. H., 59, 134 Dodd, D. H., 207 Dokic, J., 48 Domini, F., 180 Donald, M., 372 Dorris, M. C., 219, 220, 221
Downing, P., 83, 84 Dretske, F., 210, 233 Duclos, S. E., 96, 97 Durgin, F. H., 20, 21
Frith, C. D., 133, 323 Fukusima, S. S., 12, 13 Funk, M., 51, 85, 127, 131, 132 Fuster, J. M., 226
E Eby, D. W., 7 Eckler, J., 403 Edwards, M. G., 326 Ellard, C. G., 20, 22 Elliott, D., 7, 8 Engel, A. K., 217 Enright, J. T., 148 Eppler, M. A., 292 Epstein, W., 4, 191, 221 Erlhagen, W., 231 Eskandar, E. N., 251 ESPNmag.com, 196 Ewert, J-P., 231
G Gail, A., 251 Gallagher, S., 84 Gallese, V., 46, 210, 233, 325 Galletti, C., 251, 268 Garciaguirre, J. S., 286, 303, 311, 312 Gaunet, F., 173 Gauthier, L., 91 Gazzaniga, M. S., 208, 230 Gentile, A. M., 312 Georgopoulos, A. P., 232 Gergely, G., 345, 358 Gernsbacher, M. A., 377 Gibson, E., 128, 275, 377 Gibson, E. J., 278, 290, 291, 307 Gibson, J. J., 45, 81, 179, 209, 212, 215, 223, 233, 260, 278, 288 Giese, M. A., 128, 130 Gilinsky, A. S., 4, 5, 22 Givon, T., 372, 398 Glenberg, A. M., 71, 82, 210 Glimcher, P. W., 213, 232 Gnadt, J. W., 31 Gogel, W. C., 3, 4, 5, 18 Gold, J. L., 213, 222, 227, 232 Goldman, A., 48 Goltz, H. C., 262 Gonzalez, C. L., 187 Goodale, M. A., 31, 197, 198, 217, 247 Gottlieb, J., 219, 220 Grafton, S. T., 252 Graziano, M. S. A., 220 Greeno, J., 402, 404 Greenwald, A. G., 46, 325 Grefkes, C., 247, 252 Grezes, J., 58, 70, 323, 326, 355 Grodzinsky, Y., 379, 391 Grön, G., 32 Grosjean, M., 52, 53 Grossman, E. D., 84, 102, 128, 133 Grush, R., 48
F Fadiga, L., 215, 326 Fagg, A. H., 231 Fantz, R. L., 83 Farrell, M. J., 19 Farroni, T., 340, 341 Fauconnier, G., 399 Fazendeiro, T., 100 Felleman, D. J., 216, 232 Ferraina, S., 219, 220 Ferrari, F., 339 Fetz, E. E., 210, 211 Field, T., 83 Fitts, P. M., 50, 52, 207 Flach, R., 51, 62, 68, 327 Flanders, M., 156, 174 Fleury, M., 70 Flores d’Arcais, J-P., 61 Fodor, J. A., 113 Fogassi, L., 46, 325 Foley, J. M., 4, 6, 7, 25, 29, 30, 265 Fontaine, R., 357 Fox, R., 86 Frak, V., 252 Franchak, J. M., 278, 279 Frankenburg, W. K., 281, 284, 285 Franklin, N., 173 Franks, S. L., 394 Franz, V. H., 31 Frazier, L., 386 Freedland, R. L., 283 Frey, S. H., 258
H Haffenden, A. M., 31 Haggard, P., 148 Halligan, P. W., 185, 248 Halverson, H. M., 253
Hamilton, A., 59, 69, 126 Hardcastle, V. G., 209, 233 Hari, R., 267 Harlow, H. F., 308 Harnad, S., 81, 233 Harris, C. L., 380, 381 Harris, P., 48 Harvey, I., 209 Harway, N. I., 34 Hasebe, H., 263 Haslinger, B., 54 Hasson, U., 133, 261 Hatfield, E., 94, 95 Haueisen, J., 54 Hauser, M. D., 226 Hausser, R., 386 Haviland, J. B., 396 Haxby, J. V., 102 He, Z. J., 7, 9, 30 Heide, W., 31 Hendriks-Jansen, H., 204, 209, 227, 231 Henriques, D. Y., 265 Heptulla-Chatterjee, S., 87, 122, 132 Hertenstein, M. J., 95 Heyes, C., 326 Hildreth, E., 116 Hinde, R. A., 218 Hiris, E., 120 Hobson, P. R., 98, 100 Hofer, T., 338, 358 Hofstadter, D. R., 207 Hommel, B., 46, 48, 210, 347 Hood, B. M., 340, 341 Horak, F. B., 226 Horwitz, G. D., 221 Hoshi, E., 226, 227 Houghton, G., 231 Howard, I. P., 3 Hubbard, T. L., 51 Hubel, D., 115 Humphreys, G. W., 223, 231 Hutchison, J. J., 5, 12, 32, 33 Huttenlocher, J., 394 I Iacoboni, M., 47, 124, 133, 325, 326 Iriki, A., 184, 198, 248, 261 Israël, 17 J Jackson, J. H., 209 Jackson, R. E., 195
Jacobs, A., 125, 126, 128, 134, 349 James, W., 45, 46, 81, 325 Janssen, P., 219, 220, 221 Jeannerod, M., 48, 58, 131, 252, 324, 370 Joh, A. S., 290, 306 Johansson, G., 59, 118, 119, 127, 130, 328, 348 Johnson, M. H., 83, 85 Johnson, M. L., 281 Johnson, P. B., 220 Johnson-Laird, P. N., 207, 387 Jordan, M. I., 328 Just, M. A., 391 Juurmaa, J., 17 K Kail, R., 279 Kakei, S., 370 Kalaska, J. F., 212, 213, 215, 217, 218, 219, 220, 221 Kalivas, P. W., 225 Kamio, A., 398 Kandel, S., 51, 67, 327 Karnath, H. O., 253 Kawashima, R., 252 Kawato, M., 328 Keele, S. W., 221 Keller, ??, 68 Kelly, J. W., 4, 5, 6 Kempen, G., 386 Kendrick, K. M., 81 Kermadi, I., 226, 229 Kertzman, C., 252 Kerzel, D., 51 Kieras, D., 45 Kilner, J. M., 327, 337 Kim, J-N., 213, 226 Kinsella-Shaw, J. M., 306 Kintsch, W., 374 Kiraly, I., 324 Klatzky, R. L., 14, 15, 145, 148, 149, 150, 151, 152, 159, 161, 172, 173, 371, 404 Knapp, J. M., 2, 4, 6, 7, 10, 12, 13, 23 Knoblich, G., 48, 51, 58, 60, 65, 66, 130, 132, 137, 327, 345, 372 Koch, C., 2, 183, 198 Kohler, E., 46, 56, 60 Kornblum, S., 231 Korte, A., 51, 120 Koski, L., 47, 326 Kozlowski, L. T., 134 Krebs, J. R., 181
Kudoh, N., 29 Kuhl, P., 83 Kurata, K., 223 Kusunoki, M., 219, 220, 232 L Lacerda, F., 324 Lacquaniti, ??, 50 Ladavas, E., 184, 248 Lakoff, G., 399 Lampl, M., 281, 282 Land, M. F., 180, 181 Landy, M. S., 180 Langdell, T., 98 Larsen, A., 175 Larsen, R. J., 96 Lave, J., 403 Law, I., 266 Lazarus, R. S., 94 Legerstee, M., 83 Lehar, S., 2 Leo, A. J., 297 Leung, E. H., 339, 347 Levin, C. A., 29 Lhermitte, F., 357 Liberman, A. M., 46 Linkenauger, S. A., 186, 198 Loarer, E., 14 Lockman, J. J., 296 Longo, M. R., 325, 326, 331, 332, 333, 336, 347, 358 Loomis, J. M., 2, 3, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 23, 26, 27, 28, 29, 30, 33, 34, 159, 371 Louis-Dam, A., 345 Loula, F., 60, 92, 129, 130, 349 Luo, Y., 324, 345 M MacDonald, M. C., 384 Macrae, C., 370 MacWhinney, B., 71, 373, 377, 384, 386, 401 Maguire, E. A., 32 Malach, R., 261 Maratos, O., 357 Maravita, A., 248 Marconi, B., 220 Marcovitch, S., 331, 334 Marey, E. J., 118 Mark, L. S., 306, 313 Marlinsky, V. V., 17
Marr, D. C., 3, 113, 207, 216, 324 Marslen-Wilson, W. D., 386 Martin, A., 370 Martin, R. A., 80 Mast, F. W., 179 Matelli, M., 217 Maturana, H. R., 209 Maurer, D., 85, 88 May, M., 172, 175 Mazzoni, P., 219 McClelland, J. L., 376 McCready, D., 4 McDonald, J. L., 393 McGoldrick, J. E., 88 McGraw, M. B., 357 McIntosh, D. N., 95, 96, 97 McNeill, D., 373 McPeek, R. M., 222 Mead, G. H., 209 Medendorp, W. P., 14, 31 Meltzoff, A. N., 86, 95, 131, 137, 356, 357, 370, 372 Merleau-Ponty, M., 45, 209 Merriam, E. P., 146 Messing, R., 7 Miall, R. C., 328 Middleton, F. A., 225, 229, 232, 370 Miller, E. K., 226 Miller, G. A., 207, 221, 374 Millikan, R. G., 209, 233 Milner, A. D., 30, 31, 189, 197, 198, 217, 232, 247 Milner, D. A., 187 Mink, J. W., 225, 226 Mitchell, D. C., 386 Mithen, S., 401 Mittelstaedt, M. L., 17 Mohler, B. J., 20, 21 Mondschein, E. R., 301 Montepare, J. M., 133 Moody, E. J., 95 Moore, C., 82, 100 Moore, K. L., 277, 280 Moran, J., 217, 218, 232 Morris, J. P., 133 Mountcastle, V. B., 219 Munakata, Y., 334, 337 N Nakagawa, A., 357 Nakamura, K., 220, 230 Neggers, S. F., 265
Nelson, K., 403 Neumann, O., 218, 231 Newell, A., 45, 205, 207, 208, 209, 224 Niedenthal, P. M., 96, 101 Noonan, K. J., 281 Núñez, R., 209 O O’Grady, W., 377, 386, 387, 404 O’Regan, J. K., 210 Ogden, J. A., 85 Ohira, H., 96 Ooi, T. L., 3, 5, 17, 18, 19, 20, 29, 30, 32, 33, 34 Oram, M. W., 128 Orliaguet, J-P., 345 Ounsted, M., 280 Oyama, T., 4 Ozonoff, S., 100 P Palmer, C. E., 280 Palmer, C. F., 290 Pani, J. R., 174 Panksepp, J., 93, 94 Paré, M., 220 Parsons, L. M., 370 Passingham, R. E., 217, 218 Paus, T., 266 Pavlov, I., 205 Pavlova, M., 329, 349 Pecher, D., 71, 375 Pegna, A. J., 185 Pellijeff, A., 257, 267 Pelphrey, K. A., 323, 372 Petit, L. S., 162 Philbeck, J. W., 2, 4, 6, 8, 9, 10, 12, 13, 16, 17, 18, 19, 20, 25, 31, 32 Piaget, J., 45, 81, 82, 204, 209, 227 Pinker, S., 207, 401 Pinto, J., 86, 120, 132, 349, 350, 355 Pisella, L., 214 Pitzalis, S., 250, 252, 257, 268 Plamondon, R., 52 Platt, M. L., 208, 213, 219, 220, 221, 222 Plumert, J. M., 314 Pollick, F. E., 59, 119, 129, 134 Portin, K., 256, 267 Posner, M. I., 340 Post, R. B., 31 Powell, K. D., 219 Powers, W. T., 209
Prado, J., 252, 256, 267 Prasad, S., 131, 132 Prechtl, H. F. R., 357 Premack, D., 358 Previc, F. H., 248 Price, E. H., 85 Prinz, W., 45, 46, 57, 60, 71, 124, 125, 126, 130 Proffitt, D. R., 1, 12, 32, 180, 183, 188, 189, 190, 191, 348 Puce, A., 136 Pylyshyn, Z., 113, 207, 208 Q Quinlan, D. J., 262 Quinn, P., 83 Quintana, J., 226 R Rader, N., 292 Rainer, G., 226 Ramachandran, V. S., 370 Ratner, N., 403 Redgrave, P., 225 Redish, A. D., 32 Reed, C. L., 84, 86, 87, 88, 89, 91, 95, 99, 132 Reed, E. S., 277 Reinhart, T., 379, 383 Repp, B. H., 62, 63, 64, 65, 68, 327 Reynolds, J. H., 208, 233 Richards, J. E., 292 Richardson, A. R., 20, 22, 23 Richter, H. O., 263 Riecke, B. E., 172 Riehle, A., 223 Riener, C. R., 196 Rieser, J. J., 7, 8, 14, 20, 21, 34, 159, 173, 175, 192 Riskind, J., 96 Rizzolatti, G., 46, 47, 58, 84, 124, 132, 220, 324, 325, 326, 339, 347, 356, 371 Ro, T., 83 Robinson, S. R., 277, 311, 312 Rocha, C. F. D., 182 Rogoff, B., 403 Romo, R., 221, 232 Rose, J. L., 347 Rossetti, Y., 31 Rowe, J. B., 226 Ruffman, T., 334 Runeson, S., 59, 70, 119, 129, 133 Russell, B., 2 Rutherford, M. D., 99
S Sahm, C. S., 2, 7, 10, 11, 19, 23, 32 Sawamoto, N., 226 Saxe, R., 84, 102, 323 Saygin, A. P., 355 Scaife, M., 339 Schall, J. D., 221, 222 Schmidt, R. A., 285, 312 Schofield, W. N., 335 Schultz, W., 226 Schwoebel, J., 84 Searle, J., 233 Sebanz, N., 55, 71 Sedgwick, H. A., 4, 180, 191 Sereno, M. I., 31, 262 Shadlen, M. N., 220 Shannon, C. E., 206 Shelton, A. L., 148, 154, 155, 173 Shen, L., 221 Shepard, R. N., 113 Shiffrar, M., 50, 60, 87, 116, 117, 118, 121, 122, 124, 127, 130, 137, 349 Sholl, M. J., 17 Sinai, M. J., 9 Singer, W., 208, 217, 233 Singhal, A., 253 Sirigu, A., 162 Skinner, B. F., 206 Slaughter, V., 83, 85, 87, 88 Smeets, J. B., 253 Smith, L. B., 331, 332 Smith, P. C., 7 Smythies, J. R., 2 Snyder, L. H., 208, 219, 220, 232, 251 Sommerville, J. A., 358 Speigle, J. M., 8, 9 Spelke, E., 401 Spivey, M., 48 Spradley, J., 403 Stanfield, R. A., 375 Stankowich, T., 182 Steenhuis, R. E., 7, 8 Stefanucci, J. K., 191, 195 Stein, D., 375 Stein, J. F., 208, 216, 219, 220 Stekelenberg, J. J., 89 Stengel, E., 357 Stepper, S., 96 Sterelny, K., 209 Stetten, G., 166, 167 Stevens, J. A., 51, 124, 125, 349 Stevenson, H. W., 308
Still, A., 209 Strack, F., 96 Sugrue, L. P., 219, 220 Sumi, S., 329, 349 Swendsen, R., 164 T Tai, Y. F., 327 Takikawa, Y., 226 Tanaka, J. W., 88, 91 Tanji, J., 226 Tenny, C., 383, 384, 397 Thelen, E., 82, 204, 209, 227, 277, 331, 354 Thompson, E., 209 Thompson, W. B., 2, 10, 12, 13, 21, 23 Thomson, J. A., 7, 8, 14, 18, 159 Thorndike, E., 205 Thornton, I. M., 49, 57, 59, 122 Tipper, S. P., 222, 223, 231 Titchener, E., 204 Titzer, R., 292 Toates, F., 231 Tolman, E. C., 206 Tomasello, M., 330, 356 Toye, R. C., 29 Tracy, J. L., 95 Treue, S., 217, 218, 232 Trevarthen, C., 82 Turati, C., 85 Turing, A. M., 206, 207 Tversky, B., 155 U Umiltà, M. A., 46 Ungerleider, L. G., 208, 216, 232 V Valenza, E., 85 Valyear, K. F., 187, 198 van Donkelaar, P., 265 van Hof, P., 335 Van Sommers, P., 60 Vanni, S., 267 Vereijken, B., 285 Verfaillie, K., 329 Virji-Babul, N., 123, 126 Viviani, P., 50, 125, 130, 329 Vogt, S., 326 Volkmar, F., 97 von der Malsburg, C., 208, 233 von Hofsten, C., 253, 324, 340 Vygotsky, L., 403
W Wagner, M., 29, 148 Walk, R. D., 291, 292 Wall, J., 192 Wallach, H., 116 Wapner, S., 335 Warren, W. H., 213, 278, 290, 306 Watson, J. B., 206 Weinstein, S., 85 Weiskrantz, L., 30 Weiss, P. H., 249 Welsh, T. N., 222 Wenger, K. K., 226 Went, F. W., 195 Wertheimer, M., 120 Wesp, R., 197 Whishaw, I. Q., 32 Wilbarger, J., 97 Wilson, H., 116, 124, 125 Wilson, M., 45, 47, 48, 49, 52, 54, 65, 71, 82, 83, 84 Winston, J. S., 133 Wise, S. P., 220
Witherington, D. C., 292 Witt, J. K., 1, 20, 32, 183, 184, 191, 193, 196, 198 Wolff, W., 59 Wolpert, D. M., 48, 214, 328 Wong, E., 31 Woodward, A. L., 324, 339, 345 Worsley, C. L., 32 Wraga, M. J., 173 Wray, R. E., 45 Wu, B., 3, 5, 7, 8, 19, 29, 30, 32, 167, 168 Wundt, W., 204 Y Yarbus, A., 180 Ydenberg, R. C., 182 Yin, R. K., 88 Z Zahorik, P., 8 Zelazo, P. D., 331 Zwaan, R. A., 397
Subject Index
Numbers in italic refer to figures or tables
A Action, 204 characterized, 45 cognitively mediated vs. perceptually guided, 165, 165–169, 166 distance perception, 6–10 intention, 55–56 Action coordination, action prediction, 67–69 Action identification, self-generated actions, 58–65 Action perception, 45–71 body sense, 69–70 continuous, graded representations, 48 expertise, 53–58 motor laws, 49–53 apparent body motion, 51–52 Fitts’ law, 52–53 two-thirds power law, 50–51 new skill, 56–58 spatial perception, 6–10 Action prediction action coordination, 67–69 handwriting, 66–67 own vs. other’s action, 65–66 Action representations, 46 Action selection, 212–213, 213 evolutionary elaborations, 215 parameters, 212–213, 213 Action simulation, 48, 326–326 Action specification, 212–213, 213 affordance competition hypothesis, 205, 218–230 cognitive ability, 227–247 decision making through distributed consensus, 224–227 simultaneous processing of potential actions, 221–224 fronto-parietal network, 205, 218–230 parameters, 212–213, 213
Action understanding deafferentation, 69–70 defined, 324 human movement structure perception, 328–329, 329 modes, 324–325 motor knowledge development perspective, 323–359 perseverative errors in searching for hidden objects, 331–338, 332, 333, 336, 338 point-light display perception of biological motion, 348–356, 349, 351, 352, 353, 354 visual orienting in response to deictic gestures, 339–348, 342, 343, 344, 346 prediction of effects of action, 327–328 covert imitation, 327–328 proprioception, 69–70 sensory neuronopathy, 69–70 touch, 69–70 Adaptation path integration, 20 perceptually directed action, 20–25, 22 distance-specific adaptation, 22–23 Affordance competition hypothesis, 211–234 action specification, 205, 218–230 cognitive ability, 227–247 decision making through distributed consensus, 224–227 simultaneous processing of potential actions, 221–224 cerebral cortex, 214–215, 215 fronto-parietal network, 205, 218–230 visual processing, 216–218 Affordances, 278–280 defined, 278 development, 280–288
Affordances (continued) body growth, 280–281, 282 environment, 287 motor proficiency, 281–287, 284, 285 new perception-action systems, 281–287, 284, 285 infant perception, 288 bridges, 303–306 bridges using handrail, 303–306, 304 bridges with wobbly and wooden handrails, 305, 305–306 cliffs, 290–293, 292 gaps, 293–297, 294, 298 perceptual problem, 288–290 slopes, 297–300, 299 slopes crawling vs. walking, 300–301 walking with weights, 302, 302–303 motor action, 278–280 thresholds, 278–279, 279 Alignment allocentric layer, 172–173, 174, 175 accessibility ordering, 173 bridging from actor’s representation to other coordinate systems, 169–170 frames of reference, 145–175 alignment load taxonomy, 171–175, 174 coordinate transformation, 157–158 imaginal walking, 170–171 parameter remapping, 157–158 right-hand rule, 171 ultrasound, 171 obliqueness, 173–174, 174, 175 Allocentric layer, 172 alignment, 172–173, 174, 175 accessibility ordering, 173 body-in-environment frame, 172 defined, 146 environmental frame of reference, 172 frames of reference, 172–173, 174, 175 accessibility ordering, 173 identification, 150 set of measures, 149 imagined frame, 172 Ambiguity, perspective tracking, 384–387 relativization, 382–384, 389–392 scope, 387–389 Aperture problem, 115, 115–118, 117 Attractor space, 48 Audition, blind walking, 8–9, 9 Auditory distance perception, 23–25, 24 Auditory stimulus, self-identification, 62–65
Autism, social perception, 97–100 atypical face and configural processing, 98–99 body perception, 99 difficulties in social adjustment, 98 mimicry, 98 perceiving emotion from movement, 100 rule-based approach of emotional perception, 99–100 social-emotional perception, 99–100 template-based processes, 99, 100 B Balance, infant, 281–282 Ball or bean bag throwing, 7 Behavior cognitive processes, 205, 212 motor processes, 205, 212 perceptual processes, 211–212 Behaviorism, 206 schematic functional architectures, 205 Belly crawling, 283, 284 Blind walking, 7–11, 8, 18 audition, 8–9, 9 effect of feedback, 21–22 full-cue conditions, 9–10, 10 recalibration, 21–22 reduced-cue conditions, 9–10, 10 visually-directed, 9–10, 10 visual perception, gain, 21 Body growth, 280–281, 282 fetus, 280 infant, 280–281 Body image matching, projection, 370–371 Body-in-environment frame, allocentric layer, 172 Body inversion paradigm, 88–89, 89, 90 Body movement, see also Specific type anatomically plausible movement path, 51 multimodal body schema, 51–52 Body perception configural processing continuum, 88–89 creating self-other correspondences, 83–93 multimodal body schema, 51–52 spatial perception, 181–183 action-specific perception, 181 behavioral ecology, 180–182 specialized body processing, 87–90 sources, 90–92 specialized body representations, 84–86 using one’s own body to organize information from others, 86–87
Body posture, 80–81 feelings, 96–97 importance, 81 understanding others, 94–97 Body processing, specialized, 87–90 sources, 90–92 Body representations, specialized, 84–86 Body sense, action perception, 69–70 Brain, 205–206, see also Specific part macaque monkey brain, 247–249 peripersonal space, 248 Bridges, 303–306 C Central executive, 224 Cerebral cortex, 208 affordance competition hypothesis, 214–215, 215 Clapping, identification of one’s own, 62–63 Cliffs, 290–293, 292 Cognition embodiment, relationship, 203–204 general architecture, 45 schematic functional architectures, 205 Cognitive development, language, 403 Cognitive neuroscience, 208 Cognitive psychology, 208, 209 Color vision, 3 Common coding theory of perception and actions, 46 drawing, 60–62 embodiment, 47 handwriting, 60–62 motor skill acquisition, 58 representation, 47–48 vs. radical interactionism, 47 Computation, 206 Computer metaphor, 205, 207 Concept-measurement relationship, perception, 3–4 Converging measures, distance perception, 32–33 Coreference, 379–380 Cortical visual processing, two visual streams framework, 30–32 Crawling, infants, 283–285, 284, 285 Cue integration, 180 current models, 180 Domini et al.’s model, 180 perceiver’s goals, 180–181 perceiver’s intent, 180–181 weighted averaging models, 180
D Dancers, expertise in action perception, 54–55, 57–58 Deafferentation, action understanding, 69–70 Deictic gestures, visual orienting in response to deictic gestures, 339–348, 342, 343, 344, 346 Delaying commitment, 380–382 Descriptive representations, 210–211 Development affordances, 280–288 body growth, 280–281, 282 environment, 287 motor proficiency, 281–287, 284, 285 new perception-action systems, 281–287, 284, 285 after infancy, 312–314 evidence for observation–execution matching system, 330–356 motor knowledge and action understanding, 323–359 Direct matching system, 325, 327 developmental evidence, 330–356 Distance perception, 1–33 action, 6–10 converging measures, 32–33 indirect methods, 4 intention to act, 1 judgments of collinearity, 5 judgments of perceived exocentric extent, 5 meaning of, 1 methods for measuring, 4–12 perceived exocentric distance, distortions, 29–30 perceived shape, distortions, 29–30 perception of exocentric direction, 5 percept-percept couplings, 4 perceptually directed action model, 12–19 calibration role, 12–13, 19–25 processing stages, 13–15, 14–15 reasons for error, 1 scale construction, 5 spatial updating, 1, 2, 6–10 accuracy, 1–2 triangulation methods, 10–12, 11, 13 two visual systems, action-specific representations, 30–32 verbal report, 4, 5 Dorsal processing, spatial perception, 188, 189
Dorsal stream, 247–248 Drawing common coding theory of perception and actions, 60–62 mirror systems, 60–62 point-light displays, 60–62 Dualism, 204 schematic functional architectures, 205 Dynamical theory, 209 E Effector-specific areas, 248–249 Egocentric distance, verbal reports, spatial updating to correct for bias, 25–29, 27 Egocentric frames identification, 150 set of measures, 149 Embedded action fetus, 276–277 infant perception, 276–277 Embodied action fetus, 276–277 infant perception, 276–277 Embodied cognition, 203–204, 209 action context, 82 action goal, 82 sensorimotor roots, 81–82 social perception, added complexities, 81–83 in social world, 81 Embodied framework for behavior, 210–215 Embodied linguistic perspectives, mental model encoding, 369–404 Embodied motion perception, visual sensitivity, 113–137 Embodiment, 91 characterized, 45–46 common coding theory of perception and actions, 47 defined, 203 meaning, 203–204 mirror systems, 47 theoretical assumptions, 45–46 Emmert’s Law, size-distance invariance, 5 Emotional body perception, 93–97 Emotional contagion, 95–96 Empathy, 397 projection, 372 Environmental frame of reference, allocentric layer, 172 Evidentiality, 397–399
Expertise, 91 action perception, 53–58 Extinction, peripersonal space, 248 F Face, 88 relative location of features, 85 Face inversion effect, 88–89, 90 Facial expression, 80 effect of facial movement on affective experience, 96 induced corresponding emotional states, 96 understanding others, 94–97 Falling, spatial perception, 194–195 cost of injury, 195 slant perception, 195 vertical distance overestimation, 195–196 Fetus body growth, 280 embedded action, 276–277 embodied action, 276–277 Fitts’ law, 52–53 Frames of reference, see also Specific type alignment, 145–175, 157–158 alignment load taxonomy, 171–175, 174 coordinate transformation, 157–158 imaginal walking, 170–171 parameter remapping, 157–158 right-hand rule, 171 ultrasound, 171 allocentric frames, 148–151, 149 identification, 150 set of measures, 149 allocentric layer, 172–173, 174, 175 accessibility ordering, 173 coordination across body-defined, 146 defined, 146 egocentric frames, 148–151, 149 identification, 150 set of measures, 149 mechanisms, intermediate levels of description, 169–170 methods to identify parameters, 151 errors spatially biased by frame, 155–156 parameter values easiest to report, 154–155, 155 what is reported?, 151–154, 152, 153 multiple, 145–175
changes in egocentric parameters under imagined locomotion, 158–161, 160, 161 cognitively mediated action vs. perceptually guided action, 165, 165–169, 166 coordinating physical frames of reference, 169 current positional cues, 159 embodied actor in, 158–169 imagined updating, 159–161, 160, 161 ongoing movement cues, 159 right-hand rule, 162–164, 163 spatial thinking through action, 161–164 obliqueness, 173–174, 174, 175 parameters, 147 processes, 147–148 reference-frame parameterization studies implications, 156–157 task demands, 145 Fronto-parietal network action specification, 205, 218–230 affordance competition hypothesis, 205, 218–230 Full-cue conditions, blind walking, 9–10, 10 Functional magnetic resonance imaging, 247–268 superior parieto-occipital cortex, 247–268 activation from arm transport during reaching actions, 249–258, 250, 254, 256–257 in humans, 247–268 preference for near gaze, 262–266, 264 preference for objects within arm’s length, 259–262, 260 G Gait, 56–57 infant perception, 350–354, 351, 352, 353, 354 point-light displays, 59–60 self-generated actions, 59–60 self-recognition, individualistic styles, 60 symmetrical patterning, 350–354, 351, 352, 353, 354 Gaps, 293–297, 294, 298 Grammar perspective hypothesis coreference, 379–380
delaying commitment, 380–382 noncentrality, 380 structure vs. function, 378–384 perspective taking clitic assimilation, 392 empathy, 397 evidentiality, 397–399 implicit causality, 393–395 metaphor, 399 perspectival overlays, 395–399 space, 395–396 time, 396–397 perspective tracking, 373 maintenance, 378 modification, 378 shift, 378 Grasping neurophysiology, 187–188 spatial perception, 186–188 left-handed people, 187 right-handed people, 186–187 H Handwriting action prediction, 66–67 common coding theory of perception and actions, 60–62 mirror systems, 60–62 own vs. other’s recognition, 66–67 point-light displays, 60–62 trajectory, 51 velocity, 61–62 Heading, 147 Hidden objects, perseverative errors in searching for, 331–338, 332, 333, 336, 338 Hitting, spatial perception, 196–197 Human motion object motion comparing perception, 114–123 visual system in, compared, 114–137 visual analysis, 113–137 visual system action perception vs. object perception, 121–137 aperture problem, 115, 115–118, 117 bodily form, 131–133 controlling for viewpoint-dependent visual experience, 130–131 level of analysis, 116–117, 117 local measurements inherently ambiguous, 116
Human motion (continued) motion integration across space, 115–123 motion integration across time, 120–123, 123 motor experience vs. visual experience, 128–133 motor expertise, 124–127 multiple apertures, 118 perceptual sensitivity to emotional actions, 136–137 point-light displays, 118–120, 119 social context and apparent human motion, 134–136, 135 social-emotional processes, 133–137 visual expertise, 127–128 I Ideomotor principle, voluntary action, 46 Imaginal walking, 170–171 Imagined frame, allocentric layer, 172 Infant perception, 275–315 affordances, 288 bridges, 303–306 bridges using handrail, 303–306, 304 bridges with wobbly and wooden handrails, 305, 305–306 cliffs, 290–293, 292 gaps, 293–297, 294, 298 perceptual problem, 288–290 slopes, 297–300, 299 slopes crawling vs. walking, 300–301 walking with weights, 302, 302–303 embedded action, 276–277 embodied action, 276–277 everyday experience, 311–312 gait, 350–354, 351, 352, 353, 354 learning in development, 306–314 learning sets, 308–310 perception of point-light displays of biological motion, 348–356, 349, 351, 352, 353, 354 perseverative errors in searching for hidden objects, 331–338, 332, 333, 336, 338 visual orienting in response to deictic gestures, 339–348, 342, 343, 344, 346 Infants balance, 281–282 body growth, 280–281 crawling, 283–285, 284, 285
locomotion, 283–287 everyday experience, 311–312 learning in development, 306–314 learning sets, 308–310 perceptually guided action, 275–315 perception-action studies development, 276 learning, 276 perception-action systems, 286–287 spatial body representation, 85–86 supramodal body scheme, 86 walking, 285–286 Information, 206–207 Information processing system, schematic functional architectures, 205 Intention, 323 action, 55–56 distance perception, 1 self-other relationships, 82 Inter-location relations, parameters, 147 Inversion effects, 92 J Joint attention, visual orienting in response to deictic gestures, 339–348, 342, 343, 344, 346 L Language, 206, 373–404 cognitive development, 403 integration, 400–403 mental model encoding, 369–404 production, 400–403 Lateral intraparietal area, 219–220 Left-handed people, 187 Listening, 54 Localization, projection, 371 Locomotion, infants, 283–287 everyday experience, 311–312 learning in development, 306–314 learning sets, 308–310 Luminous figures, judgments of shapes, 18 M Machine theory, 206 Magnetoencephalography, superior parieto-occipital cortex, 256–257 Metaphor, 399 Mimicry, 95–96, 98 Mirror neurons, 325–326 Mirror systems, 46 drawing, 60–62
embodiment, 47 handwriting, 60–62 motor skill acquisition, 58 parietal cortex, 46–47 premotor cortex, 47 Motor action affordances, 278–280 perceptual guidance, 275–276 Motor knowledge, action understanding development perspective, 323–359 perception of point-light displays of biological motion, 348–356, 349, 351, 352, 353, 354 perseverative errors in searching for hidden objects, 331–338, 332, 333, 336, 338 visual orienting in response to deictic gestures, 339–348, 342, 343, 344, 346 Motor laws, action perception, 49–53 apparent body motion, 51–52 Fitts’ law, 52–53 two-thirds power law, 50–51 Motor skill acquisition, 57–58 common coding theory of perception and actions, 58 mirror systems, 58 Multimodal spatial body representation, 86–87 Musicians, self-identification, 64–65 N Naïve realism, 2 Near space, characterized, 184 Neural data, pragmatic perspective interpretation, 216 Neuroimaging studies, peripersonal space, 247–268 Neurophysiology, 209 grasping, 187–188 New skill, action perception, 56–58 Noncentrality, 380 O Object, defined, 147 Object motion human motion comparing perception, 114–123 visual system in, compared, 114–137 visual system action perception vs. object perception, 121–137 aperture problem, 115, 115–118, 117
bodily form, 131–133 controlling for viewpoint-dependent visual experience, 130–131 level of analysis, 116–117, 117 local measurements inherently ambiguous, 116 motion integration across space, 115–123 motion integration across time, 120–123, 123 motor experience vs. visual experience, 128–133 motor expertise, 124–127 multiple apertures, 118 perceptual sensitivity to emotional actions, 136–137 point-light displays, 118–120, 119 social context and apparent human motion, 134–136, 135 social-emotional processes, 133–137 visual expertise, 127–128 Obliqueness alignment, 173–174, 174, 175 defined, 146 frames of reference, 173–174, 174, 175 Observation-execution matching system, 325 developmental evidence, 330–356 Occipito-parietal cortex, 247–268 Open-loop behavior, 7 P Parameters frames of reference, 147 inter-location relations, 147 types, 147 Parietal cortex, 209 mirror systems, 46–47 Path integration, adaptation, 20 Perceived egocentric distance, measurement, 1–33 Perceived exocentric distance, distance perception, distortions, 29–30 Perceived shape, distance perception, distortions, 29–30 Perception, 204, see also Specific type action continuous, graded representations, 48 expertise, 53–58 characterized, 45 concept-measurement relationship, 3–4 representational nature, 2–3
Perception-action studies, infant perception development, 276 learning, 276 Perception-action systems, infants, 286–287 Perception of others, inherently social, 79 Percept-percept couplings, distance perception, 4 Perceptually directed action, 7, 12–19 adaptation, 20–25, 22 distance-specific adaptation, 22–23 angular declination, 17, 20 calibration, 12–13, 19–25 infant locomotion, 275–315 on-line modifications, 18–19 spatial updating, 19 systematic error, 19–20 Perceptual representations, 46 Peripersonal space, 248 brain, 248 characterized, 184 extinction, 248 neuroimaging studies, 247–268 spatial encoding, 265–268 Personal space, characterized, 184 Perspective hypothesis, 373–376, 403 claims, 373–374 grammar coreference, 379–380 delaying commitment, 380–382 noncentrality, 380 structure vs. function, 378–384 Perspective taking, grammar clitic assimilation, 392 empathy, 397 evidentiality, 397–399 implicit causality, 393–395 metaphor, 399 perspectival overlays, 395–399 space, 395–396 time, 396–397 Perspective tracking, 376 ambiguity, 384–387 relativization, 382–384, 389–392 scope, 387–389 grammar, 373 maintenance, 378 modification, 378 shift, 378 projection, 372–373 sentence comprehension
competition, 377 cues, 377 incremental interpretation, 376 load reduction, 376–377 principles, 376–377 role slots, 377 starting points, 377 Phenomenal world, 2 Philosophy, psychology, relationship, 204 Pitch, 147 Point-light displays, 118–120, 119 drawing, 60–62 gait, 59–60 handwriting, 60–62 Positron emission tomography, superior parieto-occipital cortex, 256–257 Posterior parietal cortex, 218–220 Postperceptual processes, 12 Pragmatic representations, 210–211 Premotor cortex, mirror systems, 47 Projection, 370–373 body image matching, 370–371 empathy, 372 localization, 371 perspective tracking, 372–373 Proprioception, action understanding, 69–70 Psyche, 204 Psychology history of, 204–210 neglected body, 204 philosophy, relationship, 204 Purpose, spatial perception, 183 Putting, spatial perception, 196–197 R Radical interactionism, 45–46 vs. common coding theory of perception and actions, 47 Rats in maze, 206 Reaching spatial perception, 184–186 extending reach with tool, 184 tool effect, 184, 185 superior parieto-occipital cortex, 249–258, 250, 254, 256–257 preferential response to objects within reachable space, 259–262, 260 Recalibration blind walking, 21–22 general form, 23 verbal reports, 21–22
Subject Index
Reduced-cue conditions, blind walking, 9–10, 10 Relativization, 382–384, 389–392 Representation, common coding theory of perception and actions, 47–48 Representative realism, 2 Right-hand rule, 162–164, 163, 171 Roll, 147 S Scale, virtual reality systems, uniform scale compression, 2 Scale construction, distance perception, 5 Searching, perseverative errors in searching for hidden objects, 331–338, 332, 333, 336, 338 Self-generated actions action identification, 58–65 gait, self-recognition, 59–60 Self-identification, auditory stimulus, 62–65 Self-other mapping, 83–93 Self-other relationships, intentionality, 82 Self-recognition, gait, individualistic styles, 60 Sensory neuronopathy, action understanding, 69–70 Sentence comprehension, perspective tracking competition, 377 cues, 377 incremental interpretation, 376 load reduction, 376–377 principles, 376–377 role slots, 377 starting points, 377 Shared representation, 355 Simulation, 48 Situated cognition, 46 Situated robotics, 209 Size-distance invariance, 4–5 defined, 4–5 Emmert’s Law, 5 Slant perception, 189–191, 195 Slopes, 289–290, 297–380, 299 Social-emotional processes, 133–137 Social perception, 79–102 autism, 97–100 atypical face and configural processing, 98–99 body perception, 99 difficulties in social adjustment, 98
mimicry, 98 perceiving emotion from movement, 100 rule-based approach of emotional perception, 99–100 social-emotional perception, 99–100 template-based processes, 99, 100 body-specific representations and processes, 81 creating self-other correspondences, 83–93 deficits in understanding others, 97–100 embodied cognition, added complexities, 81–83 importance, 80–81 specialized body processing, 87–90 specialized body processing sources, 90–92 specialized body representations, 84–86 using one’s own body to organize information from others, 86–87 Spatial image nonperceptual input, 15 path integration, 16 updating, 16, 16 Spatial perception action, 6–10 action-specific approach, 183–199 body, 181–183 action-specific perception, 181 behavioral ecology, 180–182 characterized, 179 dorsal processing, 188, 189 falling, 194–195 cost of injury, 195 slant perception, 195 vertical distance overestimation, 195–196 grasping, 186–188 left-handed people, 187 right-handed people, 186–187 hitting, 196–197 purpose, 183 putting, 196–197 putting together what, where, and how, 197–199 reaching, 184–186 extending reach with tool, 184 tool effect, 184, 185 spatial updating, 6–10 throwing, 193–194 anticipated effort, 194
Spatial perception (continued) effort associated with, 193 intention of throwing, 193 ventral processing, 188, 189 visually specified environment, 179–181 walking, 188–193 behavioral ecology, 192–193 perceiving distances, 192 psychophysical response compression, 190 slant perception, 189–191 surface layout of ground, 188–189 Spatial thinking, 170 Spatial updating distance perception, 1, 2, 6–10 accuracy, 1–2 perceptually directed action, 19 spatial perception, 6–10 Split treadmill, 354–355 Structuralism, 204 Substance dualism, 204 Superior parieto-occipital cortex activation foci, 266–268, 267 functional magnetic resonance imaging, 247–268 activation from arm transport during reaching actions, 249–258, 250, 254, 256–257 in humans, 247–268 preference for near gaze, 262–266, 264 preference for objects within arm’s length, 259–262, 260 magnetoencephalography, 256–257 positron emission tomography, 256–257 reaching, 249–258, 250, 254, 256–257 preferential response to objects within reachable space, 259–262, 260 visual perception, preference for near gaze, 262–266, 264 T Tasks, frames of reference, 145 own intrinsic frame of representation, 146 Teleoperator systems, 3 Throwing, 7 spatial perception, 193–194 anticipated effort, 194 effort associated with, 193 intention of throwing, 193 Timing information, 62–64 Tools, 186 Touch, action understanding, 69–70
Trajectory, velocity, 50–51 Triad judgments, 154–155, 155 Triangulation methods, distance perception, 10–12, 11, 13 Two-thirds power law, 50–51 Two visual systems debate about, 1 distance perception, action-specific representations, 30–32 U Ultrasound, 171 Understanding others body posture, 94–97 facial expression, 94–97 V Velocity handwriting, 61–62 trajectory, 50–51 Ventral processing, spatial perception, 188, 189 Ventral visual pathway, 324–325 Verbal reports distance perception, 4, 5 effect of feedback, 21–22 egocentric distance, spatial updating to correct for bias, 25–29, 27 recalibration, 21–22 systematic underreporting bias, 26–29, 27, 28 Vertical distance overestimation, 195–196 Virtual reality, 172–173 uniform scale compression, 2 Visual attention, action-specific, 181 Visual distance perception, 23–25, 24 Visual expertise, 91 Visually directed pointing, 7 Visually specified environment, spatial perception, 179–181 Visual perception, see also Visual system blind walking, gain, 21 learning new motor task, 56–57 superior parieto-occipital cortex, preference for near gaze, 262–266, 264 Visual processing, 216–218 affordance competition hypothesis, 216–218 Visual space perception, 3 Visual stimulus, above-ground plane, 17, 17–18
Visual system, 113, see also Visual perception human motion action perception vs. object perception, 121–137 aperture problem, 115, 115–118, 117 bodily form, 131–133 controlling for viewpoint-dependent visual experience, 130–131 level of analysis, 116–117, 117 local measurements inherently ambiguous, 116 motion integration across space, 115–123 motion integration across time, 120–123, 123 motor experience vs. visual experience, 128–133 motor expertise, 124–127 multiple apertures, 118 perceptual sensitivity to emotional actions, 136–137 point-light displays, 118–120, 119 social context and apparent human motion, 134–136, 135 social-emotional processes, 133–137 visual expertise, 127–128 modular understanding, 113 object motion action perception vs. object perception, 121–137 aperture problem, 115, 115–118, 117 bodily form, 131–133 controlling for viewpoint-dependent visual experience, 130–131
level of analysis, 116–117, 117 local measurements inherently ambiguous, 116 motion integration across space, 115–123 motion integration across time, 120–123, 123 motor experience vs. visual experience, 128–133 motor expertise, 124–127 multiple apertures, 118 perceptual sensitivity to emotional actions, 136–137 point-light displays, 118–120, 119 social context and apparent human motion, 134–136, 135 social-emotional processes, 133–137 visual expertise, 127–128 Visual virtual reality, perceptual errors, 10, 11 Voluntary action, ideomotor principle, 46 W Walking infants, 285–286 spatial perception, 188–193 behavioral ecology, 192–193 perceiving distances, 192 psychophysical response compression, 190 slant perception, 189–191 surface layout of ground, 188–189 Y Yaw, 147