Trends in Cognitive Sciences, February 2011, Vol. 15, No. 2, pp. 47–94. Editor: Stavroula Kousta
Review
Sounds and scents in (social) action
Salvatore M. Aglioti¹,² and Mariella Pazzaglia¹,²
¹ Dipartimento di Psicologia, Sapienza University of Rome, Via dei Marsi 78, Rome I-00185, Italy
² IRCCS Fondazione Santa Lucia, Via Ardeatina 306, Rome I-00179, Italy
Although vision seems to predominate in triggering the simulation of the behaviour and mental states of others, the social perception of actions might rely on auditory and olfactory information not only when vision is lacking (e.g. in congenitally blind individuals), but also in daily life (e.g. hearing footsteps along a dark street prompts an appropriate fight-or-flight reaction, and smelling the scent of coffee prompts the act of grasping a mug). Here, we review recent evidence showing that non-visual, telereceptor-mediated motor mapping might occur as an autonomous process, as well as within the context of the multimodal perceptions and representations that characterize real-world experiences. Moreover, we discuss the role of auditory and olfactory resonance in anticipating the actions of others and, therefore, in shaping social interactions.

Telereceptive senses, namely vision, audition and olfaction
Perceiving and interacting with the world and with other individuals might appear to be guided largely by vision, which, according to classical views, leads over audition, olfaction and touch, and commands, at least in human and non-human primates, most types of cross-modal and perceptuo-motor interactions [1]. However, in sundry daily life circumstances, our experience of the world is inherently cross-modal [2]. Inputs from all sensory channels combine, for example, to increase the efficiency of our actions and reactions. Seeing flames, smelling smoke or hearing a fire alarm might each be sufficient to create an awareness of a fire; the combination of all these signals, however, ensures that our response to danger is more effective. The multimodal processing of visual, acoustic and olfactory information is even more important for our social perception of the actions of other individuals [3].
Indeed, vision, audition and olfaction are the telereceptive senses that process information coming from both the near and the distant external environment, on the basis of which the brain defines the self–other border and the surrounding social world [4,5]. Behavioural studies suggest that action observation and execution are coded according to a common representational medium [6]. Moreover, neural studies indicate that seeing actions activates a fronto-parietal neural network that is also active when performing those same actions [7,8]. Thus, the notion that one understands the actions of others by simulating them motorically is based mainly on visual studies (Box 1). Vision is also the channel used for studying the social nature of somatic experiences (e.g. touch and pain) [9–11] and emotions (e.g. anger, disgust and happiness) [12].

Corresponding author: Aglioti, S.M. ([email protected]).

In spite of the notion that seeing might be informed by what one hears or smells, less is known about the possible mapping of actions through the sounds and odours associated with them, either in the absence of vision or within the context of clear cross-modal perception. In this review, we question the exclusive supremacy of vision in action mapping, not to promote a democracy of the senses, but to highlight the crucial role of the other two telereceptive channels in modulating our actions and our understanding of the world in general, and of the social world in particular.

The sound and flavour of actions
Classic cross-modal illusions, such as ventriloquism or the McGurk effect, indicate that vision is a key sense in several circumstances [13,14]. Accordingly, when multisensory cues are simultaneously available, humans display a robust tendency to rely more on visual than on other forms of sensory information, particularly when dealing with spatial tasks (a phenomenon referred to as the 'Colavita visual dominance effect') [15]. However, our knowledge is sometimes dominated by sound and filtered through a predominantly auditory context. Auditory stimuli might, for example, capture visual stimuli in temporal localization tasks [16]. Moreover, the presentation of two beeps and a single flash induces the perception of two visual stimuli [17]. Thus, sound-induced flash illusions create the mistaken belief that we are seeing what we are, in fact, only hearing. This pattern of results is in keeping with the notion that multisensory processing reflects 'modality appropriateness' rules, whereby vision dominates in spatial tasks and audition in temporal ones [18]. However, psychophysical studies indicate that the degradation of visual inputs enables auditory inputs to modulate spatial localization [19].
This result is in keeping with the principle of inverse effectiveness [20], according to which multisensory integration is more probable, or stronger, for unisensory stimuli that evoke relatively weak responses when presented in isolation. Notably, recordings of neural activity from the auditory cortex of alert monkeys watching naturalistic audiovisual stimuli indicate not only that congruent bimodal events provide more information than do unimodal ones, but also that suppressed responses are less variable and, thus, more informative than are enhanced responses [21]. Relevant to the present review is that action sounds might be crucial for signalling socially dangerous or unpleasant events. Efficient mechanisms for matching audition with action might be important, even at basic levels, because they might ensure the survival of all hearing individuals. For example, in the dark of primordial nights, ancestral humans probably detected potential dangers (e.g. the footsteps of enemies) mainly by audition and, therefore, implemented effective fight-or-flight behaviour. However, action–sound-mediated inferences about others might also occur in several daily life circumstances in present times. Imagine, for example, your reaction to the approach of heavy footsteps when you are walking along a dark street. Furthermore, listening to the footsteps of known individuals might enable one not only to recognize their identity [22], but also to determine their disposition (e.g. a bad mood). Although olfaction in some mammals mediates sophisticated social functions, such as mating, and might facilitate the recognition of 'who is who' [23], this sense is considered somewhat rudimentary in humans. However, even in humans, olfaction is closely related not only to neurovegetative and emotional reactivity, but also to higher-order functions, such as memory. Moreover, olfaction in humans is also linked to empathic reactivity [24], kin recognition [25], cross-modal processing of the faces of others and the construction of the semantic representation of objects [26]. Behavioural studies indicate that the grasping of small (e.g. an almond) or large (e.g. an apple) objects with characteristic odours is influenced by the delivery of the same or of different smells. In particular, a clear interference with the kinematics of grasping [27] and reaching [28] movements was found in conditions of mismatch between the observed objects (e.g. a strawberry) and the odour delivered during the task (e.g. the scent of an orange).

Box 1. Beyond the visuomotor mirror system
Mirror neurons (MNs), originally discovered in the monkey ventral premotor cortex (F5) and inferior parietal lobe (PFG), increase their activity during action execution as well as during viewing of the same action [68,69]. Single-cell recordings from the ventral premotor cortex showed that MNs also fired when the sight of the hand–object interaction was temporarily occluded [70]. Similarly, activity in the parietal MNs of an onlooking monkey was modulated differentially when the model exhibited different intentions (e.g. grasping the same object to eat it or to place it) [71]. Taken together, these results suggest that MNs represent observed actions according to anticipatory codes. Relevant to the present review is the existence of audio-motor MNs that are specifically activated when the monkey hears the sound of a motor act without seeing or feeling it [72]. In addition, the multimodal response of visuo-audio-motor neurons might be superadditive; that is, stronger than the sum of the unimodal responses [73]. Whereas audio MNs might underpin an independent and selective mapping modality [72], triple-duty neurons are likely to constitute the neural substrate of the complex multimodal mapping of actions [73]. Therefore, the physiological properties of these resonant neurons suggest that they constitute a core mechanism for representing the actions of others. In a recent study, single-cell recordings were conducted in human patients who observed and performed emotional and non-emotional actions. The study provides direct evidence of double-duty visuo-motor neurons, possibly coding for resonant emotion and action [74]. Importantly, the human 'see-do' neurons were found in the medial frontal and temporal cortices (where the patients, for therapeutic reasons, had electrodes implanted). These two regions are not part of the classic mirror system, suggesting that onlooker–model resonance extends beyond action mirroring and the premotor-parietal network. Direct information relating to the non-visual and anticipatory properties of the human mirror system is, however, still lacking.

1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.12.003
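The superadditivity criterion invoked for visuo-audio-motor neurons (a multimodal response stronger than the sum of the unimodal responses) can be made concrete with a minimal numerical sketch. The firing-rate values and the index below are illustrative assumptions, not data or analysis code from the studies cited:

```python
# Illustrative sketch of the superadditivity criterion for multisensory
# neurons. All numbers are hypothetical, not data from the cited studies.

def superadditivity_index(audio: float, visual: float, bimodal: float) -> float:
    """Return (bimodal - (audio + visual)) / (audio + visual).

    A positive value means the bimodal response exceeds the sum of the
    unimodal responses (superadditive); a negative value means it falls
    short (subadditive).
    """
    unimodal_sum = audio + visual
    return (bimodal - unimodal_sum) / unimodal_sum

# Congruent seen + heard action: bimodal response exceeds the unimodal sum.
congruent = superadditivity_index(audio=10.0, visual=12.0, bimodal=30.0)

# Incongruent pairing: bimodal response falls below the unimodal sum.
incongruent = superadditivity_index(audio=10.0, visual=12.0, bimodal=18.0)

print(congruent)    # positive: superadditive
print(incongruent)  # negative: subadditive
```

Any monotonic contrast of bimodal versus summed unimodal responses would serve equally well; this normalized form is simply one common way to express the comparison.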
Mapping sound- and odour-related actions in the human brain
Inspired by single-cell recordings in monkeys [7], many neuroimaging and neurophysiological studies suggest that the adult human brain is equipped with neural systems and mechanisms that represent the visual perception and the execution of actions in common formats. Moreover, studies indicate that a large network, centred on the inferior frontal gyrus (IFG) and the inferior parietal lobe (IPL), and referred to as the action observation network (AON) [29,30], underpins both action viewing and action execution. Less information is available about whether the AON [31] is also activated by the auditory and olfactory coding of actions. The phenomena, mechanisms and neural structures involved in processing action-related sounds have been explored in healthy subjects (Figure 1) and in brain-damaged individuals (Box 2) using correlational [32–36] and causative [37] approaches. At least two important conclusions can be drawn from these studies. The first is that listening to the sounds produced by human body parts (e.g. two hands clapping) activates the fronto-parietal AON. The second is that such activation might be somatotopically organized, with the left dorsal premotor cortex and the IPL being more responsive to the execution and hearing of hand movements than to mouth actions or to sounds that are not associated with human actions (e.g. environmental sounds, a phase-scrambled version of the same sound, or a silent event). Conversely, the more ventral regions of the left premotor cortex are more involved in processing sounds produced by the mouth (Figure 1 and Box 2). The social importance of olfaction in humans has been demonstrated in a positron emission tomography (PET) study [38] showing that body odours activate a set of cortical regions that differed from those activated by non-body odours. In addition, smelling the body odour of a friend activates different neural regions (e.g. the extrastriate body area (EBA)) from smelling the odour of strangers (e.g. amygdala and insula). However, interest in the olfactory coding of actions and its neural underpinnings is very recent, and only two correlational studies have addressed this topic thus far (Figure 2). In particular, the mere smelling of food objects induced both a specific facilitation of the corticospinal system [39] and specific neural activity in the AON [40].

Multimodal coding of actions evoked by auditory and olfactory cues
The inherently cross-modal nature of action perception is supported by evidence showing that a combination of multiple sensory channels might enable individuals to interpret actions better. The merging of visual and auditory information, for example, enables individuals to optimize their perceptual and motor behaviour [41]. Moreover, a combination of olfactory and visual inputs facilitates the selection of goal-directed movements [42]. Importantly, although auditory or olfactory cues might each increase neural activity in action-related brain regions, such effects might be higher when the two modalities are combined. It has been demonstrated, for example, that the blood
oxygenation level-dependent (BOLD) signal in the left ventral premotor cortex is enhanced when seeing and hearing another individual tearing paper, as compared with viewing a silent video depicting the same scene or only hearing the sound associated with the observed action [43]. No such dissociation was found for the parietal regions, indicating that cross-modal modulation might impact differentially on the different nodes of the AON. Similarly, corticospinal motor activity in response to the acoustic presentation of the sound of a hand crushing a small bottle was lower than to the presentation of congruent visuo-acoustic input (e.g. the same sound and the corresponding visual scene), and higher than to incongruent visuo-acoustic information (e.g. the same sound and a hand pouring water from the same bottle) [35] (Figure 1c). This pattern of results hints at a genuine, cross-modal modulation of audio-motor resonance [31].

Figure 1. The sound of actions. Representative studies on the auditory mapping of actions, performed using different state-of-the-art cognitive neuroscience techniques. (a) Left panel: cortical activity evoked by listening to sounds associated with human finger (red line) and tongue (blue line) movements that were used as deviant stimuli to evoke a potential known as mismatch negativity (MMN). Sounds not associated with any actions were used as control stimuli. Deviant stimuli produced larger MMNs than did sounds not associated with actions 100 ms after stimulus presentation. Furthermore, source estimation of the MMN indicates that finger and tongue sounds activated distinct regions in the left premotor areas, suggesting an early, automatic, somatotopic mapping of action-related sounds in these regions. Right panel: auditory-evoked potentials in response to context-related sounds that typically cue a responsive action by the listener (e.g. a ringing telephone; red line) and to context-free sounds that do not elicit responsive actions (e.g. a ringing tower bell; blue line). Responses were higher for action-evoking sounds than for non-action-evoking sounds 300 ms after the stimulus, mainly in the left premotor and inferior frontal and prefrontal regions [32,33]. (b) Hearing sounds related to human actions increases neural activity in left perisylvian fronto-parietal areas relative to hearing environmental sounds, a phase-scrambled version of the same sound, or a silent event. In the frontal cortex, the pattern of neural activity induced by action-related sounds reflected the body part evoked by the sound heard. A dorsal cluster was more involved during listening to and executing hand actions, whereas a ventral cluster was more involved during listening to and executing mouth actions. Thus, audio-motor mapping might occur according to somatotopic rules. The audio-motor mirror network was also activated by the sight of the heard actions, thus hinting at the multimodal nature of action mapping [34]. (c) Single-pulse TMS enables the exploration of the functional modulation of the corticospinal motor system during visual or acoustic perception of actions. During unimodal presentations, participants observed a silent video of a right hand crushing a small plastic bottle or heard the sound of a bottle being crushed. During bimodal conditions, visual and auditory stimuli were congruent (seeing and hearing a hand crushing a bottle; blue lines and bars) or incongruent (e.g. seeing a hand crushing a bottle but hearing the sound of water being poured into a glass, or hearing the sound of a hand crushing a bottle but seeing a foot crushing a bottle; red lines and bars). Compared with incongruent bimodal stimulation, unimodal and congruent bimodal stimulation induced an increase in the amplitude of the motor potentials evoked by the magnetic pulse. Thus, corticospinal reactivity is a marker of both unimodal and cross-modal mapping of actions [35]. Data adapted, with permission, from [32–35].

Neurophysiological studies have identified multisensory neurons in the superior temporal sulcus (STS) that code both seen and heard actions [21]. When driven by audiovisual bimodal
input, the firing rate of a proportion of these cells was higher than the sum of the responses to auditory and visual input alone. This superadditive response occurred when the seen action matched the heard action [44]. The STS is heavily connected to the frontal and parietal regions [45], thus hinting at the important role of temporal structures in the simulation of actions triggered by audiovisual inputs.

Box 2. Audio-motor resonance in patients with apraxia
Crucially, causative information on the auditory mapping of actions has been provided by a study on patients with apraxia [37], in which a clear association was identified between deficits in performing hand- or mouth-related actions and the ability to recognize the associated sounds. Moreover, using state-of-the-art lesion-mapping procedures, it was shown that, whereas both frontal and parietal structures are involved in executing actions and discriminating the sounds produced by the actions of others, the ability to recognize specifically those sounds arising from non-human actions appears to be linked to the temporal regions (Figure Ia). This finding supports the notion that different neural substrates underpin the auditory mapping of actions and the perception of non-human action-related cues. Because this study was based on a sound–picture matching task, it is, in principle, possible that the audio-motor mapping deficit reflects a deficit in visual motor mapping, in keeping with the reported visual action recognition impairment of patients with apraxia [75]. To determine whether deficits in audio-motor and visuo-motor action mapping reflect a common supramodal representation, or were driven by visual deficits, results from the lesion-mapping study [37] were compared with those from neuroimaging studies in healthy subjects in which visual or auditory action recognition was required. Distinct neural regions in the left hemisphere were identified that were specifically related to observing or hearing hand- and mouth-related actions. In particular, a somatotopic arrangement along the motor strip seems to be distinctive of visual- and auditory-related actions (Figure Ib) [31]. Thus, although multimodal perception might optimize action mapping, an independent contribution to this process could be provided by vision and audition. Olfactory mapping of actions has not yet been performed in patients with brain damage.

Figure I. Visual and auditory action mapping in brain-damaged patients. (a) Direct evidence for the anatomical and functional association between action execution and discrimination during the matching of specific visual pictures to previously presented sounds in patients with brain damage, with or without apraxia. Voxel-based lesion-symptom mapping (VLSM) analysis demonstrated a clear association between deficits in performing hand- or mouth-related actions, the ability to recognize the same sounds acoustically, and frontal and parietal lesions [37]. (b) Cortical rendering shows the voxel clusters selectively associated with deficits (VLSM study) or activation (fMRI studies) when processing limb- or mouth-related action sounds [31]. Data adapted, with permission, from [31,37].

A clear multimodal contribution to action mapping was demonstrated in a functional magnetic resonance imaging (fMRI) study in which subjects observed hand-grasping actions directed to odourant objects (e.g. a fruit, such as a strawberry or an orange) that were only smelt, only seen, or both smelt and seen [40]. Grasping directed towards objects perceived only through smell activated not only the olfactory cortex, but also the AON (Figure 2). Moreover, perceiving the action towards an object coded via both olfaction and vision (visuo-olfacto-motor mapping) induced a further increase in activity in the temporo-parietal cortex [40]. A clear increase of corticospinal motor facilitation during the observation of unseen but smelt objects, and
the visual observation of the grasping of the same objects, has also been reported, further confirming the presence of visuo-olfacto-motor resonance [39]. It is also relevant that the neural activity in response to visual–olfactory action-related cues in the right middle temporal cortex and left superior parietal cortex might be superadditive; that is, higher than the sum of the responses to visual and olfactory cues presented in isolation [40]. Accordingly, although unimodal input might trigger action representation, congruent bimodal input is more appropriate because it provides an enriched sensory representation, which, ultimately, enables full-blown action simulation.

Figure 2. The flavour of actions. The effects of unimodal and cross-modal olfactory stimulation have been investigated in subjects who smelled the odours of graspable objects and observed a model grasping 'odourant' foods. Unimodal presentation consisted of either visual (seeing a model reaching to grasp food) or olfactory (smelling the graspable food with no concurrent visual stimulation) stimuli. In the bimodal presentation, visual and olfactory stimuli occurred together. (a) Single-pulse TMS was delivered to the primary motor cortex of healthy subjects. Sniffing alimentary odourants induced an increase in the corticospinal reactivity of the same muscles that would be activated during actual grasping of the presented food. Moreover, cross-modal facilitation was observed during concurrent visual and olfactory stimulation [39]. (b) The observation of a hand grasping an object that was smelt but not seen activated frontal, parietal and temporal cortical regions. No such activity was found during observation of a mimed grasp. Additive activity in this action observation network was observed when the object to be grasped was both seen and smelt [40]. Importantly, maximal modulation of corticospinal reactivity (TMS) and of the BOLD signal (fMRI) was observed when both visuo-motor and olfacto-motor information were presented. This result suggests that, although olfactory stimuli might unimodally modulate the action system, its optimal tuning is achieved through cross-modal stimulation. Data adapted, with permission, from [39,40].

In human and non-human primates, the orbitofrontal cortex (OFC) receives input from both the primary olfactory cortex and the higher-order visual areas [46], making it a prominent region for the multisensory integration of olfactory and visual signals. When the integration concerns visual–olfactory representations related to the simulation of a given action, the product of such computation has to be sent to motor regions. The OFC is heavily connected to brain regions involved in movement control. In particular, direct connections between the OFC and the motor part of the cingulate area, the supplementary and pre-supplementary motor areas, the ventral premotor area and even the primary motor cortex have been described [47,48]. The possible functional gain of multimodal over unimodal coding of actions deserves further discussion. Motor resonance involves not only the commands associated with motor execution, but also a variety of sensory signals that trigger or modulate the action simulation process. Such modulation might be more effective when mediated by more than one sensory modality. Indeed, multimodal integration seems to enhance perceptual accuracy and saliency by providing redundant cues that might help to
characterize actions fully. Importantly, multisensory integration appears more effective when weak and sparse stimuli are involved [49]. Thus, it might be that multisensory integration in the service of action simulation provides precise dynamic representations of complex sensory actions. Moreover, the functional gain derived from multimodal integration might support robust and detailed simulation of the perceived action.

Can the social mapping of an action occur independently from vision or audition?
Inferences about the sensory and motor states of others can be drawn via mental imagery that involves specific neural systems (e.g. the somatic or the visual cortex for tactile and
visual imagery, respectively) [9,10,50]. However, only the telereceptive senses allow the social perception of touch and pain. Perceiving the touch of others, for example, can occur only through vision [11], whereas perceiving pain in others can occur through vision (e.g. direct observation of needles penetrating skin) [10,51–53], audition (e.g. hearing another's cry) [54], or even smell (e.g. the odour of burning flesh) [55]. Although the telereceptive senses can map the actions of others unimodally, cross-modal mapping is likely to be the norm. However, whether vision or audition dominates in modulating this process in humans is still an open question. The study of blind or deaf individuals provides an excellent opportunity for addressing this issue (Figure 3). A recent fMRI study demonstrated that the auditory presentation of hand-executed actions in congenitally blind individuals activated the AON, although to a lesser extent compared with blindfolded sighted participants [56]. However, corticospinal motor reactivity to the sight and sound of actions was found to be lacking in individuals with congenital deafness and blindness, respectively [57]. This pattern of results suggests that, despite the plastic potential of the developing brain, action mapping remains an inherently cross-modal process.

Anticipatory coding of the actions of others based on auditory and olfactory cues

Influential theoretical models suggest that the human motor system is designed to function as an anticipation device [58] and that humans predict forthcoming actions by using their own motor system as an internal forward model. Action prediction implies the involvement of specific forms of anticipatory, embodied simulation that triggers neural activity in perceptual [59] and motor [60] systems. Evidence in support of this notion comes from a study in which merely waiting to observe a forthcoming movement made by another individual was found to trigger (unconsciously) a readiness potential in the motor system of an
Figure 3. The auditory and visual responsiveness of the action observation network in individuals with congenital blindness or deafness. (a) The modulation of resonant action systems was investigated by using fMRI while congenitally blind or sighted individuals listened to and recognized hand-related action sounds or environmental sounds and executed motor pantomimes upon verbal utterance of the name of a specific tool. The sighted individuals were also requested to perform a visual action recognition task. Listening to action sounds activated a premotor-temporoparietal cortical network in the congenitally blind individuals. This network largely overlapped with that activated in the sighted individuals while they listened to an action sound and observed and executed an action. Importantly, however, the activity was lower in blind than in sighted individuals, suggesting that multimodal input is necessary for the optimal tuning of action representation systems [56]. (b) Corticospinal reactivity to TMS was assessed in congenitally blind (blue bars) or congenitally deaf (green bars) individuals during the aural or visual presentation of a right-hand action or a non-human action (the flowing of a small stream of water in a natural environment). All stimuli were presented aurally to the blind and sighted control subjects and visually, with muted sound, to the deaf and hearing control individuals (grey bars = control subjects). Amplitudes of the motor evoked potentials (MEPs) recorded from the thumb (OP) and wrist (FCR) muscles during action perception in the deaf versus the hearing control group and the blind versus the sighted control group indicated that somatotopic, muscle-specific modulation was absent in individuals with a loss of a sensory modality (either vision or hearing).
The reduction of resonant audio- or visuo-motor facilitation in individuals with congenital blindness or deafness suggests that the optimal tuning of the action system is necessarily multimodal [57]. Data adapted, with permission, from [56,57].
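The perceptual benefit that the text attributes to redundant multimodal cues, especially for weak stimuli, is commonly formalized as reliability-weighted (maximum-likelihood) cue combination, as in the near-optimal bimodal integration account of the ventriloquist effect [19]. The sketch below illustrates only the statistics; the noise levels are arbitrary assumptions, not values from any of the reviewed experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noise levels (standard deviations) for two unimodal
# estimates of the same action feature, e.g. the moment of footfall.
sigma_v, sigma_a = 2.0, 3.0   # visual and auditory noise (arbitrary units)
true_value = 10.0
n = 100_000

visual = rng.normal(true_value, sigma_v, n)
auditory = rng.normal(true_value, sigma_a, n)

# Reliability-weighted (maximum-likelihood) combination: each cue is
# weighted by its inverse variance.
w_v = 1.0 / sigma_v**2
w_a = 1.0 / sigma_a**2
combined = (w_v * visual + w_a * auditory) / (w_v + w_a)

# Predicted combined sd: sqrt(sigma_v^2 * sigma_a^2 / (sigma_v^2 + sigma_a^2))
predicted_sd = np.sqrt(sigma_v**2 * sigma_a**2 / (sigma_v**2 + sigma_a**2))

print(f"visual sd    {visual.std():.3f}")
print(f"auditory sd  {auditory.std():.3f}")
print(f"combined sd  {combined.std():.3f} (predicted {predicted_sd:.3f})")
```

The combined estimate is always less variable than the better single cue, which is one way to make precise the claim that multimodal input supports the optimal tuning of action representations.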
onlooker [61]. In a similar vein, by using single-pulse transcranial magnetic stimulation (TMS), it was demonstrated that the mere observation of static pictures representing implied human actions induced body-part-specific corticospinal facilitation [62–64]. Moreover, it was also demonstrated that the superior perceptual ability of elite basketball players in anticipating the fate of successful versus unsuccessful basket throws is instantiated in a time-specific increase in corticospinal activity during the observation of erroneous throws [65]. Although cross-modal perception studies indicate that auditory [17] or olfactory [40] inputs might predominate over visual perception in certain circumstances, information on the phenomenology and neural underpinnings of the anticipatory process that enables individuals to predict upcoming actions on the basis of auditory and olfactory information remains meagre (Figure 4). Anticipation of sound sequences typically occurs when repeatedly listening to a music album in which different tracks are played in the same order. Indeed, it is a common experience that hearing the end of a given track evokes, in total silence, the anticipatory image of the subsequent track on the same album. Interestingly, the creation of this association brings about an increase in neural activity in premotor and basal ganglia regions,
suggesting that analogous predictive mechanisms are involved in both sound sequence and motor learning [66,67]. The prediction of an upcoming movement and the anticipation of forthcoming actions might be even stronger when dealing with a precise sound–action order association. It is relevant that hearing sounds typically associated with a responsive action (e.g. a doorbell) brings about an increase in neural activity in the frontal regions, mainly in the left hemisphere, which is not found in response to sounds that do not elicit automatic motor responses (e.g. piano notes that have not been heard before) [33]. Thus, social action learning triggered by auditory cues might imply the acquisition of a temporal contingency between the perception of a particular sound and the movement associated with a subsequent action. This experience-related, top-down modulation of auditory perception might be used to predict and anticipate forthcoming movements and to create a representation of events that should occur in the near future. The grasping actions triggered by smelling fruits or sandwiches indicate that olfactory cues might trigger anticipatory action planning [40]. Therefore, the sensory consequences of an odour are integrated and become part of the cognitive representation of the related action. Unfortunately, studies on the role of social odours in triggering anticipatory
Figure 4. Prospective coding of actions. (a) A single-pulse TMS study demonstrated that the observation of the start and middle phases of grasp and flick actions induces a significantly higher motor facilitation than does observation of final posture. Higher resonance with upcoming than with past action phases supports the notion that the coding of observed actions is inherently anticipatory [63]. (b) An fMRI study demonstrated that neural activity in the ventral premotor cortex (which is part of the motor resonance network) and cerebellum is higher when subjects listen to a specific rhythm in anticipation of an overt reaction to it than when they listen to the same sound passively, without expecting an action to follow. This result sheds light on the nature of action–perception processes and suggests an inherent link between auditory and motor systems in the context of rhythm [66]. Data adapted, with permission, from [63,66].
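The internal forward model invoked above [58] — using an efference copy of the motor command to predict the sensory consequences of an action before feedback arrives — can be caricatured in a few lines. Everything below (the limb dynamics, time step and noise level) is an illustrative assumption, not a model from the reviewed studies.

```python
import numpy as np

# Toy forward model: an efference copy of the motor command is fed
# through an internal copy of the limb dynamics, so the sensory
# consequence of the command is predicted before feedback arrives.
rng = np.random.default_rng(1)

dt = 0.01              # time step (s); illustrative
pos, vel = 0.0, 0.0    # actual limb state
errors = []

for t in range(200):
    u = 1.0 if t < 100 else -1.0        # simple push-then-pull command

    # Internal forward model (same dynamics, driven by the efference copy).
    pred_vel = vel + u * dt
    pred_pos = pos + pred_vel * dt      # anticipated position

    # Actual limb, plus sensory noise in the feedback.
    vel += u * dt
    pos += vel * dt
    sensed = pos + rng.normal(0.0, 0.001)

    errors.append(abs(sensed - pred_pos))

# Because the internal model matches the plant, the residual prediction
# error reduces to sensory noise: the outcome is "known" before feedback.
print(f"mean prediction error: {np.mean(errors):.4f}")
```

When the internal model is accurate, the residual error is just sensory noise; a systematic error would signal that the observed event deviates from the predicted action, which is the comparison that predictive accounts of action observation exploit.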
Box 3. Questions for future research
What are the developmental paths to the auditory and olfactory mapping of actions?
Does action anticipation occur in the complete absence of visual mediation, and what are the neural systems underpinning the auditory anticipation of actions?
Does subconscious perception of odours have a role in the mapping of actions? What is the exact role of olfactory cues in action anticipation?
Based on the notion that multimodal modulation is essential for action mapping, is it possible to use different sensory inputs to make virtual interactions more veridical? This would imply, for example, that the presence of odours during virtual interactions with avatars would trigger greater embodiment of their sensorimotor states.
representations of the actions of others are currently lacking (see Box 3).

Conclusions and future directions

Hearing and smelling stimuli that evoke, or are associated with, actions activate their representation, thus indicating that not only vision, but also the other two telereceptive senses (i.e. audition and olfaction) might trigger the social mapping of actions somewhat independently from one another. Although the mapping process might be triggered by unimodal stimulation, the action representation process elicited by auditory and olfactory cues typically occurs within the context of multimodal perception, as indicated by the defective resonance in blind or deaf individuals. The results expand current knowledge by suggesting that cross-modal processing optimizes not only perceptual, but also motor performance. The analysis of how these two sensory channels contribute to the prospective coding of the actions of others remains a fundamental topic for future research.

Acknowledgements

Funded by the Istituto Italiano di Tecnologia (SEED Project Prot. Num. 21538), by an EU Information and Communication Technologies grant (VERE project, FP7-ICT-2009-5, Prot. Num. 257695) and by the Italian Ministry of Health.
References
1 Smith, M.M. (2007) Sensing the Past: Seeing, Hearing, Smelling, Tasting and Touching in History, University of California Press
2 Driver, J. and Noesselt, T. (2008) Multisensory interplay reveals crossmodal influences on 'sensory-specific' brain regions, neural responses, and judgments. Neuron 57, 11–23
3 Brancucci, A. et al. (2009) Asymmetries of the human social brain in the visual, auditory and chemical modalities. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 895–914
4 Serino, A. et al. (2009) Motor properties of peripersonal space in humans. PLoS One 4, e6582
5 Low, K.E.Y. (2006) Presenting the self, the social body, and the olfactory: managing smells in everyday life experiences. Sociological Perspect. 49, 607–631
6 Schütz-Bosbach, S. and Prinz, W. (2007) Perceptual resonance: action-induced modulation of perception. Trends Cogn. Sci. 11, 349–355
7 Rizzolatti, G. and Sinigaglia, C. (2010) The functional role of the parieto-frontal mirror circuit: interpretations and misinterpretations. Nat. Rev. Neurosci. 11, 264–274
8 Iacoboni, M. (2009) Imitation, empathy, and mirror neurons. Annu. Rev. Psychol. 60, 653–670
9 Avenanti, A. et al. (2007) Somatic and motor components of action simulation. Curr. Biol. 17, 2129–2135
10 Bufalari, I. et al. (2007) Empathy for pain and touch in the human somatosensory cortex. Cereb. Cortex 17, 2553–2561
11 Keysers, C. et al. (2010) Somatosensation in social perception. Nat. Rev. Neurosci. 11, 417–428
12 Tamietto, M. and de Gelder, B. (2010) Neural bases of nonconscious perception of emotional signals. Nat. Rev. Neurosci. 11, 697–709
13 Recanzone, G.H. (2009) Interactions of auditory and visual stimuli in space and time. Hear. Res. 258, 89–99
14 Campbell, R. (2008) The processing of audio-visual speech: empirical and neural bases. Philos. Trans. R. Soc. Lond. B Biol. Sci. 363, 1001–1010
15 Spence, C. (2009) Explaining the Colavita visual dominance effect. Prog. Brain Res. 176, 245–258
16 Burr, D. et al. (2009) Auditory dominance over vision in the perception of interval duration. Exp. Brain Res. 198, 49–57
17 Shams, L. et al. (2000) Illusions. What you see is what you hear. Nature 408, 788
18 Welch, R.B. and Warren, D.H. (1980) Immediate perceptual response to intersensory discrepancy. Psychol. Bull. 88, 638–667
19 Alais, D. and Burr, D. (2004) The ventriloquist effect results from near-optimal bimodal integration. Curr. Biol. 14, 257–262
20 Meredith, M.A. and Stein, B.E. (1983) Interactions among converging sensory inputs in the superior colliculus. Science 221, 389–391
21 Kayser, C. et al. (2010) Visual enhancement of the information representation in auditory cortex. Curr. Biol. 20, 19–24
22 Thomas, J.P. and Shiffrar, M. (2010) I can see you better if I can hear you coming: action-consistent sounds facilitate the visual detection of human gait. J. Vis. 10, 14
23 Brennan, P.A. and Kendrick, K.M. (2006) Mammalian social odours: attraction and individual recognition. Philos. Trans. R. Soc. Lond. B Biol. Sci. 361, 2061–2078
24 Prehn-Kristensen, A. et al. (2009) Induction of empathy by the smell of anxiety. PLoS One 4, e5987
25 Lundström, J.N. et al. (2009) The neuronal substrates of human olfactory based kin recognition. Hum. Brain Mapp. 30, 2571–2580
26 Walla, P. (2008) Olfaction and its dynamic influence on word and face processing: cross-modal integration. Prog. Neurobiol. 84, 192–209
27 Tubaldi, F. et al. (2008) The grasping side of odours. PLoS One 3, e1795
28 Tubaldi, F. et al. (2008) Effects of olfactory stimuli on arm-reaching duration. Chem. Senses 33, 433–440
29 Grafton, S.T. (2009) Embodied cognition and the simulation of action to understand others. Ann. N. Y. Acad. Sci. 1156, 97–117
30 Caspers, S. et al. (2010) ALE meta-analysis of action observation and imitation in the human brain. Neuroimage 50, 1148–1167
31 Aglioti, S.M. and Pazzaglia, M. (2010) Representing actions through their sound. Exp. Brain Res. 206, 141–151
32 Hauk, O. et al. (2006) The sound of actions as reflected by mismatch negativity: rapid activation of cortical sensory-motor networks by sounds associated with finger and tongue movements. Eur. J. Neurosci. 23, 811–821
33 De Lucia, M. et al. (2009) The role of actions in auditory object discrimination. Neuroimage 48, 475–485
34 Gazzola, V. et al. (2006) Empathy and the somatotopic auditory mirror system in humans. Curr. Biol. 16, 1824–1829
35 Alaerts, K. et al. (2009) Interaction of sound and sight during action perception: evidence for shared modality-dependent action representations. Neuropsychologia 47, 2593–2599
36 Galati, G. et al. (2008) A selective representation of the meaning of actions in the auditory mirror system. Neuroimage 40, 1274–1286
37 Pazzaglia, M. et al. (2008) The sound of actions in apraxia. Curr. Biol. 18, 1766–1772
38 Lundström, J.N. et al. (2008) Functional neuronal processing of body odors differs from that of similar common odors. Cereb. Cortex 18, 1466–1474
39 Rossi, S. et al. (2008) Distinct olfactory cross-modal effects on the human motor system. PLoS One 3, e1702
40 Tubaldi, F. et al. (2010) Smelling odors, understanding actions. Soc. Neurosci. 7, 1–17
41 Chen, Y.C. and Spence, C. (2010) When hearing the bark helps to identify the dog: semantically-congruent sounds modulate the identification of masked pictures. Cognition 114, 389–404
42 Castiello, U. et al. (2006) Cross-modal interactions between olfaction and vision when grasping. Chem. Senses 31, 665–671
43 Kaplan, J.T. and Iacoboni, M. (2007) Multimodal action representation in human left ventral premotor cortex. Cogn. Process 8, 103–113
44 Barraclough, N.E. et al. (2005) Integration of visual and auditory information by superior temporal sulcus neurons responsive to the sight of actions. J. Cogn. Neurosci. 17, 377–391
45 Allison, T. et al. (2000) Social perception from visual cues: role of the STS region. Trends Cogn. Sci. 4, 267–278
46 Rolls, E.T. and Baylis, L.L. (1994) Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. J. Neurosci. 14, 5437–5452
47 Cavada, C. et al. (2000) The anatomical connections of the macaque monkey orbitofrontal cortex. A review. Cereb. Cortex 10, 220–242
48 Morecraft, R.J. and Van Hoesen, G.W. (1993) Frontal granular cortex input to the cingulate (M3), supplementary (M2) and primary (M1) motor cortices in the rhesus monkey. J. Comp. Neurol. 337, 669–689
49 Ghazanfar, A.A. and Lemus, L. (2010) Multisensory integration: vision boosts information through suppression in auditory cortex. Curr. Biol. 20, R22–R23
50 Moro, V. et al. (2008) Selective deficit of mental visual imagery with intact primary visual cortex and visual perception. Cortex 44, 109–118
51 Avenanti, A. et al. (2005) Transcranial magnetic stimulation highlights the sensorimotor side of empathy for pain. Nat. Neurosci. 8, 955–960
52 Betti, V. et al. (2009) Synchronous with your feelings: sensorimotor gamma band and empathy for pain. J. Neurosci. 29, 12384–12392
53 Minio-Paluello, I. et al. (2009) Absence of 'sensorimotor contagion' during pain observation in Asperger Syndrome. Biol. Psychiatry 65, 55–62
54 Bastiaansen, J.A. et al. (2009) Evidence for mirror systems in emotions. Philos. Trans. R. Soc. Lond. B Biol. Sci. 364, 2391–2404
55 Plotkin, A. et al. (2010) Sniffing enables communication and environmental control for the severely disabled. Proc. Natl. Acad. Sci. U.S.A. 107, 14413–14418
56 Ricciardi, E. et al. (2009) Do we really need vision? How blind people 'see' the actions of others. J. Neurosci. 29, 9719–9724
57 Alaerts, K. et al. (2010) Action perception in individuals with congenital blindness or deafness: how does the loss of a sensory modality from birth affect perception-induced motor facilitation? J. Cogn. Neurosci. DOI: 10.1162/jocn.2010.21517
58 Wolpert, D.M. et al. (2003) A unifying computational framework for motor control and social interaction. Philos. Trans. R. Soc. Lond. B Biol. Sci. 358, 593–602
59 Perrett, D.I. et al. (2009) Seeing the future: natural image sequences produce 'anticipatory' neuronal activity and bias perceptual report. Q. J. Exp. Psychol. 62, 2081–2104
60 Stanley, J. and Miall, R.C. (2009) Using predictive motor control processes in a cognitive task: behavioral and neuroanatomical perspectives. Adv. Exp. Med. Biol. 629, 337–354
61 Kilner, J.M. et al. (2004) Motor activation prior to observation of a predicted movement. Nat. Neurosci. 7, 1299–1301
62 Urgesi, C. et al. (2006) Motor facilitation during action observation: topographic mapping of the target muscle and influence of the onlooker's posture. Eur. J. Neurosci. 23, 2522–2530
63 Urgesi, C. et al. (2010) Simulating the future of actions in the human corticospinal system. Cereb. Cortex 20, 2511–2521
64 Candidi, M. et al. (2010) Competing mechanisms for mapping action-related categorical knowledge and observed actions. Cereb. Cortex 20, 2832–2841
65 Aglioti, S.M. et al. (2008) Action anticipation and motor resonance in elite basketball players. Nat. Neurosci. 11, 1109–1116
66 Chen, J.L. et al. (2008) Listening to musical rhythms recruits motor regions of the brain. Cereb. Cortex 18, 2844–2854
67 Leaver, A.M. et al. (2009) Brain activation during anticipation of sound sequences. J. Neurosci. 29, 2477–2485
68 di Pellegrino, G. et al. (1992) Understanding motor events: a neurophysiological study. Exp. Brain Res. 91, 176–180
69 Gallese, V. et al. (1996) Action recognition in the premotor cortex. Brain 119, 593–609
70 Umiltà, M.A. et al. (2001) I know what you are doing. A neurophysiological study. Neuron 31, 155–165
71 Fogassi, L. et al. (2005) Parietal lobe: from action organization to intention understanding. Science 308, 662–667
72 Kohler, E. et al. (2002) Hearing sounds, understanding actions: action representation in mirror neurons. Science 297, 846–848
73 Keysers, C. et al. (2003) Audiovisual mirror neurons and action recognition. Exp. Brain Res. 153, 628–636
74 Mukamel, R. et al. (2010) Single-neuron responses in humans during execution and observation of actions. Curr. Biol. DOI: 10.1016/j.cub.2010.02.045
75 Pazzaglia, M. et al. (2008) Neural underpinnings of gesture discrimination in patients with limb apraxia. J. Neurosci. 28, 3030–3041
Review
Value, pleasure and choice in the ventral prefrontal cortex
Fabian Grabenhorst¹ and Edmund T. Rolls²
¹ University of Cambridge, Department of Physiology, Development and Neuroscience, Cambridge, UK
² Oxford Centre for Computational Neuroscience, Oxford, UK
Rapid advances have recently been made in understanding how value-based decision-making processes are implemented in the brain. We integrate neuroeconomic and computational approaches with evidence on the neural correlates of value and experienced pleasure to describe how systems for valuation and decision-making are organized in the prefrontal cortex of humans and other primates. We show that the orbitofrontal and ventromedial prefrontal (VMPFC) cortices compute expected value, reward outcome and experienced pleasure for different stimuli on a common value scale. Attractor networks in VMPFC area 10 then implement categorical decision processes that transform value signals into a choice between the values, thereby guiding action. This synthesis of findings across fields provides a unifying perspective for the study of decision-making processes in the brain.

Integrating different approaches to valuation and decision-making

Consider a situation where a choice has to be made between consuming an attractive food and seeking a source of warm, pleasant touch. To decide between these fundamentally different rewards, the brain needs to compute the values and costs associated with two multisensory stimuli, integrate this information with motivational, cognitive and contextual variables and then use these signals as inputs for a stimulus-based choice process. Rapid advances have been made in understanding how these key component processes for value-based, economic decision-making are implemented in the brain. Here, we review recent findings from functional neuroimaging, single neuron recordings and computational neuroscience to describe how systems for stimulus-based (goal-based) valuation and choice decision-making are organized and operate in the primate, including human, prefrontal cortex. When considering the neural basis of value-based decision-making, the sensory nature of rewards is often neglected, and the focus is on action-based valuation and choice.
However, many choices are between different sensory and, indeed, multisensory rewards, and can be action independent [1–3]. Here, we bring together evidence from investigations of the neural correlates of the experienced pleasure produced by sensory rewards and from studies that have used neuroeconomic and computational approaches,
thereby linking different strands of research that have largely been considered separately so far.

Neural systems for reward value and its subjective correlate, pleasure

Reward and emotion: a Darwinian perspective
The valuation of rewards is a key component process of decision-making. The neurobiological and evolutionary context is as follows [3]. Primary rewards, such as sweet taste and warm touch, are gene-specified (i.e. unlearned) goals for action built into us during evolution by natural selection to direct behavior to stimuli that are important for survival and reproduction. Specification of rewards, the goals for action, by selfish genes is an efficient and adaptive Darwinian way for genes to control behavior for their own reproductive success [3]. Emotions are states elicited when these gene-specified rewards are received, omitted, or terminated, and by other stimuli that become linked with them by associative learning [3]. The same approach leads to understanding motivations or 'wantings' as states in which one of these goals is being sought [3]. (This approach suggests that when animals perform responses for rewards that have been devalued, which have been described as 'wantings' [4], such behavior is habit or stimulus-response based after overtraining, and is not goal directed.) Neuronal recordings in macaques, used as a model for these systems in humans [3], and functional neuroimaging studies in humans have led to the concept of three tiers of cortical processing [1], illustrated in Figure 1 and described in this review.

Object representations independent of reward valuation: Tier 1
The first processing stage is for the representation of what object or stimulus is present, independently of its reward value and subjective pleasantness.
In this first tier, the identity and intensity of stimuli are represented, as exemplified by correlations of activations in imaging studies with the subjective intensity but not pleasantness of taste in the primary taste cortex [5,6], and neuronal activity that is independent of reward value, investigated, for example, when food value is reduced to zero by feeding to satiety [1,3]. As shown in Figure 1, this first tier includes the primary taste cortex in the anterior insula, the pyriform olfactory cortex and the inferior temporal visual cortex, where objects and faces are represented relatively invariantly with respect to position on the retina, size, view and so on, where this invariant representation is ideal for association with a reward [1,3,7]. Part of the utility of a ‘what’ representation
1364-6613/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.12.004 Trends in Cognitive Sciences, February 2011, Vol. 15, No. 2
Figure 1. Organization of cortical processing for computing value (in Tier 2) and making value-based decisions (in Tier 3) and interfacing to action systems. The Tier 1 brain regions up to and including the column headed by the inferior temporal visual cortex compute and represent neuronally 'what' stimulus or object is present, but not its reward or affective value. Tier 2 represents, by its neuronal firing, the reward or affective value, and includes the OFC, amygdala and anterior cingulate cortex (including its pregenual part). Tier 3 is involved in choices based on reward value (in particular VMPFC area 10), and in different types of output to behavior. The secondary taste cortex and the secondary olfactory cortex are within the orbitofrontal cortex. Abbreviations: lateral PFC, lateral prefrontal cortex, a source of top-down attentional and cognitive modulation of affective value [50]; PreGen Cing, pregenual cingulate cortex; V1, primary visual cortex; V4, visual cortical area V4. 'Gate' refers to the finding that inputs such as the taste, smell and sight of food in regions where reward value is represented only produce effects when an appetite for the stimulus (modulated, e.g., by hunger) is present [3]. Adapted, with permission, from [1].
independent of reward value is that one can learn about an object, for example about its location and properties, even when it is not rewarding, for example when satiated.

Reward value and pleasure: Tier 2

The orbitofrontal cortex: the value and pleasure of stimuli
Receiving inputs from Tier 1, the primate, including human, orbitofrontal cortex (OFC) in Tier 2 (Figure 1) is the first stage of cortical processing in which reward value is made explicit in the representation. This is supported by
discoveries that: (i) OFC neurons decrease their responses to a food or to water to zero when the reward values of food and water are reduced to zero by feeding to satiety; (ii) OFC neurons with visual responses learn rapidly and reverse their responses to visual stimuli depending on whether the stimulus is associated with a reward or punisher; and (iii) activations in humans are related to the reward value of taste, olfactory, oral texture, somatosensory, visual, social and monetary stimuli [1,3] (Table 1 and the supplementary material online for references). Subjective pleasure is the
Table 1. Principles of operation of the OFC and ACC in reward processing, and their adaptive value (a)

1. Operational principle: Neural activity in the OFC and ACC represents reward value and pleasure on a continuous scale.
   Adaptive value: This type of representation provides useful inputs for neural attractor networks involved in choice decision-making.

2. Operational principle: The identity and intensity of stimuli are represented at earlier cortical stages that send inputs to the OFC and ACC: stimuli and objects are first represented, then their reward and affective value is computed in the OFC.
   Adaptive value: This separation of sensory from affective processing is highly adaptive, for it enables one to identify and learn about stimuli independently of whether one currently wants them and finds them rewarding.

3. Operational principle: Many different rewards are represented close together in the OFC, including taste, olfactory, oral texture, temperature, touch, visual, social, amphetamine-induced and monetary rewards.
   Adaptive value: This organization facilitates comparison and common scaling of different rewards by lateral inhibition, and thus provides appropriately scaled inputs for a choice decision-making process.

4. Operational principle: Spatially separate representations of pleasant stimuli (rewards) and unpleasant stimuli (punishers) exist in the OFC and ACC.
   Adaptive value: This type of organization provides separate and partly independent inputs into brain systems for cost–benefit analysis and decision-making.

5. Operational principle: The value of specific rewards is represented in the OFC: different single neurons respond to different combinations of specific taste, olfactory, fat texture, oral viscosity, visual, and face and vocal expression rewards.
   Adaptive value: This type of encoding provides a reward window on the world that allows not only selection of specific rewards, but also sensory-specific satiety, a specific reduction in the value of a stimulus after it has been received continuously for a period of time.

6. Operational principle: Both absolute and relative value signals are present in the OFC.
   Adaptive value: Absolute value is necessary for stable long-term preferences and transitivity. Being sensitive to relative value might be useful in climbing local reward gradients, as in positive contrast effects.

7. Operational principle: Top-down cognitive and attentional factors, originating in lateral prefrontal cortex, modulate reward value and pleasantness in the OFC and ACC through biased competition and biased activation.
   Adaptive value: These top-down effects allow cognition and attention to modulate the first cortical stage of reward processing to influence valuation and economic decision-making.

(a) References to the investigations that provide the evidence for this summary are provided in the supplementary material online.
consciously experienced affective state produced by rewarding stimuli [3]. In imaging studies, neural activations in the OFC and adjacent anterior cingulate cortex (ACC) are correlated with the subjective pleasure produced by many different stimuli (Figure 2a). For example, the subjective pleasantness of the oral texture of fat, an indicator for high energy density in foods, is represented on a continuous scale by neural activity in the OFC and ACC (Figure 2b) [8]. Neuroeconomic approaches focus largely on subjective value as inferred from choices (revealed preferences). By contrast, pleasure is a consciously experienced state. The conscious route to choice and action may be needed for rational (i.e. reasoning) thought about multistep plans [3,9]. Primary rewards would become conscious by virtue of entering a reasoning processing system, for example when reasoning about whether an experienced reward, such as a pleasant touch, should be sought in future [3,9,10]. Because pleasure may reflect processing by a reasoning, conscious system when decision-making is performed by goal-directed explicit decision systems involving the prefrontal cortex (as opposed to implicit habit systems involving the basal ganglia) [1,3,11], pleasure may provide insight into what guides decision-making beyond what can be inferred from observed choices [12].

The ACC: the reward value of stimuli, and an interface to goal-directed action
The pleasure map in Figure 2 indicates that the ACC, which receives inputs from the OFC (Figure 1), also has value-based representations, consistent with evidence from single neuron studies [13–17]. These value representations provide the goal representation in an 'action to goal outcome' associative learning system in the mid-cingulate cortex (Box 1), and also provide an output for autonomic responses to affective stimuli [18].
Key principles of value representations in the OFC and ACC
Key principles of operation of the OFC and ACC in reward and punishment valuation are summarized in Table 1. We examine some of these principles, focusing on recent developments in understanding how valuation signals in the OFC and ACC are scaled, how they adapt to contexts and how they are modulated by top-down processes.

Box 1. Reward representations in the ACC
If activations in both the OFC and ACC reflect the value of rewards, what might be the difference in function between these two areas [1,18,89]? We suggest that information about the value of rewards is projected from the OFC to the ACC (its pregenual and dorsal anterior parts). These parts of the ACC can be conceptualized as a relay that allows information about rewards and outcomes to be linked, via longitudinal connections running in the cingulum fiber bundle, to information about actions represented in the mid-cingulate cortex. Bringing together information about specific rewards with information about actions, and the costs associated with actions, is important for associating actions with the value of their outcomes and for selecting the correct action that will lead to a desired reward [89,90]. Indeed, consistent with its strong connections to motor areas [91], lesions of the ACC impair reward-guided action selection [92,93], neuroimaging studies have shown that the ACC is active when outcome information guides choices [94], and single neurons in the ACC encode information about both actions and outcomes, including reward prediction errors for actions [14,15]. For example, Luk and Wallis [14] found that, in a task in which information about three potential outcomes (three types of juice) had to be associated on a trial-by-trial basis with two different responses (two lever movements), many neurons in the ACC encoded information about both specific outcomes and specific actions.
In a different study, Seo and Lee [17] found that dorsal ACC neurons encoded a signal related to the history of rewards received in previous trials, consistent with a role for this region in learning the value of actions. Interestingly, in both of these studies, there was little evidence for encoding of choices, indicating that a choice mechanism between rewards might not be implemented in the ACC.
Trends in Cognitive Sciences February 2011, Vol. 15, No. 2
[Figure 2 graphic. (a) Pleasure maps in orbitofrontal and anterior cingulate cortex, with numbered sites of activations correlating with pleasantness or unpleasantness. (b) Neural representation of the subjective pleasantness of fat texture: % BOLD change in the OFC and ACC plotted against pleasantness of texture (-2 to +2), for temperature and flavor stimuli. (c) Common scaling and adaptive encoding of value in orbitofrontal cortex: % BOLD change against pleasantness ratings; average firing rate (sp/s) against offer value for different value ranges (DV = 2, 3, 4, 6, 10); neuronal responses (impulses/s) against juice volume (ml) for narrow and wide reward distributions.]
Figure 2. Pleasure and value in the brain. (a) Maps of subjective pleasure in the OFC (ventral view) and ACC (sagittal view). Yellow font indicates sites where activations correlate with subjective pleasantness; whereas white font indicates sites where activations correlate with subjective unpleasantness. The numbers refer to effects found in specific studies: taste: 1, 2; odor: 3–10; flavor: 11–16; oral texture: 17, 18; chocolate: 19; water: 20; wine: 21; oral temperature: 22, 23; somatosensory temperature: 24, 25; the sight of touch: 26, 27; facial attractiveness: 28, 29; erotic pictures: 30; and laser-induced pain: 31. (See the supplementary material online for references to the original studies.) (b) How the brain represents the reward value of the oral texture (i.e. the mouth feel) of food stimuli [8]. Oral texture is a prototypical primary reward important for detecting the presence of fat in foods and is thus an indicator of high energy density in foods. Subjective pleasantness (+2 = very pleasant, -2 = very unpleasant) of the oral texture of liquid food stimuli that differed in flavor and fat content tracked neural activity (% BOLD signal change) in the OFC (left) and ACC (right). (c) Common scaling and adaptive encoding of value in the OFC. (left) A common scale for the subjective pleasure for different primary rewards: neural activity in the OFC correlates with the subjective pleasantness ratings for flavor stimuli in the mouth and somatosensory temperature stimuli delivered to the hand. The regression lines describing the relationship between neural activity (% BOLD signal) and subjective pleasantness ratings were indistinguishable for both types of reward. (middle) Padoa-Schioppa [43] found that neurons in the OFC that encode the offer value of different types of juice adapt their sensitivity to the value range of juice rewards available in a given session, while keeping their neuronal activity range constant. 
Each line shows the average neuronal response for a given value range. (right) Kobayashi et al. [44] found that neurons in the OFC adapt the sensitivity of their value coding to the statistical distribution of reward values, in that the reward sensitivity slope adapted to the standard deviation of the probability distribution of juice volumes. These findings indicate that the range of the value scale in the OFC can be adjusted to reflect the range of rewards that are available at a given time. Reproduced, with permission, from [30] (c left), [43] (c middle) and [44] (c right).
Reward-specific value representations on a common scale, but not in a common currency
Reward-specific representations
Single neurons in the OFC encode different specific rewards [1,3] by responding to different combinations of taste, olfactory, somatosensory, visual and auditory stimuli, including socially relevant
stimuli such as face expression [1,3,19]. Part of the adaptive utility of this reward-specific representation is that it provides for sensory-specific satiety, implemented by a decrease in the responsiveness of reward-specific neurons [1]. This is a fundamental property of every reward system that helps to ensure that a variety of
different rewards is selected over time [3]. Representations of both reward outcome and expected value are specific for the particular reward: not only do different neurons respond to different primary reinforcers, but different neurons also encode the conditioned stimuli for different outcomes, with different neurons responding, for example, to the sight or odor of stimuli based on the outcome that is expected [20,21].

Topology of reward and punishment systems
Different types of reward tend to be represented in the human medial OFC and pregenual ACC, and different types of punisher tend to be represented in the lateral OFC and the dorsal part of the ACC (Figure 2). The punishers include negative reward prediction error, encoded by neurons that fire only when an expected reward is not received [20]. To compute this OFC signal, inputs are required from neurons that respond to the expected value of a stimulus (exemplified in the OFC by neurons that respond to the sight of food), and from other neurons that respond to the magnitude of the reward outcome (exemplified in the OFC by neurons that respond to the taste of food) [3,22]. All these signals are reflected in activations found for expected value and for reward outcome in the human medial OFC [23,24], and for monetary loss and negative reward prediction error for social reinforcers in the human lateral OFC [25]. This topological organization, with different types of specific reward represented close together in the OFC, may allow for comparison between different rewards implemented by lateral inhibition, as part of a process of scaling different specific rewards to the same range [3]. A topological organization of reward and punishment systems is also important to provide partly separate inputs into systems for learning, choice and cost–benefit analysis (Box 2).
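The computation of a negative reward prediction error from an expected-value input and a reward-outcome input can be written out as a toy model (an illustrative sketch; the function name and numerical values are ours, not from the cited studies):

```python
# Toy sketch of a negative reward prediction error unit of the kind
# described above: it receives the expected value of a stimulus (e.g.
# from neurons that respond to the sight of food) and the magnitude of
# the reward outcome (e.g. from taste neurons), and fires only when an
# expected reward fails to arrive. Names and values are illustrative.

def negative_rpe(expected_value: float, outcome: float) -> float:
    """Output proportional to the unfulfilled expectation, zero otherwise."""
    return max(expected_value - outcome, 0.0)

# Expected reward delivered: no error signal.
assert negative_rpe(expected_value=1.0, outcome=1.0) == 0.0
# Expected reward omitted: the unit fires.
assert negative_rpe(expected_value=1.0, outcome=0.0) == 1.0
# Unexpected reward: this unit stays silent (it codes omission only).
assert negative_rpe(expected_value=0.0, outcome=1.0) == 0.0
```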
A common scale for different specific rewards
A classic view of economic decision theory [26] implies that decision-makers convert the value of different goods into a common scale of utility. Ecological [27], psychological [28] and neuroeconomic approaches [29] similarly suggest that the values of different types of reward are converted into a common currency. Rolls and Grabenhorst [1,3] have argued that different specific rewards must be represented on the same scale, but not converted into a common currency, as the specific goal selected must be the output of the decision process so that the appropriate action for that particular goal can then be chosen [1,3]. The key difference between the two concepts of common currency and common scaling lies in the specificity with which rewards are represented at the level of single neurons. Whereas a common currency view implies convergence of different types of reward onto the same neurons (a process in which information about reward identity is lost), a common scaling view implies that different rewards are represented by different neurons (thereby retaining reward identity in information processing), with the activity of the different neurons scaled to be in the same value range. A recent functional magnetic resonance imaging (fMRI) study demonstrated the existence of a region in the human OFC where activations are scaled to the same range as a function of pleasantness for even fundamentally different
Box 2. Cost–benefit analysis for decision-making: extrinsic and intrinsic costs
If the OFC and ACC encode the value of sensory stimuli, does neural activity in these structures also reflect the cost of rewards? We propose that, when considering this, it is important to distinguish two types of cost. Extrinsic costs are properties of the actions required to obtain rewards or goals, for example physical effort and hard work, and are not properties of the rewards themselves (which are stimuli). By contrast, intrinsic costs are properties of stimuli. For example, many rewards encountered in the world are hedonically complex stimuli containing both pleasant and unpleasant components at the same time: natural jasmine odor contains up to 6% of the unpleasant chemical indole; red wines and leaves contain bitter and astringent tannin components; and dessert wines and fruits can contain unpleasant sulfur components. Furthermore, cognitive factors can influence intrinsic costs, for example when knowledge of the energy content of foods modulates their reward value. Intrinsic costs can also arise because of the inherent delay or low probability/high uncertainty in obtaining rewards. We suggest that intrinsic costs are represented in the reward–pleasure systems in the brain, including the OFC, where the values of stimuli are represented, and that extrinsic costs are represented in brain systems involved in linking actions to rewards, such as the cingulate cortex. Evaluation of stimulus-intrinsic benefits and costs appears to engage the OFC [55,95,96]. For example, in a recent fMRI study, it was found that the medial OFC, which represents the pleasantness of odors, was sensitive to the pleasant components in a naturally complex jasmine olfactory mixture, whereas the lateral OFC, which represents the unpleasantness of odors, was sensitive to the unpleasant component (indole) in the mixture [95].
A recent neurophysiological study found that reward risk and value are encoded by largely separate neuronal populations in the OFC [97]. The implication is that both reward value and intrinsic cost stimuli are represented separately in the OFC. This might provide a neural basis for processing related to cognitive reasoning about reward value and its intrinsic cost, and for differential sensitivity to rewards and aversion to losses. By contrast, a role for the cingulate cortex in evaluating the physical effort associated with actions has been demonstrated in studies in rats, monkeys [98] and humans [99]. Interestingly, single neurons in the lateral prefrontal cortex encode the temporally discounted values of choice options, suggesting that reward and delay costs are integrated in this region [100].
primary rewards: taste in the mouth and warmth on the hand [30] (Figure 2c). A different study found that the decision value for different categories of goods (food, nonfood consumables and monetary gambles) during purchasing decisions correlated with activity in the adjacent ventromedial prefrontal cortex [VMPFC (the term ‘VMPFC’ is used to describe a large region of the medial prefrontal cortex that includes parts of the medial OFC, ACC and the medial prefrontal cortex area 10)] [31]. Importantly, because of the limited spatial resolution of fMRI, these studies are unable to determine whether it is the same or different neurons in these areas that encode the value of different rewards. However, as shown most clearly by single-neuron recording studies, the representations in the OFC provide evidence about the exact nature of each reward [1,3,22] (see the supplementary material online). Moreover, in economic decision-making, neurons in the macaque OFC encode the economic value of the specific choice options on offer, for example different juice rewards [2]. For many of these ‘offer value’ neurons, the relationship between neuronal impulse rate and value was invariant with respect to the different types of juice that were available [32], suggesting that different types of juice are evaluated on a common value scale.
With current computational understanding of how decisions are made in attractor neural networks [33–36] (see below), it is important that different rewards are expressed on a similar scale for decision-making networks to operate correctly but retain information about the identity of the specific reward. The computational reason is that one type of reward (e.g. food reward) should not dominate all other types of reward and always win in the competition, as this would be maladaptive. Making different rewards approximately equally rewarding makes it probable that a range of different rewards will be selected over time (and depending on factors such as motivational state), which is adaptive and essential for survival [3]. The exact scaling into a decision-making attractor network will be set by the number of inputs from each source, their firing rates and the strengths of the synapses that introduce the different inputs into the decision-making network [7,33,35,36]. Importantly, common scaling need not imply conversion into a new representation that is of a common currency of general reward [1]. In the decision process itself, it is important to know which reward has won, and the mechanism is likely to involve competition between different rewards represented close together in the cerebral cortex, with one of the types of reward winning the competition, rather than convergence of different rewards onto the same neuron [3,7,33,35,36].
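The computational point, that an unscaled reward type would dominate and always win the competition, can be illustrated with a toy winner-take-all simulation (our own sketch: the argmax-plus-noise competition here merely abstracts the attractor dynamics described in the text, and all numbers are arbitrary):

```python
import random

def choose(inputs, rng, noise=1.0):
    """One trial of a toy competition: the option with the highest noisy
    evidence wins (inhibition-mediated competition abstracted to an argmax)."""
    return max(range(len(inputs)),
               key=lambda i: inputs[i] + rng.gauss(0.0, noise))

def win_rate(inputs, trials=10_000, seed=0):
    """Fraction of trials each option wins the competition."""
    rng = random.Random(seed)
    wins = [0] * len(inputs)
    for _ in range(trials):
        wins[choose(inputs, rng)] += 1
    return [w / trials for w in wins]

# Unscaled inputs: a reward type with a much larger signal (10 vs 1)
# wins essentially every competition, which is maladaptive.
print(win_rate([10.0, 1.0]))   # ~[1.0, 0.0]
# Scaled to a common range: each reward is selected on a substantial
# fraction of trials, so a variety of rewards is obtained over time.
print(win_rate([1.0, 1.0]))    # ~[0.5, 0.5]
```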
The OFC and ACC represent value on a continuous scale, and not choice decisions between different value signals
To test whether the OFC and ACC represent the value of stimuli on a continuous scale and, thus, provide the evidence for decision-making, or instead are implicated themselves in making choices, Grabenhorst, Rolls et al. performed a series of investigations in which the valuation of thermal and olfactory stimuli in the absence of choice was compared with choice decision-making about the same stimuli. Whereas activation in parts of the OFC and ACC represented the value of the rewards on a continuous scale [10,37], the next connected area in the system, VMPFC area 10 (Figure 1), had greater activations when choices were made, and showed other neural signatures of decision-making indicative of an attractor-based decision process, as described below for Tier 3 processing [38,39] (Figure 3d).

Absolute value and relative value are both represented in the OFC
For economic decision-making, both absolute and relative valuation signals have to be neurally represented. A representation of the absolute value of rewards is important for stable long-term preferences and consistent economic choices [32,40]. Such a representation should not be influenced by the value of other available rewards. By contrast,
[Figure 3 graphic. (a) Decision-making map of the ventromedial prefrontal cortex, with numbered activation sites. (b) Relative value of the chosen option. (c) Chosen stimulus value (prior to action). (d) Decision easiness (prior to action).]
Figure 3. From value to choice in the VMPFC. (a) Activations associated with 1: (economic) subjective value during intertemporal choice; 2: immediate versus delayed choices; 3: immediate versus delayed primary rewards; 4: expected value during probabilistic decision-making; 5: expected value based on social and experience-based information; 6: expected value of the chosen option; 7: price differential during purchasing decisions; 8: willingness to pay; 9: goal value during decisions about food cues; 10: choice probability during exploitative choices; 11: conjunction of stimulus- and action-based value signals; 12: goal value during decisions about food stimuli; 13: willingness to pay for different goods; 14: willingness to pay for lottery tickets; 15: subjective value of charitable donations; 16: decision value for exchanging monetary against social rewards; 17: binary choice versus valuation of thermal stimuli; 18: binary choice versus valuation of olfactory stimuli; 19: easy versus difficult binary choices about thermal stimuli; 20: easy versus difficult binary choices about olfactory stimuli; 21: value of chosen action; 22: difference in value between choices; 23: prior correct signal during probabilistic reversal learning; and 24: free versus forced charitable donation choices. It is notable that some of the most anterior activations in VMPFC area 10 (activations 17–19) were associated with binary choice beyond valuation during decision-making. (See the supplementary material online for references to the original studies.) (b) VMPFC correlates of the relative value of the chosen option during probabilistic decision-making. (c) VMPFC correlates of the chosen stimulus value are present even before action information is available [72]. (d) VMPFC correlates of value difference, and thus decision easiness and confidence, during olfactory and thermal value-based choices. Effects in this study were found in the far anterior VMPFC, medial area 10, but not in the OFC or ACC.
Reproduced, with permission, from [70] (b), [72] (c), and [38] (d).
to select the option with the highest subjective value in a specific choice situation, the relative value of each option needs to be represented. A recent study provided evidence for absolute value coding in the OFC, in that neuronal responses that encoded the value of a specific stimulus did not depend on what other stimuli were available at the same time [32]. It was suggested that transitivity, a fundamental trait of economic choice, is reflected in the neuronal activity in the OFC [32]. This type of encoding contrasts with value-related signals found in the parietal cortex, where neurons encode the subjective value associated with specific eye movements in a way that is relative to the value of the other options that are available [41]. The apparent difference in value coding between the OFC and parietal cortex has led to the suggestion that absolute value signals encoded in the OFC are subsequently rescaled in the parietal cortex to encode relative value, to maximize the difference between the choice options for action selection [41]. However, there is also evidence for relative encoding of value in the OFC, in that neuronal responses to a food reward can depend on the value of the other reward that is available in a block of trials [42]. Two recent studies demonstrated that neurons in the OFC adapt the sensitivity with which reward value is encoded to the range of values that are available at a given time [43,44] (Figure 2c). This reflects an adaptive scaling of reward value, evident also in positive and negative contrast effects, that makes the system optimally sensitive to the local reward gradient by dynamically altering the sensitivity of the reward system so that small changes can be detected [3]. The same underlying mechanism may contribute to the adjustment of different types of reward to the same scale described in the preceding section.
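The range adaptation reported in these studies [43,44] can be sketched as a value-coding unit whose slope rescales to the range of values currently on offer, keeping the output span constant (an illustrative toy model; the function name, parameter names and the 50 spikes/s ceiling are our assumptions):

```python
def adapted_rate(value, value_min, value_max, max_rate=50.0):
    """Firing rate of a value-coding unit whose sensitivity (slope)
    adapts to the range of values on offer, as in the range-adaptation
    findings described above: the output always spans 0..max_rate,
    whatever the width of the current value range."""
    span = value_max - value_min
    return max_rate * (value - value_min) / span

# Session with a narrow value range (0..2): steep slope.
narrow = adapted_rate(1.0, 0.0, 2.0)    # mid-range value
# Session with a wide value range (0..10): shallow slope, same output span.
wide = adapted_rate(5.0, 0.0, 10.0)     # mid-range value
# The mid-range stimulus evokes the same rate in both sessions.
assert narrow == wide == 25.0
```

This keeps the unit optimally sensitive to the local reward gradient, as the text notes: small value differences within the current range always map onto the full dynamic range of firing.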
Given that representations of both absolute value and relative value are needed for economic decision-making, Grabenhorst and Rolls [45] tested explicitly whether both types of representation are present simultaneously in the human OFC. In a task in which two odors were successively delivered on each trial, they found that blood oxygenation level-dependent (BOLD) activations to the second odor in the antero-lateral OFC tracked the relative subjective pleasantness, whereas activations in the medial and mid-OFC tracked the absolute pleasantness of the second odor. Thus, both relative and absolute subjective value signals, both of which provide important inputs to decision-making processes, are separately and simultaneously represented in the human OFC [45].

Cognitive and attentional influences on value: a biased activation theory of top-down attention
How do cognition and attention affect valuation and neural representations of value? One possibility is that value representations ascend from the OFC and ACC to higher language-related cortical systems, and there become entwined with cognitive representations. In fact, there is a more direct mechanism. Cognitive descriptions at the highest, linguistic level of processing (e.g. ‘rich delicious flavor’) or attentional instructions at the same, linguistic level (e.g. ‘pay attention to and rate pleasantness’ vs ‘pay attention to and rate intensity’) have a top-down modulatory influence on value representations in the OFC and ACC of odor [46], taste and
flavor [6], and touch [47] stimuli, by increasing or decreasing neural responses to these rewards. Thus, cognition and attention have top-down influences on the first part of the cortex in which value is represented (Tier 2), and modulate the effects of the bottom-up sensory inputs. Recent studies have identified the lateral prefrontal cortex (LPFC, a region implicated in attentional control; Figure 1 [7,48]) as a site of origin for these top-down influences. In one study, activity in the LPFC correlated with value signals in the ventral ACC during self-controlled choices about food consumption [49]. Grabenhorst and Rolls have shown recently, with fMRI connectivity analyses, that activity in different parts of the LPFC differentially correlated with activations to a taste stimulus in the OFC or anterior insula, depending on whether attention was focused on the pleasantness or the intensity of the taste, respectively [50]. In these studies, activations of connected structures in whole cortical processing streams were modulated: the affective stream (Tier 2 of Figure 1, including the OFC and ACC) versus the discriminative (object) stream (Tier 1 of Figure 1, including the insula). On this basis, Grabenhorst and Rolls extended the concept of biased competition [51] and its underlying neuronal mechanisms [52], in which top-down signals influence competition within an area implemented through a set of local inhibitory interneurons, to a biased activation theory of top-down attention [50], in which activations in whole processing streams can be modulated by top-down signals (Figure 4c). These insights have implications for several areas related to neuroeconomics and decision-making, including the design of studies in which attentional instructions might influence which brain systems become engaged, as well as situations in which affective processing might be usefully modulated (e.g.
in the control of the effects of the reward value of food and its role in obesity and addiction) [3,7,53].

From valuation to choice in the ventromedial prefrontal cortex
The operational principles described above enable the OFC and ACC (Tier 2 in Figure 1) to provide value representations that are appropriately scaled to act as inputs into neural systems for economic decision-making, and to promote a progression through the reward space in the environment to find the range of rewards necessary for survival and reproduction [3]. We next consider how neural value representations are transformed into choices in the VMPFC. We describe evidence that choices are made in attractor networks with nonlinear dynamics, in which one of the possible attractor states, each biased by a different value signal, wins the competition implemented through inhibitory interneurons [36].

Neural activity in the VMPFC in neuroeconomic tasks
Studies based on neuroeconomic and computational approaches have revealed that neural activity in the VMPFC correlates with the expected value of choice options during decision-making (Figure 3) [41,54]. For example, subject-specific measures of the expected ‘goal value’ of choice options can be derived from observed
[Figure 4 graphic. (a) Attractor decision network: evidence inputs λ1 and λ2; recurrent collateral synapses wij; dendrites (activation hi) and cell bodies (output firing yi); an inhibitory pool (GABA); nonspecific neurons; external inputs λext (AMPA, NMDA). Below, an ‘effective energy landscape’ (‘potential’ versus firing rate) with a spontaneous state S and decision-state attractors D1 and D2. (b) A decision network with inputs λA and λB feeding a second network that outputs a decision about confidence in the first decision. (c) Biased activation: short-term memory bias sources for cortical streams 1 and 2 bias those streams’ activations to bottom-up inputs 1 and 2.]
Figure 4. Decision-making and attentional mechanisms in the brain. (a) (top) Attractor or autoassociation single network architecture for decision-making. The evidence for decision 1 is applied via the λ1 inputs, and for decision 2 via the λ2 inputs. The synaptic weights wij have been associatively modified during training in the presence of λ1 and, at a different time, of λ2. When λ1 and λ2 are applied, each attractor competes through the inhibitory interneurons (not shown), until one wins the competition, and the network falls into one of the high firing rate attractors that represents the decision. The noise in the network caused by the random spiking times of the neurons (for a given mean rate) means that, on some trials, for given inputs, the neurons in the decision 1 (D1) attractor are more likely to win and, on other trials, the neurons in the decision 2 (D2) attractor are more likely to win. This makes the decision-making probabilistic, for, as shown in (bottom), the noise influences when the system will jump out of the spontaneous firing stable (low energy) state S, and whether it jumps into the high firing state for decision 1 (D1) or decision 2 (D2). (middle) The architecture of the integrate-and-fire network used to model decision-making. (bottom) A multistable ‘effective energy landscape’ for decision-making with stable states shown as low ‘potential’ basins. Even when the inputs are being applied to the network, the spontaneous firing rate state is stable, and noise provokes transitions into the high firing rate decision attractor state D1 or D2. (b) A network for making confidence-based decisions. Given that decisions made in a first decision-making network have firing rates in the winning attractor that reflect the confidence in the first decision, a second ‘monitoring’ decision network can take confidence-related decisions based on the inputs received from the first decision-making network. The inputs to the decision-making network are λA and λB.
A fixed reference firing rate input to the second, confidence decision, network is not shown. (c) A biased activation theory of attention. The short-term memory systems that provide the source of the top-down activations may be separate (as shown), or could be a single network with different attractor states for the different selective attention conditions. The top-down short-term memory systems hold what is being paid attention to active by continuing firing in an attractor state, and bias separately either cortical processing system 1, or cortical processing system 2. This weak top-down bias interacts with the bottom-up input to the cortical stream and produces an increase of activity that can be supralinear [52]. Thus, the selective activation of separate cortical processing streams can occur. In the example, stream 1 might process the affective value of a stimulus, and stream 2 might process the intensity and physical properties of the stimulus. The outputs of these separate processing streams must then enter a competition system, which could be, for example, a cortical attractor decision-making network that makes choices between the two streams, with the choice biased by the activations in the separate streams. (After Grabenhorst and Rolls 2010 [50].) Adapted, with permission, from [38] (aiii), [36] (b) and [50] (c).
choices between different rewards, such as when subjects bid money for goods they wish to acquire (i.e. willingness to pay), and these can be used as regressors for fMRI activity [31,49,55–57]. Using this approach, neural correlates of the goal value for different types of expected reward, including food items, non-food consumables, monetary gambles and lottery tickets, have been found in the VMPFC (Figure 3). Decision-related activity in the VMPFC is also found for choices about primary rewards, such as a pleasant warm or
unpleasant cold touch to the hand, and between olfactory stimuli [10]. As can be seen from Figure 3a, there is considerable variability in the exact anatomical location of decision-related effects in the VMPFC. Moreover, VMPFC activity has been linked to a wide range of valuation and choice signals that incorporates information about temporal delay [58–60], uncertainty [61], price or value differential [62,63], social advice [64], and monetary expected value
and reward outcome [24]. This heterogeneity of findings raises the question of whether a common denominator for the functional role of the VMPFC in value-based decision-making can be identified or, alternatively, whether different VMPFC subregions make functionally distinct contributions to the decision-making process. A common theme that has emerged from the different strands of research is that the VMPFC provides a system for choices about different types of reward and for different types of decision, including in the social domain [64–67]. For example, Behrens and colleagues found that the VMPFC encoded the expected value of the chosen option based on the subjects’ own experiences as well as on social advice [64]. On the basis of these findings, it has been suggested that the VMPFC represents a common valuation signal that underlies different types of decision as well as decisions about different types of goods [31,41,59,68]. A related account [69] suggests that, whereas the OFC is involved in encoding the value of specific rewards, the VMPFC plays a specific role in value-guided decision-making about which of several options to pursue by encoding the expected value of the chosen option [64,70,71]. Indeed, VMPFC activity measured with fMRI correlates with the value difference between chosen and unchosen options (i.e. relative chosen value), and this signal can be further dissected into separate value signals for chosen and unchosen options [70] (Figure 3b). However, with the temporal resolution of fMRI, it is difficult to distinguish input signals to a choice process (the expected or offer value, or value difference between options) from output signals of a choice process (the value of the chosen or unchosen option) and from those that represent the categorical choice outcome (the identity of the chosen option).
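The dissection of the relative chosen value signal into separate chosen and unchosen components can be made concrete with a schematic of the regressor construction (illustrative trial values and variable names; this is not the analysis code from the cited study):

```python
# Schematic of the value regressors discussed above, for a set of
# binary-choice trials. Each trial supplies the subjective values of
# the chosen and unchosen options; all numbers are illustrative.

trials = [
    {"chosen": 0.8, "unchosen": 0.3},
    {"chosen": 0.5, "unchosen": 0.4},
    {"chosen": 0.9, "unchosen": 0.1},
]

# Relative chosen value: value difference between chosen and unchosen.
relative_chosen = [t["chosen"] - t["unchosen"] for t in trials]

# The same signal dissected into separate chosen and unchosen regressors,
# which enter the difference with opposite signs (+chosen, -unchosen).
chosen = [t["chosen"] for t in trials]
unchosen = [t["unchosen"] for t in trials]

assert all(abs(r - (c - u)) < 1e-9
           for r, c, u in zip(relative_chosen, chosen, unchosen))
```

The point of the dissection is that a correlate of the difference alone cannot show whether a region carries the chosen value, the (negatively weighted) unchosen value, or both.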
Value in the OFC and choice in VMPFC area 10
Rolls, Grabenhorst and colleagues have proposed an alternative account [1,10,36,38,39] that suggests that, whereas the OFC and ACC parts of the VMPFC are involved in representing reward value as inputs for a value-based choice process, the anterior VMPFC area 10 is involved in choice decision-making beyond valuation, as has been found in studies that have contrasted choice with valuation [10,37] (Figure 3d). Part of this proposal is that area 10 is involved in decision-making beyond valuation by implementing a competition between different rewards, with the computational mechanism described below. This choice process operates on the representation of rewarding stimuli (or goods, in economic terms) and, thus, occurs before the process of action selection. This is based, in part, on the evidence that neuronal activity in the OFC is related to the reward value of stimuli, and that actions, such as whether any response should be made, whether to make a lick or a touch response [3,7], or whether to make a right versus a left response [2], are not represented in the OFC [3]. Indeed, using an experimental design that dissociated stimulus and action information in a value-based choice task, Wunderlich et al. demonstrated that correlates of the value of the chosen stimulus can be found in the VMPFC even before action information is available [72] (Figure 3c). Thus, we suggest that the role of the anterior VMPFC area 10 is to transform a continuously scaled representation of expected value (or offer
value) of the stimulus choice options into a categorical representation of reward stimulus choice. This process uses a mechanism in which the winner in the choice competition is the chosen stimulus, which can then be used as the goal for action to guide action selection. This computational view on the role of the VMPFC in decision-making is fundamentally different from the proposal made by Damasio and colleagues, in which the VMPFC is involved in generating somatic markers (changes in the autonomic, endocrine and skeletomotor responses), which are then sensed in the insular and somatosensory cortices and thereby reflect the value of choice options and ‘weigh in’ on the decision process [73], as has been discussed in detail elsewhere [3].

Computational mechanisms for choice and their neural signatures

Phenomenological approaches
By examining computational models of decision-making, we now consider the processes by which the brain may make choices between rewards. One approach, which has been used mainly in the domain of sensory decision-making, can be described as phenomenological, in that a mathematical model is formulated without specifying the underlying neural mechanisms. The main such approach is the accumulator or race model, in which the noisy (variable) incoming evidence is accumulated or integrated until some decision threshold is reached [74]. This provides a good account of many behavioral aspects of decision-making, but does not specify how a mechanism for choice could be implemented in a biologically realistic way in the brain.

Choice implemented by competition between attractor states in cortical networks
A different approach is to formulate a theory at the mechanistic level of the operation of populations of neurons with biologically plausible dynamics of how choices are made in the brain (Figure 4) [33–36,75].
In this scenario, the parameters are given by the time constants and strengths of the synapses and the architecture of the networks; neuronal spiking occurring in the simulations provides a source of noise that contributes to the decision-making being probabilistic and can be directly compared with neuronal activity recorded in the brain; and predictions can be made about the neuronal and fMRI signals associated with decision-making, which can be used to test the theory. Interestingly, the theory implements a type of nonlinear diffusion process that can be related to the linear diffusion process implemented by accumulator or race models [76]. Furthermore, the degree of confidence in one’s decisions and other important properties of a decision-making process, such as reaction times and Weber’s Law, arise as emergent properties of the integrate-and-fire attractor model summarized in Figure 4 [33,36].

Predictions of the noisy attractor theory of decision-making
The attractor-based integrate-and-fire model of decision-making makes specific predictions about the neuronal signature of a choice system in the brain, including higher neuronal firing, and correspondingly larger fMRI BOLD
signals, on correct than on error trials. The reason for this is that the winning attractor on a given trial (say attractor 1, selected as a consequence of a larger λ1 than λ2 and the noise in the system caused by the randomness in the neuronal spiking times for a given mean rate) receives additional support from the external evidence that is received via λ1 on correct trials [36,39,75]. For the same reason, on correct trials, as the difference Δλ between λ1 and λ2 increases, so the firing rates and the predicted fMRI BOLD signal increase. Rolls et al. have recently confirmed this prediction for VMPFC area 10 when choices were being made between the pleasantness of successive odors [39]. Conversely, but for the same reason, on error trials, as Δλ increases, so the firing rates and the predicted fMRI BOLD signal decrease [39]. This prediction has also been confirmed for area 10 [39]. If all trials, both correct and error, are considered together, then the model predicts an increase in the BOLD signal in choice decision-making areas, and this prediction has been confirmed for area 10 [38,39]. (Indeed, this particular signature has been used to identify decision-making areas of the brain, even though there was no account of why this was an appropriate signature [77].) The confirmation of these predictions for area 10, but not for the OFC, where the evidence described above indicates that value is represented, provides strong support for this neuronal mechanism of decision-making in the brain [38,39]. The same neuronal cortical architecture for decision-making (Figure 4) is, Rolls and Deco propose [36], involved in many different decision-making systems in the brain, including vibrotactile flutter frequency discrimination in the ventral premotor cortex [35], optic flow in the parietal cortex and the confidence associated with these decisions [78], olfactory confidence-related decisions in the rat prefrontal cortex [79,80] and perceptual detection [36].
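The competition at the heart of this account can be illustrated with a much-simplified, rate-based toy simulation: it is not the integrate-and-fire model of [33–36], and all parameter values below are invented for illustration. Two pools receive inputs λ1 and λ2, excite themselves, inhibit each other, and injected noise makes the winner probabilistic, with accuracy growing as Δλ grows:

```python
import random

def attractor_choice(lam1, lam2, w_exc=2.0, w_inh=1.5, noise=0.3,
                     dt_over_tau=0.05, steps=1500):
    """Two competing pools with recurrent self-excitation and mutual
    inhibition; noise makes the outcome probabilistic. Returns the final
    rates (r1, r2); the higher-rate pool is the winning attractor."""
    r1 = r2 = 0.1
    for _ in range(steps):
        drive1 = lam1 + w_exc * r1 - w_inh * r2 + random.gauss(0, noise)
        drive2 = lam2 + w_exc * r2 - w_inh * r1 + random.gauss(0, noise)
        # relax toward a saturating (clipped) response to the drive
        r1 += dt_over_tau * (-r1 + min(1.0, max(0.0, drive1)))
        r2 += dt_over_tau * (-r2 + min(1.0, max(0.0, drive2)))
    return r1, r2

def p_correct(delta_lam, trials=100):
    """Fraction of trials on which the pool with the larger input wins."""
    wins = 0
    for _ in range(trials):
        r1, r2 = attractor_choice(0.5 + delta_lam / 2, 0.5 - delta_lam / 2)
        wins += r1 > r2
    return wins / trials

random.seed(0)
print(p_correct(0.0), p_correct(0.4))  # near chance vs. mostly correct
```

Even this crude version shows the key qualitative point of the attractor framework: the choice falls into one of two stable winner-take-all states, and the noise makes which state is reached probabilistic, more so the smaller Δλ is.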
A useful property of this model of decision-making is that it maintains as active the representation of the goal or state that has been selected, in the short-term memory implemented by the recurrent collateral connections, providing a representation for guiding action and other behavior that occurs subsequent to the decision [36]. In a unifying computational approach, Rolls and Deco [36] argue that the same noise-influenced categorization process also accounts for memory recall, for the maintenance of short-term memory and therefore attention, and for the way in which noise affects signal detection. Furthermore, disorders in the stability of these stochastic dynamical cortical systems, which are implemented by the recurrent collateral excitatory connections between nearby cortical pyramidal cells, contribute to a new approach to understanding schizophrenia (in which there is too little stability) [81,82] and obsessive-compulsive disorder (in which it is hypothesized that there is too much stability) [83].

Confidence in decisions
As the evidence for a decision becomes stronger, confidence in the decision being correct increases. More formally, before the outcome of the decision is known, confidence in a correct decision increases with Δλ on correct trials, and decreases with Δλ on trials when an error has in fact been made [84]. The model just described accounts for confidence in decisions as an emergent property of the attractor network
processes just described, with the firing rates and predicted BOLD signals reflecting confidence, just as they reflect Δλ on correct compared with error trials. If one does not have confidence in an earlier decision then, even before the outcome is known, one might abort the strategy and try the decision-making again [79]. The second decision can be modeled by a second decision-making network that receives the outputs from the first decision-making network [36,80] (see Figure 4b). If the first network in its winning attractor has relatively high firing rates, reflecting high confidence in a correct decision, then the second network can use these high firing rates to send it into a decision state reflecting ‘confidence in the first decision’. If the first network in its winning attractor has relatively lower firing rates, reflecting low confidence in a correct decision, then the second network can use these lower firing rates to send it into a decision state reflecting ‘lack of confidence in the first decision’ [80]. This two-decision network system (Figure 4b) provides a simple model of monitoring processes in the brain, and makes clear predictions about the neuronal activity that reflects this monitoring process [36,80]. Part of the interest is that ‘self-monitoring’ is an important aspect of some approaches to consciousness [85,86]. However, we think that it is unlikely that the two-attractor-network architecture would be conscious [36].

Concluding remarks and future priorities
We have linked neurophysiological and neuroimaging findings to computational approaches to decision-making, and have shown that representations of specific rewards on a continuous and similar scale of value in the OFC and ACC (Tier 2) are followed by a noisy attractor-based system for making choices between rewards in VMPFC area 10 (Tier 3). Subjective pleasure is the state associated with the activation of representations in Tier 2, and confidence is an emergent property of the decision-making process in Tier 3.
Similar neuronal choice mechanisms in other brain areas are suggested to underlie different types of decision, memory recall, short-term memory and attention, and signal detection processes, as well as some disorders of these processes. In future research, it will be important to examine how well this stochastic dynamical approach to decision-making, memory recall, and so on, can account for findings in many brain systems at the neuronal level; how subjective reports of confidence before the outcome is known are related to neural processing in these different brain systems; how this stochastic dynamical approach to decision-making may be relevant to economic decision-making [87,88]; and whether this approach helps to understand and treat patients, for example those with damage to the brain that affects decision-making, and those with schizophrenia and obsessive-compulsive disorder.

Acknowledgments
Some of the research described in this paper was supported by the Medical Research Council and the Oxford Centre for Computational Neuroscience. F.G. was supported by the Gottlieb-Daimler- and Karl Benz-Foundation, and by the Oxford Centre for Computational Neuroscience.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.tics.2010.12.004.

References
1 Rolls, E.T. and Grabenhorst, F. (2008) The orbitofrontal cortex and beyond: from affect to decision-making. Prog. Neurobiol. 86, 216–244 2 Padoa-Schioppa, C. and Assad, J.A. (2006) Neurons in the orbitofrontal cortex encode economic value. Nature 441, 223–226 3 Rolls, E.T. (2005) Emotion Explained, Oxford University Press 4 Kringelbach, M.L. and Berridge, K.C. (2009) Towards a functional neuroanatomy of pleasure and happiness. Trends Cogn. Sci. 13, 479–487 5 Grabenhorst, F. and Rolls, E.T. (2008) Selective attention to affective value alters how the brain processes taste stimuli. Eur. J. Neurosci. 27, 723–729 6 Grabenhorst, F. et al. (2008) How cognition modulates affective responses to taste and flavor: top-down influences on the orbitofrontal and pregenual cingulate cortices. Cereb. Cortex 18, 1549–1559 7 Rolls, E.T. (2008) Memory, Attention, and Decision-Making: A Unifying Computational Neuroscience Approach, Oxford University Press 8 Grabenhorst, F. et al. (2010) How the brain represents the reward value of fat in the mouth. Cereb. Cortex 20, 1082–1091 9 Rolls, E.T. (2007) A computational neuroscience approach to consciousness. Neural Networks 20, 962–982 10 Grabenhorst, F. et al. (2008) From affective value to decision-making in the prefrontal cortex. Eur. J. Neurosci. 28, 1930–1939 11 Balleine, B.W. and O’Doherty, J.P. (2010) Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69 12 Kahneman, D. et al. (1997) Back to Bentham? Explorations of experienced utility. Q. J. Econ. 112, 375–405 13 Rolls, E.T. (2008) Functions of the orbitofrontal and pregenual cingulate cortex in taste, olfaction, appetite and emotion. Acta Physiol. Hung. 95, 131–164 14 Luk, C.H. and Wallis, J.D.
(2009) Dynamic encoding of responses and outcomes by neurons in medial prefrontal cortex. J. Neurosci. 29, 7526–7539 15 Matsumoto, M. et al. (2007) Medial prefrontal cell activity signaling prediction errors of action values. Nat. Neurosci. 10, 647–656 16 Amiez, C. et al. (2006) Reward encoding in the monkey anterior cingulate cortex. Cereb. Cortex 16, 1040–1055 17 Seo, H. and Lee, D. (2007) Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J. Neurosci. 27, 8366–8377 18 Rolls, E.T. (2009) The anterior and midcingulate cortices and reward. In Cingulate Neurobiology and Disease (Vogt, B.A., ed.), pp. 191–206, Oxford University Press 19 Rolls, E.T. et al. (2006) Face-selective and auditory neurons in the primate orbitofrontal cortex. Exp. Brain Res. 170, 74–87 20 Thorpe, S.J. et al. (1983) Neuronal activity in the orbitofrontal cortex of the behaving monkey. Exp. Brain Res. 49, 93–115 21 Critchley, H.D. and Rolls, E.T. (1996) Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J. Neurophysiol. 75, 1673–1686 22 Rolls, E.T. (2009) From reward value to decision-making: neuronal and computational principles. In Handbook of Reward and Decision-Making (Dreher, J.-C. and Tremblay, L., eds), pp. 95–130, Academic Press 23 O’Doherty, J. et al. (2001) Abstract reward and punishment representations in the human orbitofrontal cortex. Nat. Neurosci. 4, 95–102 24 Rolls, E.T. et al. (2008) Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cereb. Cortex 18, 652–663 25 Kringelbach, M.L. and Rolls, E.T. (2003) Neural correlates of rapid reversal learning in a simple model of human social interaction. Neuroimage 20, 1371–1383
26 Bernoulli, J. (1738/1954) Exposition of a new theory on the measurement of risk. Econometrica 22, 23–36 27 McFarland, D.J. and Sibly, R.M. (1975) The behavioural final common path. Philos. Trans. R. Soc. Lond. 270, 265–293 28 Cabanac, M. (1992) Pleasure: the common currency. J. Theor. Biol. 155, 173–200 29 Montague, P.R. and Berns, G.S. (2002) Neural economics and the biological substrates of valuation. Neuron 36, 265–284 30 Grabenhorst, F. et al. (2010) A common neural scale for the subjective pleasantness of different primary rewards. Neuroimage 51, 1265–1274 31 Chib, V.S. et al. (2009) Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. J. Neurosci. 29, 12315–12320 32 Padoa-Schioppa, C. and Assad, J.A. (2008) The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat. Neurosci. 11, 95–102 33 Deco, G. and Rolls, E.T. (2006) Decision-making and Weber’s Law: a neurophysiological model. Eur. J. Neurosci. 24, 901–916 34 Wang, X.J. (2008) Decision making in recurrent neuronal circuits. Neuron 60, 215–234 35 Deco, G. et al. (2009) Stochastic dynamics as a principle of brain function. Prog. Neurobiol. 88, 1–16 36 Rolls, E.T. and Deco, G. (2010) The Noisy Brain: Stochastic Dynamics as a Principle of Brain Function, Oxford University Press 37 Rolls, E.T. et al. (2010) Neural systems underlying decisions about affective odors. J. Cogn. Neurosci. 22, 1069–1082 38 Rolls, E.T. et al. (2010) Choice, difficulty, and confidence in the brain. Neuroimage 53, 694–706 39 Rolls, E.T. et al. (2010) Decision-making, errors, and confidence in the brain. J. Neurophysiol. 104, 2359–2374 40 Glimcher, P.W. et al., eds (2009) Neuroeconomics: Decision-Making and the Brain, Academic Press 41 Kable, J.W. and Glimcher, P.W. (2009) The neurobiology of decision: consensus and controversy. Neuron 63, 733–745 42 Tremblay, L.
and Schultz, W. (1999) Relative reward preference in primate orbitofrontal cortex. Nature 398, 704–708 43 Padoa-Schioppa, C. (2009) Range-adapting representation of economic value in the orbitofrontal cortex. J. Neurosci. 29, 14004–14014 44 Kobayashi, S. et al. (2010) Adaptation of reward sensitivity in orbitofrontal neurons. J. Neurosci. 30, 534–544 45 Grabenhorst, F. and Rolls, E.T. (2009) Different representations of relative and absolute value in the human brain. Neuroimage 48, 258–268 46 de Araujo, I.E.T. et al. (2005) Cognitive modulation of olfactory processing. Neuron 46, 671–679 47 McCabe, C. et al. (2008) Cognitive influences on the affective representation of touch and the sight of touch in the human brain. Soc. Cogn. Affect. Neurosci. 3, 97–108 48 Corbetta, M. and Shulman, G.L. (2002) Control of goal-directed and stimulus-driven attention in the brain. Nat. Rev. 3, 201–215 49 Hare, T.A. et al. (2009) Self-control in decision-making involves modulation of the vmPFC valuation system. Science 324, 646–648 50 Grabenhorst, F. and Rolls, E.T. (2010) Attentional modulation of affective vs sensory processing: functional connectivity and a top-down biased activation theory of selective attention. J. Neurophysiol. 104, 1649–1660 51 Desimone, R. and Duncan, J. (1995) Neural mechanisms of selective visual attention. Annu. Rev. Neurosci. 18, 193–222 52 Deco, G. and Rolls, E.T. (2005) Neurodynamics of biased competition and co-operation for attention: a model with spiking neurons. J. Neurophysiol. 94, 295–313 53 Rolls, E.T. (2010) Taste, olfactory, and food texture reward processing in the brain and obesity. Int. J. Obes. DOI: 10.1038/ijo.2010.155 54 Rangel, A. and Hare, T. (2010) Neural computations associated with goal-directed choice. Curr. Opin. Neurobiol. 20, 262–270 55 Hare, T.A. et al. (2008) Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J. Neurosci. 28, 5623–5630 56 Plassmann, H.
et al. (2007) Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J. Neurosci. 27, 9984–9988 57 De Martino, B. et al. (2009) The neurobiology of reference-dependent value computation. J. Neurosci. 29, 3833–3842
58 McClure, S.M. et al. (2004) Separate neural systems value immediate and delayed monetary rewards. Science 306, 503–507 59 Kable, J.W. and Glimcher, P.W. (2007) The neural correlates of subjective value during intertemporal choice. Nat. Neurosci. 10, 1625–1633 60 Peters, J. and Buchel, C. (2009) Overlapping and distinct neural systems code for subjective value during intertemporal and risky decision making. J. Neurosci. 29, 15727–15734 61 Levy, I. et al. (2010) Neural representation of subjective value under risk and ambiguity. J. Neurophysiol. 103, 1036–1047 62 FitzGerald, T.H. et al. (2009) The role of human orbitofrontal cortex in value comparison for incommensurable objects. J. Neurosci. 29, 8388–8395 63 Knutson, B. et al. (2007) Neural predictors of purchases. Neuron 53, 147–156 64 Behrens, T.E. et al. (2008) Associative learning of social value. Nature 456, 245–249 65 Rolls, E.T. et al. (1994) Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. J. Neurol. Neurosurg. Psychiatry 57, 1518–1524 66 Hornak, J. et al. (1996) Face and voice expression identification in patients with emotional and behavioural changes following ventral frontal lobe damage. Neuropsychologia 34, 247–261 67 Hornak, J. et al. (2003) Changes in emotion after circumscribed surgical lesions of the orbitofrontal and cingulate cortices. Brain 126, 1691–1712 68 Rangel, A. (2009) The computation and comparison of value in goal-directed choice. In Neuroeconomics: Decision-Making and the Brain (Glimcher, P.W. et al., eds), pp. 425–440, Academic Press 69 Rushworth, M.F. et al. (2009) General mechanisms for making decisions? Curr. Opin. Neurobiol. 19, 75–83 70 Boorman, E.D. et al. (2009) How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron 62, 733–743 71 Wunderlich, K. et al. (2009) Neural computations underlying action-based decision making in the human brain.
Proc. Natl. Acad. Sci. U.S.A. 106, 17199–17204 72 Wunderlich, K. et al. (2010) Economic choices can be made using only stimulus values. Proc. Natl. Acad. Sci. U.S.A. 107, 15005–15010 73 Damasio, A.R. (2009) Neuroscience and the emergence of neuroeconomics. In Neuroeconomics: Decision-Making and the Brain (Glimcher, P.W. et al., eds), pp. 209–213, Academic Press 74 Ratcliff, R. et al. (1999) Connectionist and diffusion models of reaction time. Psychol. Rev. 106, 261–300 75 Wang, X.J. (2002) Probabilistic decision making by slow reverberation in cortical circuits. Neuron 36, 955–968 76 Roxin, A. and Ledberg, A. (2008) Neurobiological models of two-choice decision making can be reduced to a one-dimensional nonlinear diffusion equation. PLoS Comput. Biol. 4, e1000046 77 Heekeren, H.R. et al. (2008) The neural systems that mediate human perceptual decision making. Nat. Rev. Neurosci. 9, 467–479 78 Kiani, R. and Shadlen, M.N. (2009) Representation of confidence associated with a decision by neurons in the parietal cortex. Science 324, 759–764
79 Kepecs, A. et al. (2008) Neural correlates, computation and behavioural impact of decision confidence. Nature 455, 227–231 80 Insabato, A. et al. (2010) Confidence-related decision-making. J. Neurophysiol. 104, 539–547 81 Loh, M. et al. (2007) A dynamical systems hypothesis of schizophrenia. PLoS Comput. Biol. 3, e228 doi:10.1371/journal.pcbi.0030228 82 Rolls, E.T. et al. (2008) Computational models of schizophrenia and dopamine modulation in the prefrontal cortex. Nat. Rev. Neurosci. 9, 696–709 83 Rolls, E.T. et al. (2008) An attractor hypothesis of obsessive-compulsive disorder. Eur. J. Neurosci. 28, 782–793 84 Vickers, D. and Packer, J. (1982) Effects of alternating set for speed or accuracy on response time, accuracy and confidence in a unidimensional discrimination task. Acta Psychol. 50, 179–197 85 Lycan, W.G. (1997) Consciousness as internal monitoring. In The Nature of Consciousness: Philosophical Debates (Block, N. et al., eds), pp. 755–771, MIT Press 86 Block, N. (1995) On a confusion about a function of consciousness. Behav. Brain Sci. 18, 227–247 87 Glimcher, P.W. (2011) Foundations of Neuroeconomic Analysis, Oxford University Press 88 Rolls, E.T. From brain mechanisms of emotion and decision-making to neuroeconomics. In The State of Mind in Economics (Oullier, O. et al., eds), Cambridge University Press (in press) 89 Rushworth, M.F. et al. (2007) Functional organization of the medial frontal cortex. Curr. Opin. Neurobiol. 17, 220–227 90 Walton, M.E. et al. (2003) Functional specialization within medial frontal cortex of the anterior cingulate for evaluating effort-related decisions. J. Neurosci. 23, 6475–6479 91 Morecraft, R.J. and Tanji, J. (2009) Cingulofrontal interaction and the cingulate motor areas. In Cingulate Neurobiology and Disease (Vogt, B.A., ed.), pp. 113–144, Oxford University Press 92 Kennerley, S.W. et al. (2006) Optimal decision making and the anterior cingulate cortex.
Nat. Neurosci. 9, 940–947 93 Rudebeck, P.H. et al. (2008) Frontal cortex subregions play distinct roles in choices between actions and stimuli. J. Neurosci. 28, 13775– 13785 94 Walton, M.E. et al. (2004) Interactions between decision making and performance monitoring within prefrontal cortex. Nat. Neurosci. 7, 1259–1265 95 Grabenhorst, F. et al. (2007) How pleasant and unpleasant stimuli combine in different brain regions: odor mixtures. J. Neurosci. 27, 13532–13540 96 Rolls, E.T. et al. (2008) Warm pleasant feelings in the brain. Neuroimage 41, 1504–1513 97 O’Neill, M. and Schultz, W. (2010) Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 68, 789–800 98 Kennerley, S.W. et al. (2009) Neurons in the frontal lobe encode the value of multiple decision variables. J. Cogn. Neurosci. 21, 1162–1178 99 Croxson, P.L. et al. (2009) Effort-based cost-benefit valuation and the human brain. J. Neurosci. 29, 4531–4541 100 Kim, S. et al. (2008) Prefrontal coding of temporally discounted values during intertemporal choice. Neuron 59, 161–172
Review
Cognitive culture: theoretical and empirical insights into social learning strategies
Luke Rendell, Laurel Fogarty, William J.E. Hoppitt, Thomas J.H. Morgan, Mike M. Webster and Kevin N. Laland
Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews, Bute Medical Building, St. Andrews, Fife KY16 9TS, UK
Research into social learning (learning from others) has expanded significantly in recent years, not least because of productive interactions between theoretical and empirical approaches. This has been coupled with a new emphasis on learning strategies, which places social learning within a cognitive decision-making framework. Understanding when, how and why individuals learn from others is a significant challenge, but one that is critical to numerous fields in multiple academic disciplines, including the study of social cognition.

The strategic nature of copying
Social learning, defined as learning that is influenced by observation of or interaction with another individual, or its products [1], and frequently contrasted with asocial learning (e.g. trial and error), is a potentially cheap way of acquiring valuable information. However, copying comes with pitfalls [2] – the acquired information might be outdated, misleading or inappropriate. Nevertheless, social learning is widespread in animals [3,4] and reaches a zenith in the unique cumulative culture of humans. Understanding how to take advantage of social information, while managing the risks associated with its use, has become a focus for research on social learning strategies [5–7], which explores how natural selection has shaped learning strategies in humans and other animals. Research on this topic has expanded rapidly in recent years, in part by building on a more detailed understanding of social learning and teaching mechanisms (Box 1). However, the expansion has primarily been fuelled by a strong link between theory and empirical work, as well as by the often surprising parallels between the social decision-making of humans and that of other animals (Box 2).
Thus, the field has moved beyond asking which psychological mechanisms individuals use to copy each other toward an exploration of the cognitive decision-making framework that individuals use to balance the competing demands of accuracy and economy in knowledge gain [8]. The marriage between the economics of information use and evolutionary theory has generated a rich research program that spans multiple disciplines, including biology, psychology, anthropology, archaeology, economics, computer science and robotics. Researchers are now starting to gain an understanding of the functional rules that underlie the decision to copy others, and are beginning to appreciate that the rules deployed at the individual level profoundly affect the dynamics of cultural evolution over larger temporal and social scales.

Corresponding author: Rendell, L. ([email protected]).

Theoretical insights
Research into social learning strategies is supported by a rich and interdisciplinary theoretical background (Box 3) [5–18], with active ongoing debates, such as on the importance of conformity [5,16,17,19–21], whether the decision to copy is more dependent on the content of the acquired information or the social context [5,22,23], and whether, and under what circumstances, social learning can lead to maladaptive information transmission [2,5,13,24]. An important starting point was a simple thought experiment that became one of the most productive ideas to date related to the evolution of social learning, known as Rogers’ paradox [10]. Anthropologist Alan Rogers constructed a simple mathematical model to explore how best to learn in a changing environment. The analysis suggested, somewhat surprisingly, that social learning does not increase mean population fitness, because its efficacy is highly frequency-dependent. Copying is advantageous at low frequency because social learners acquire their information primarily from asocial learners who have directly sampled the environment, but avoid the costs of asocial learning. However, copying becomes disadvantageous as it increases in frequency, because social learners find themselves increasingly copying other copiers. The information acquired is then rendered outdated by environmental change, giving a fitness advantage to asocial learning when the latter is rare. At equilibrium, both social and asocial learners persist with the same average fitness.
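The frequency dependence at the heart of Rogers' argument can be reproduced in a few lines. This is a minimal sketch of the logic, not Rogers' original formulation, and the parameter values (environmental change rate u, benefit b, learning cost c) are illustrative:

```python
def rogers_fitness(p_social, u=0.2, b=1.0, c=0.3, generations=200):
    """Simplified Rogers-style model. Each generation the environment
    changes state with probability u. Asocial learners always acquire
    the currently correct behavior but pay a learning cost c; social
    learners copy a random member of the previous generation for free,
    so what they copy may be outdated.
    Returns (social learner fitness, asocial learner fitness)."""
    q = 1.0       # fraction of the population whose behavior is correct
    q_copy = 0.0  # probability that a copied behavior is still correct
    for _ in range(generations):
        # copied behavior is correct only if the model was correct AND
        # the environment has not just changed
        q_copy = q * (1 - u)
        q = (1 - p_social) + p_social * q_copy
    return b * q_copy, b - c

# Copying pays when copiers are rare, and fails when they are common.
print(rogers_fitness(0.05))  # social fitness above asocial (1.0 - 0.3)
print(rogers_fitness(0.95))  # social fitness below asocial
```

At the intermediate frequency where the two fitness curves cross, both types do equally well, which is exactly Rogers' point: at equilibrium, social learning does not raise mean population fitness above that of pure asocial learning.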
Rogers’

Glossary
Conformist bias: positive frequency-dependent social learning for which the probability of acquiring a trait increases disproportionately with the number of demonstrators performing it.
Cultural drift: random, or unbiased, copying in which individuals acquire variants according to the frequency at which they are practiced.
Social learning strategy: evolved psychological rule specifying under what circumstances an individual learns from others and/or from whom they learn.
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.12.002 Trends in Cognitive Sciences, February 2011, Vol. 15, No. 2
Box 1. Social learning and teaching processes
A large amount of research has focused on determining the psychological mechanisms underlying social learning in animals. This was initially driven by the question of which non-human animals are capable of imitation, a process assumed to involve sophisticated cognition, requiring an observer to extract the motor program for an action from the experience of observing another individual perform that action [74]. The recognition of alternative processes through which animals could come to acquire similar behaviour following social interaction, not all of which implied complex mechanisms, eventually spawned a number of classifications of different social learning processes that can result in the transmission of behaviour between individuals [1,75]. Simpler mechanisms, such as local and stimulus enhancement (see Table I), were usually seen as explanations that should be ruled out before imitation could be inferred [76]. This led researchers to devise the two-action test, a laboratory procedure for inferring imitation [77]. The two-action method requires experimental subjects to solve a task with two alternative solutions, with half observing one solution and the other half the alternative; if subjects disproportionately use the method that they observed, this is taken as evidence of imitation. In recent years, interest has shifted away from the question of ‘do animals imitate?’ towards the more general question of ‘how do animals (including humans) copy others?’ [78–81]. This approach includes recreation of the movements of objects in the environment,
copying the goals of observed behaviour, learning about the affordances of objects and imitation at a number of levels of copying fidelity [78,79]. Other researchers aim to elucidate the neural mechanisms and developmental processes underpinning imitation [80,81]. Collectively, this work has revealed an extensive repertoire of copying processes, all of which are probably exhibited by humans, but only some of which are observed in other species. Advances in both experimental and statistical methods [3,82,83] mean that specific learning processes can now be identified, which will potentially facilitate mapping of the taxonomic distribution of these processes. Historically, teaching has been viewed as contributing additional and separate mechanisms to the list of social learning processes. However, recent findings on simple forms of teaching in ants, bees, pied babblers and meerkats [84] have led to the detection of correspondences between teaching and social learning processes. Social learning mechanisms relate primarily to psychological processes in the observer (pupil), whereas teaching processes relate specifically to activities of the demonstrator (tutor). Accordingly, alternative forms of teaching can be viewed as special cases of established social learning processes, in which the demonstrator actively facilitates information transmission. For instance, while many species, including ants, teach through local enhancement, humans might be unique in teaching through imitation.
Table I. A classification of social learning mechanisms
Stimulus enhancement: A demonstrator exposes an observer to a single stimulus, which leads to a change in the probability that the observer will respond to stimuli of that type.
Local enhancement: A demonstrator attracts an observer to a specific location, which can lead to the observer learning about objects at that location.
Observational conditioning: The behaviour of the demonstrator exposes an observer to a relationship between stimuli, enabling the observer to form an association between them.
Social enhancement of food preferences: Exposure to a demonstrator carrying cues associated with a particular diet causes the observer to become more likely to consume that diet.
Response facilitation: A demonstrator performing an act increases the probability that an animal that sees it will do the same. This can result in the observer learning about the context in which to perform the act and the consequences of doing so.
Social facilitation: The mere presence of a demonstrator affects the observer’s behaviour, which can influence the observer’s learning.
Contextual imitation: Observing a demonstrator performing an action in a specific context directly causes an observer to learn to perform that action in the same context.
Production imitation: Observing a demonstrator performing a novel action, or action sequence, that is not in its own repertoire causes an observer to be more likely to perform that action or sequence.
Observational R-S learning: Observation of a demonstrator exposes the observer to a relationship between a response and a reinforcer, causing the observer to form an association between them.
Emulation: Observation of a demonstrator interacting with objects in its environment causes an observer to become more likely to perform any actions that bring about a similar effect on those objects.
Note that these definitions relate to psychological processes in the observer. The presence or absence of active demonstration or teaching (behaviour whose function is to facilitate learning in others) can be regarded as orthogonal to mechanisms in the observer. Hence, it is possible to categorize instances of teaching as, for example, teaching through local enhancement. For the original sources of these definitions, see Hoppitt and Laland [3] and Hoppitt et al. [84].
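The statistical logic of the two-action method described in Box 1 is simple enough to sketch. The following Python snippet runs an exact one-sided binomial test (standard library only; the counts are hypothetical, not taken from any study) asking whether observers used the demonstrated method more often than the 50:50 split expected if observation had no effect:

```python
from math import comb

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): a one-sided exact test."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical two-action data: of 20 observers shown method A,
# 16 later used method A rather than the alternative.
n_observers, n_matched = 20, 16

# Under the null hypothesis (no effect of observation) subjects pick
# either method with probability 0.5, so we ask how surprising 16/20 is.
p_value = binom_sf(n_matched, n_observers, 0.5)
print(f"one-sided exact p = {p_value:.4f}")
```

With these illustrative counts the match rate is far above chance, which is the pattern taken as evidence of imitation in a two-action experiment.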
finding, although not paradoxical in any strict sense, was viewed as counterintuitive because culture, and thus social learning, is widely thought to be the basis of human population growth [25], which implies an increase in absolute fitness. More recently, spatially explicit models have exacerbated this challenge by suggesting that with certain kinds of population structure and realistic patterns of ecological change, social learning could drive asocial learning to extinction, with disastrous consequences for fitness when environments change [12,13]. This thought experiment vastly simplifies the choices available to individuals. Several studies have shown that a way out of this ‘paradox’ is through the selective use of asocial and social learning [5,12,14,15,18,26]. For example, a strategy termed critical social learning, which uses social
learning initially but switches to asocial learning if it fails to acquire an adaptive behaviour, outcompetes pure social learners and, under most circumstances, asocial learners, while also increasing fitness across a broad range of conditions [12,15]. However, there are also relatively narrow circumstances in which pure social learning outcompetes both individual learning and conditional strategies, while also increasing fitness [12]. The conditions for this exist when individual learning is challenging (e.g. very costly in time) but there is a range of viable alternatives available to copy, any of which might produce a reasonably effective, if not globally optimal, solution. Interestingly, these conditions seem to fit well with some examples of human cultural evolution that are best described by the kind of drift dynamics expected under unbiased (or random) copying,
Box 2. Functional parallels in the social learning of humans and non-human animals Experimental studies in non-human animals have explored both when animals copy and from whom they do so, and revealed surprising parallels with the social learning of humans [85]. Although the social learning mechanisms used can vary across species (Box 1), this does not mean we cannot learn a lot about the functional consequences of various strategies from comparative studies. Studies of sticklebacks (Pungitius spp.) have revealed evidence that these fish disproportionately copy when uncertain [86], when the demonstrator receives a higher payoff than they do [87,88] and when asocial learning would be costly [89,90]. Sticklebacks are disproportionately more likely to use social information that conflicts with their own experience as the number of demonstrators increases, which provides evidence of conformist bias in this species [91]. It has also been found that small fish are sensitive to a range of attributes in their tutors, including age [92], size [93], boldness [94] and familiarity [95], and adjust their social information use with reproductive state, with gravid females much more likely to use social information than other individuals [90]. A similar set of studies investigated the contexts that promote the social enhancement of food preferences in rats (Rattus norvegicus)
such as choice of pet breeds, baby names and aesthetic craft production [27]. One challenge for the developing field is that the potential diversity of strategies is huge, and only a small number
and provide evidence of the use of various strategies, including copy if dissatisfied, copy when uncertain, and copy in a stable environment [96]. As yet, however, there is no evidence that rats copy selectively with respect to demonstrator age, familiarity, relatedness or success [96]. By contrast, chimpanzees (Pan troglodytes) disproportionately adopt the behaviour of the oldest and highest-ranking of two demonstrators [97], and vervet monkeys (Chlorocebus aethiops) preferentially copy dominant female models over dominant males (females are the philopatric sex in this species) [98]. These studies imply that even relatively simple animals are capable of flexibly using a range of social learning strategies. Although there is clearly scope for further comparative experiments, it is apparent from existing research that strategic learning behaviour has evolved in a range of taxa, with strikingly similar context-specific patterns of copying to those observed in humans clearly evident [58,59,61]. This suggests that the evolution of copying behaviour is best regarded as a convergent response to specific selection pressures, and might not be well predicted by the relatedness of a species to humans.
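The functional logic behind conditional strategies such as critical social learning can be illustrated with a deliberately minimal agent-based sketch. This is not any published model: the environment, payoff values and learning rules below are illustrative assumptions. Pure social learners copy a random member of the previous generation, asocial learners always relearn at a cost, and critical social learners copy first and fall back on costly asocial learning only when the copied behaviour turns out to be outdated:

```python
import random

B, C, U = 1.0, 0.4, 0.2    # payoff, cost of asocial learning, rate of change
N, GENS = 200, 1000        # population size, generations

def mean_payoff(strategy, seed=0):
    """Mean per-capita payoff of a monomorphic population."""
    rng = random.Random(seed)
    env = 0                          # current state of the environment
    behaviours = [env] * N           # everyone starts off adapted
    total = 0.0
    for _ in range(GENS):
        if rng.random() < U:         # environment moves to a novel state
            env += 1
        new, gen_pay = [], 0.0
        for _ in range(N):
            if strategy == "asocial":
                b, pay = env, B - C            # relearn from scratch
            else:
                b = rng.choice(behaviours)     # copy a random model
                pay = B if b == env else 0.0
                if strategy == "critical" and b != env:
                    b, pay = env, B - C        # copy failed: learn asocially
            new.append(b)
            gen_pay += pay
        behaviours = new
        total += gen_pay / N
    return total / GENS

scores = {s: mean_payoff(s) for s in ("asocial", "social", "critical")}
print(scores)
```

With these illustrative values the critical strategy earns roughly (1 − U)·B + U·(B − C) per generation, whereas a population of pure social learners never re-adapts after the first environmental change, echoing the result that social learning alone need not raise fitness.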
of plausible strategies have been subject to formal analyses. Nonetheless, many of these have received theoretical support, backed up in several cases by empirical evidence from humans or other animals (Figure 1). Strategies relate
Box 3. Modelling social learning from individuals to populations
A variety of theoretical approaches has been used to model the evolution of social learning strategies, commonly known as cultural evolution, gene–culture co-evolution and dual inheritance theory [5,9,10,14,16,18–21]. Typically, models are based on systems of recursions that track the frequencies of cultural and genetic variants in a population, often with fitness defined by the match between a behavioural phenotype and the environment. These systems range from those containing only two possible discrete behavioural variants through to traits that vary continuously along one or more dimensions, with evolutionarily stable strategy (ESS) and population-genetic analyses applied to these models [15,18,21]. Other approaches include multi-armed bandits (in which a number of discrete choices with different expected payoffs are available to players [8,11,32]), reaction-diffusion models (in which differential equations describe the change in frequency of cultural traits over time and incorporate individual learning biases [17]) and information-cascade games (in which individuals choose from a limited set of options after receiving private information and observing the decisions of previous actors [50,52]), all of which have been influential in identifying adaptive social learning strategies. The complexities of tracking genetic and cultural parameters over time, and the need to incorporate increasingly complex learning strategies, have led to greater use of simulation modelling in recent years [12–14,19,26], which has enabled researchers to build models that are spatially explicit [12] and to separately track knowledge and behaviour [32].
Here we illustrate the methods using a classic model of unbiased, directly biased and frequency-dependent biased cultural transmission, introduced by Boyd and Richerson [5]. Consider a cultural trait with two alternative variants, denoted c and d, acquired through social learning. The model tracks the spread of c in the population; the proportion of the population with c is denoted by p. Each individual in the population is exposed to three randomly selected cultural role models: thus, the probability of having i role models with trait c, given p, is M(i|p) = C(3, i) p^i (1 − p)^(3−i). To model cultural transmission with frequency-dependent bias, the strength of which is D, expressions for the probability that an individual acquires c when i role models have c are given in Table I (note that when D = 0, transmission is unbiased). This gives a recursion for the frequency of c in the population: p′ = p + Dp(1 − p)(2p − 1). A direct learning bias can be modelled by assuming that some feature of trait c renders it inherently more likely to be copied. B is the strength of this direct bias and the recursion expression is p′ = p + Bp(1 − p). These equations can be used to compare the fate of trait c over time under different transmission biases, and show that the different individual-level learning strategies produce different outcomes at the population level (Figure I).
Table I. Probability that an individual acquires trait c given its frequency in the set of cultural role models
Number of role models with c | Probability that a focal individual acquires c
0 | 0
1 | 1/3 − D/3
2 | 2/3 + D/3
3 | 1
Figure I. Individual-level transmission biases produce different outcomes at the population level. The figure shows the time course of trait c over ten cultural generations under unbiased transmission (D = 0), frequency-dependent bias (D = 0.5) and directly biased transmission (B = 0.3).
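The recursions in Box 3 are easy to iterate directly. A short sketch in plain Python (parameter values taken from the key to Figure I) reproduces the qualitative pattern: starting above one half, a conformist bias accelerates the trait towards fixation, a direct bias pushes it up from any starting frequency, and unbiased transmission leaves it unchanged:

```python
def freq_dependent(p, D):
    """One generation of frequency-dependent (conformist) transmission."""
    return p + D * p * (1 - p) * (2 * p - 1)

def direct_bias(p, B):
    """One generation of directly biased transmission."""
    return p + B * p * (1 - p)

def trajectory(step, p0, gens=10, **kwargs):
    """Iterate a one-generation recursion and return the full time course."""
    ps = [p0]
    for _ in range(gens):
        ps.append(step(ps[-1], **kwargs))
    return ps

unbiased = trajectory(freq_dependent, 0.6, D=0.0)    # D = 0: no change
conformist = trajectory(freq_dependent, 0.6, D=0.5)  # conformity
directly = trajectory(direct_bias, 0.6, B=0.3)       # direct bias
print(conformist[-1], directly[-1], unbiased[-1])
```

Starting the conformist trajectory below one half instead (e.g. p0 = 0.4) sends the trait towards loss, which is the defining property of positive frequency dependence.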
Figure 1. Social learning strategies for which there is significant theoretical or empirical support. The tree structure is purely conceptual and not based on any empirical data on homology or similarity of cognition. The sources given are not necessarily the first descriptions or the strongest evidence, but are intended as literature entry points for readers. The figure groups strategies as follows:
State based: copy if uncertain [96]; copy if personal information outdated [86]; copy if dissatisfied [11]; copy depending on reproductive state [90].
Frequency dependent: unbiased or random copying [9,66]; copy the majority (conformist bias) [5,91]; copy rare behaviour [54]; copy variants that are increasing in frequency [47]; sensitivity to the number of demonstrators [39]; copy if demonstrators consistent [53].
Model based: familiarity-based [48,59,95]; kin-based [62]; prestige-based [31]; dominance rank based [97]; age-based [92]; size-based [93]; gender-based [98]; based on the model’s knowledge [43].
Success based: copy if payoff better [87]; copy in proportion to payoff [88]; copy the most successful individual [35].
Content dependent: bias for social information [28]; bias for memorable or attractive variants [29]; bias derived from emotional reaction (e.g. disgust [30]).
Also shown: guided variation [5] (trial-and-error learning combined with unbiased transmission).
to both when it is best to choose social sources to acquire information and from whom one should learn. This latter class is often referred to as learning biases [5]. These can be based on content (such as a preference for social information [28], attractive information [29], or content that evokes a strong emotion such as disgust [30]) as well as context, such as the frequency of a trait in a population (e.g. a conformist bias towards adopting the majority behaviour), the payoff associated with it (e.g. copy the most successful individual), or some property of the individuals from whom one learns (model-based biases such as copy familiar individuals). Many studies have focussed on establishing the theoretical viability of a given strategy or a small number of strategies, and explored the conditions under which each is expected to prosper [5,11,12,15,16,18–21,31]. A different approach is to establish a framework within which the relative merits of a wide range of strategies can be evaluated
[11,32]. A recent example is the social learning strategies tournament [32], an open competition in which entrants submitted strategies specifying how agents should learn in order to prosper in a simulated environment (Box 4). This study relaxed some assumptions prevalent in the field, such as that asocial learning is more costly than social learning, to surprising effect. It revealed that copying pays under a far greater range of conditions than previously thought, even when extremely error-prone. In any given simulation involving the top-performing strategies, very little of the learning performed was asocial, and learning for the winning strategy was almost exclusively social. The strength of this result depends in part on the tournament assumption that individuals build up a repertoire of multiple behaviour patterns, rather than focussing on a single acquired behaviour, as in most analytical theory. This meant that when a copied behaviour turned out to confer low fitness, agents could switch rapidly to an alternative behaviour in the
Box 4. The social learning strategies tournament
The social learning strategies tournament was a computer-based competition in which entrants submitted a strategy specifying the best way for agents living in a simulated environment to learn [32]. The simulation environment was characterized as a multi-armed bandit [11] with, in this case, 100 possible arms or behaviour patterns that an agent could learn and subsequently exploit. Each behaviour had a payoff, drawn from an exponential distribution, and the payoff could change over time (the rate of change was a model parameter). This simulated environment contained a population of 100 agents, each controlled by one of the strategies entered into the tournament. In each model iteration, agents selected one of three moves, as specified by the strategy. The first, INNOVATE, resulted in an agent learning the identity and payoff of one new behaviour, selected at random. The second, EXPLOIT, represented an agent choosing to perform a behaviour it already knew and receiving the payoff associated with that behaviour (which might have changed from when the agent learned about it). The third, OBSERVE, represented an agent observing one or more of those agents who chose to play EXPLOIT, and learning the identity and payoff of the behaviour the observed agent was performing. Agents could only receive payoffs by playing EXPLOIT, and the fitness of agents was determined by the total payoff received divided by the number of iterations through which they had lived. Evolution occurred through a death–birth process, with dying agents replaced by the offspring of survivors; the probability of reproduction was proportional to fitness. Offspring would carry the same strategy as their parents with probability 0.98, such that successful strategies tended to increase in frequency, and another strategy with probability 0.02, so that strategies could invade and re-invade the population.
The most important finding was the success of strategies that relied almost entirely on copying (i.e. OBSERVE) to learn behaviour (Figure Ia). Social learning in this context proved an extremely robust and successful strategy because the exploited behaviour patterns available to copy constituted a select subset that had already been chosen for their high payoff (see the main text). The results also highlighted the parasitic nature of social learning, because successful strategies did worse when fixed in the population than when other strategies were present and providing information (Figure Ib).
Figure I. Social learning strategies tournament results [32]. (a) Strategy score plotted against the proportion of the learning moves that were OBSERVE for that strategy. (b) Final score for the top ten strategies when competing simultaneously with other strategies (black) and individual fitness, measured as mean lifetime payoff, in populations containing only single strategies (red).
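Why OBSERVE pays can be demonstrated in miniature. The sketch below is not the tournament code, just a stripped-down Python illustration of the filtering argument: agents who EXPLOIT play the best behaviour they know, so the behaviours available to OBSERVE are a payoff-selected, non-random sample, whereas INNOVATE draws behaviours at random. All parameter values are illustrative:

```python
import random

rng = random.Random(42)
N_ARMS, P_CHANGE, ROUNDS = 100, 0.05, 400

# Current payoff of each behaviour, drawn from an exponential distribution.
payoff = [rng.expovariate(1.0) for _ in range(N_ARMS)]

class Agent:
    def __init__(self, social):
        self.social = social      # learns by OBSERVE rather than INNOVATE
        self.repertoire = {}      # behaviour -> payoff last seen

agents = [Agent(social=(i % 2 == 0)) for i in range(40)]
innovated, observed = [], []      # payoffs of behaviours at acquisition
last_exploited = []               # behaviours exploited in the previous round

for _ in range(ROUNDS):
    exploited_now = []
    for ag in agents:
        learn = not ag.repertoire or rng.random() < 0.1
        if learn and ag.social and last_exploited:
            arm = rng.choice(last_exploited)          # OBSERVE an exploiter
            ag.repertoire[arm] = payoff[arm]
            observed.append(payoff[arm])
        elif learn:
            arm = rng.randrange(N_ARMS)               # INNOVATE at random
            ag.repertoire[arm] = payoff[arm]
            innovated.append(payoff[arm])
        else:                                         # EXPLOIT best known
            arm = max(ag.repertoire, key=ag.repertoire.get)
            ag.repertoire[arm] = payoff[arm]          # refresh stale memory
            exploited_now.append(arm)
    last_exploited = exploited_now
    for a in range(N_ARMS):                           # environment changes
        if rng.random() < P_CHANGE:
            payoff[a] = rng.expovariate(1.0)

obs_mean = sum(observed) / len(observed)
inn_mean = sum(innovated) / len(innovated)
print(f"mean payoff of observed arms {obs_mean:.2f} vs innovated {inn_mean:.2f}")
```

Copied behaviour inherits the demonstrators’ selectivity: the observed sample is biased towards high payoffs even though no observer exercises any choice over whom to copy.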
repertoire, thereby removing one of the drawbacks to copying identified in the analytical literature. The tournament also highlighted the role of copied individuals as filters of information. Previous theory had placed the onus on learners to perform this adaptive filtering [15], demanding selectivity, and therefore specific cognitive capabilities, on the part of the copier. However, the tournament established that even nonselective copying is beneficial relative to asocial learning, because copied individuals typically perform the highest payoff behaviour in their repertoire, generating a non-random sample of high-performance behaviour for others to copy. These insights go some way to explaining the discrepancy between Rogers’ analysis and the empirical fact of human reliance on social information. They also help to explain why social learning is so widespread in nature, observed not just in primates and birds [3], but even in fruit flies and crickets [4]: even indiscriminate copying is generally more efficient than trial-and-error learning. However, because of its design, the tournament provided no information on the issue of from whom one should learn. A similar study incorporating individual identities would be potentially
informative, and we suspect that selectivity here would confer additional fitness benefits. Conclusions as to which strategies are likely to prosper depend inevitably on the assumptions built into the models. For example, the conditional strategies described above depend on individuals knowing immediately the payoff of a behavioural option, but this information is not always available. If everyone else is planting potatoes, should you plant potatoes or another crop? Information on the relative payoffs will not be available for months, so a simple conditional strategy is not viable. An influential view is that under such circumstances, it pays to conform to the local traditions [4,16]. Indeed, theoretical models suggest that natural selection should favour such a conformist bias over most conditions that favour social learning [16], which brings us closer to an evolutionary understanding of the behavioural alignment prevalent in human herding behaviour [33]. However, this view has been challenged by subsequent analyses pointing out that conformity can hinder the adoption of good new ideas (and, by inference, cumulative cultural evolution), and therefore can be expected to perform relatively poorly in some circumstances,
particularly in changing environments [19,20]. More recent analyses suggest, however, that the strength of conformity is expected to vary with environmental stability and learning costs [18,21]. One way through this debate stems from the suggestion that conformity is only widely favoured when weak, because weak conformity acts to increase the frequency of beneficial variants when they are common, but its action is insufficient to prevent their spread when rare [17]. Such debates, and the formal theory in general, have stimulated an increase in empirical research on the strategic nature of human social learning (Figure 1) that sets out to determine whether copying behaviour fits with the theoretical predictions.
Empirical studies
Empirical investigations of social learning strategies in humans span a range of scales, from laboratory studies that pick apart the factors affecting minute-by-minute decisions at the individual level [34,35] through to observational work that seeks to explain the population-level frequencies of socially transmitted traits in historical and archaeological data [36–38]. Laboratory-based experiments have been successful in revealing the variety and subtlety of human social information use. Although there is a long tradition of such studies in social psychology [39], the new wave of research that we review here is different because it is rooted in the formal evolutionary theory described above [40]. Thus, whereas social psychology can provide immediate descriptions of the way in which people use social information, more recent research on social learning strategies seeks to link such observations with functional evolutionary explanations [40]. The use of micro-societies [41] and transmission chains [28], in which social learning is studied experimentally in small groups or chains of subjects that change composition, has been very productive.
Such experiments have provided evidence of many of the biases explored in the theoretical literature. Examples include a bias for copying successful [35,42] or knowledgeable [43] models, a tendency to conform to route choices [44] and increased reliance on social information when payoff information is delayed [45] or at low rates of environmental change [46]. These experiments have also provided new insights not anticipated by theory; for example, it has been shown that people prefer variants that are increasing in frequency [47] and that in some circumstances people pay more attention to social information that originates outside their own sociocultural group [48]. Recently, some researchers in economics have started to introduce social learning into the experimental study of strategic games. Studies have shown that introduction of intergenerational social information can establish long-term social conventions that do not necessarily represent the predicted optimal strategy for any player [49,50], can drive up contributions in public-goods games [51], and can reveal unexpected biases in people’s valuation of information sources, such as an over-weighting of private information in some conditions [52]. However, this research has yet to overlap with research on social learning strategies, which can potentially provide explanations for this apparently suboptimal behaviour in terms of the inherent biases people have about using social information. Importantly, these studies can also throw up significant challenges to existing theory, such as individual variation in people’s responses to social information, which has not yet been considered in the theoretical literature. Some subjects show a greater propensity to use social information than others, and those who do use social information can do so in different ways [34,47,53]. In a recent study using a simple binary choice task (choose the red or the blue technology), only a subset of subjects behaved as predicted by the conformist learning model, with the remaining ‘maverick’ subjects apparently ignoring social information altogether [34]. In another example, reading positive reviews of a piece of music caused some subjects to increase their valuation of that tune, whereas a significant minority actually decreased their evaluations [53]. Social psychology studies suggest that people will switch between conformity and anti-conformity depending on the social context, and are more or less likely to use social information depending on their mood [54]. Such flexibility is not inconsistent with an evolutionary explanation, but rather implies context-specific use of strategies [7]. The extent to which current theory needs to incorporate state-dependent and contextual cues requires exploration, and new formal methods are becoming available that facilitate such extensions [55]. Another area in which empirical and theoretical studies can inform each other is the ontogeny of learning strategies. Early in life, a child is surrounded by adults who have presumably engaged in decades of the kind of knowledge filtering that can make social learning adaptive. Young children have a tendency to imitate even irrelevant actions indiscriminately [56], which might reflect this informational imbalance.
Evidence from attention studies suggests that very young infants have evolved mechanisms to focus attention on subtle cues given by their carers that indicate when important information is being made available [57]. As they grow and interact with a wider range of people, the challenge becomes less a problem of when and more of from whom to learn. This is when model-based, payoff-based, or frequency-dependent biases would become more pertinent. There is ample evidence of model-based learning biases in young children [58–60] and in a surprising number of instances these echo similar patterns observed in other animals (Box 2). For example, preschool-age children (3 years) tend to trust information presented to them by familiar teachers more strongly than that given by unfamiliar teachers [59]. In a follow-on study, older children (5 years) further increased their trust in the information supplied by a familiar teacher who presented information that the children knew to be accurate, but reduced trust when the teacher provided inaccurate information, whereas the trust of younger children in familiar teachers was unaffected by the accuracy of the information provided [61], an example of the way we might expect adaptive social learning strategies to vary ontogenetically. More studies of how learning biases change during life, extending into adolescence and adult life, would be highly instructive in both humans and other animals.
Recent empirical work on social learning has also escaped the laboratory, which is vital for external validity. For instance, studies in traditional Fijian populations have found that food taboos that lead pregnant and lactating women to avoid consumption of toxic fish are initially transmitted through families, but as individuals get older they preferentially seek out local prestigious individuals to refine their knowledge [62]. Formal theory suggests that such learning strategies are highly adaptive [5]. Another study used the two-technology choice task in the subsistence pastoralist population of the Bolivian Altiplano, where a comparative lack of reliance on social information suggested that subtle effects of setting and cultural background probably play an important role in human social learning [63]. These results emphasize flexibility in the use of social information. The combination of novel theory with empirical data has also been successful in understanding the spread of cultural traits across populations. Different social learning strategies lead to different transmission dynamics at the population level, generating detectable signatures in the frequency distributions and temporal dynamics of cultural traits. Comparison of real data with expected distributions can therefore indicate the processes behind the spread of ideas, trends and interests. This approach has been successful in highlighting several cultural domains where unbiased, or random, copying seems to dominate, such as the popularity of baby names, music and choice of dog breeds [37], and of the use of complementary and traditional medicines [64]. It has also illustrated the interactions between independent decisions and social transmission in the spread of interest in disease pandemics such as the H5N1 bird flu virus [65].
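The random copying behind these population-level signatures has a standard simulation analogue, a neutral infinite-alleles style model. In the sketch below (plain Python; the population size and innovation rate are illustrative assumptions), each newborn copies a name uniformly at random from the previous generation, so any name is adopted in proportion to its current frequency, with occasional wholly new names invented:

```python
import random
from collections import Counter

rng = random.Random(1)
N, GENS, MU = 500, 100, 0.01    # population, generations, innovation rate

names = list(range(N))           # start with every individual unique
next_new = N
for _ in range(GENS):
    new_names = []
    for _ in range(N):
        if rng.random() < MU:    # rare innovation: a brand-new name
            new_names.append(next_new)
            next_new += 1
        else:                    # random copying: sample ∝ frequency
            new_names.append(rng.choice(names))
    names = new_names

freqs = Counter(names)
print(freqs.most_common(5))      # a few names drift to high frequency
```

Although every “parent” here decides independently, the population-level outcome is the heavy-tailed, drift-like frequency distribution reported for baby names and dog breeds [37].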
Here, random copying refers to unbiased copying in direct proportion to the rate at which a trait is observed, and does not imply that individual decision-making is random. For instance, in spite of all of the thought and care that individual parents put into choosing their child’s name, parents as a group behave in a manner that is identical to the case in which they choose names at random [37]. The reason for this is nothing more than that common names are more likely to be observed and considered by parents than obscure names, and the likelihood that a name is chosen is approximately proportional to its frequency at the time. These studies also reveal how the drift-like dynamics that result from random copying can be perturbed by the influence of key events, such as a spike in popularity of the Dalmatian dog breed after the re-release of 101 Dalmatians, a film that artificially inflated the number of Dalmatians observed [37]. This work is important because it provides potential tools for interpreting more ancient data when we have much less knowledge of the social context at the time [38,66,67].
Concluding remarks
The work we have reviewed here opens up a rich seam of opportunities for future development in several disciplines, from anthropology and cultural evolution through to economics and artificial life. Here we focus on just three. The first is related to the study of cooperation. One of the more intriguing results from the social learning strategies tournament was the parasitic effect of strategies that used
only social learning. The way that a population learns can be viewed as a cooperation problem: innovators who engage in asocial learning are altruistic cooperators who introduce new information, whereas copiers are defectors who exploit that information. The tournament showed how, at the individual level, the temptation to defect (i.e. copy) is very powerful, but also that populations of defectors do worse than more mixed populations, which creates a classical cooperation dilemma. Although some have recognized the link [5,25,68], there is much to be done before the interactions between social learning strategies, cultural evolution and the evolution of cooperation are fully understood [69,70]. Second, we highlight the way in which computer scientists are now starting to use the concept of strategic social learning, and its interactions with individual learning and genetic evolution, to develop novel algorithms for evolutionary computing [71,72]. These studies show that social learning using a fixed strategy of copying from the most successful individuals significantly increases the success of agents exploring a complex fitness landscape (specifically the NK landscape widely adopted as a test bed for evolutionary computation), a result that strikingly parallels anthropological research on human social learning [35]. The prospect that research on social learning strategies can simultaneously provide inspiration for those working at the cutting edge of technology while benefiting from the novel insights such a dynamic field can produce is tremendously exciting. Finally, we see open fields for research into the neurobiological basis of social learning. Hitherto, most experimental neuroscience concerned with learning and decision-making has focused largely on asocial learning, in spite of the important role of social influences on human learning. Research exploring the brain pathways and structures used in social learning and socially biased decision-making is needed.
One pressing question is to what extent different social learning processes and strategies map onto different neural circuits. A pioneering study exploring how the opinion of others affects the valuation of objects has revealed that the human anterior insula cortex or lateral orbitofrontal cortex uniquely responds to the unanimous opinions of others [53]. This finding is suggestive of an evolved neural sensitivity to consistency in demonstrator behaviour, and is consistent with an economics experiment suggesting that people find following social information more reinforcing than payoffs alone would predict [8]. Another key issue is whether our brains contain circuitry specific to social information processing, or whether these processes piggyback on established reinforcement learning circuitry. Recent evidence is suggestive of the latter [73], but our general lack of knowledge in this area is profound.

Clearly, the study of social learning strategies is a rapidly growing field with implications for multiple areas of research (Box 5). The empirical studies reviewed here reveal the subtlety and complexity of the learning strategies used by humans. An important contribution of this work, in parallel with studies on non-humans, is to challenge the notion of a single best strategy, or of a strategy associated with a particular type of individual or species.
Review

Box 5. Questions for future research

- How do the performances of various learning strategies generalize across different learning environments?
- Can social learning be studied as a cooperation game? Innovators who engage in asocial learning could be viewed as altruistic cooperators who introduce new information, whereas copiers are defectors who exploit that information. Conversely, how might social learning strategies affect the establishment and maintenance of cooperation?
- Can social learning be used to develop novel algorithms for evolutionary computing and robotics?
- Do our brains contain circuitry specific to social information processing, or do these processes piggyback on established reinforcement learning circuitry?
Rather, recent work emphasizes the way in which the flexible, context-dependent use of a range of subtle biases is a general feature of social learning, in both humans and other animals. In the future, this should inspire theoretical researchers in turn to take on the challenge of incorporating such meta-strategies into their models.

Acknowledgements

This work was funded by an ERC Advanced Fellowship to K.N.L.
References

1 Heyes, C.M. (1994) Social learning in animals: categories and mechanisms. Biol. Rev. 69, 207–231 2 Giraldeau, L-A. et al. (2003) Potential disadvantages of using socially acquired information. Phil. Trans. R. Soc. Lond. B 357, 1559–1566 3 Hoppitt, W. and Laland, K.N. (2008) Social processes influencing learning in animals: a review of the evidence. Adv. Study Behav. 38, 105–165 4 Leadbeater, E. and Chittka, L. (2007) Social learning in insects – from miniature brains to consensus building. Curr. Biol. 17, R703–R713 5 Boyd, R. and Richerson, P.J. (1985) Culture and the Evolutionary Process, Chicago University Press 6 Henrich, J. and McElreath, R. (2003) The evolution of cultural evolution. Evol. Anthropol. 12, 123–135 7 Laland, K.N. (2004) Social learning strategies. Learn. Behav. 32, 4–14 8 Biele, G. et al. (2009) Computational models for the combination of advice and individual learning. Cognitive Sci. 33, 206–242 9 Cavalli-Sforza, L.L. and Feldman, M.W. (1981) Cultural Transmission and Evolution: A Quantitative Approach, Princeton University Press 10 Rogers, A. (1988) Does biology constrain culture? Am. Anthropol. 90, 819–831 11 Schlag, K.H. (1998) Why imitate, and if so, how? J. Econ. Theory 78, 130–156 12 Rendell, L. et al. (2010) Rogers’ paradox recast and resolved: population structure and the evolution of social learning strategies. Evolution 64, 534–548 13 Whitehead, H. and Richerson, P.J. (2009) The evolution of conformist social learning can cause population collapse in realistically variable environments. Evol. Hum. Behav. 30, 261–273 14 Kameda, T. and Nakanishi, D. (2003) Does social/cultural learning increase human adaptability? Rogers’s question revisited. Evol. Hum. Behav. 24, 242–260 15 Enquist, M. et al. (2007) Critical social learning: a solution to Rogers’ paradox of non-adaptive culture. Am. Anthropol. 109, 727–734 16 Henrich, J. and Boyd, R.
(1998) The evolution of conformist transmission and the emergence of between-group differences. Evol. Hum. Behav. 19, 215–241 17 Kandler, A. and Laland, K.N. (2009) An investigation of the relationship between innovation and cultural diversity. Theor. Popul. Biol. 76, 59–67 18 Kendal, J. et al. (2009) The evolution of social learning rules: payoff-biased and frequency-dependent biased transmission. J. Theor. Biol. 260, 210–219
19 Eriksson, K. et al. (2007) Critical points in current theory of conformist social learning. J. Evol. Psychol. 5, 67–87 20 Wakano, J.Y. and Aoki, K. (2007) Do social learning and conformist bias coevolve? Henrich and Boyd revisited. Theor. Popul. Biol. 72, 504–512 21 Nakahashi, W. (2007) The evolution of conformist transmission in social learning when the environment changes periodically. Theor. Popul. Biol. 72, 52–66 22 Henrich, J. and McElreath, R. (2007) Dual inheritance theory: the evolution of human cultural capacities and cultural evolution. In Oxford Handbook of Evolutionary Psychology (Barrett, R.D. and Marrett, L., eds), Oxford University Press, pp. 555–570 23 McElreath, R. et al. (2008) Beyond existence and aiming outside the laboratory: estimating frequency-dependent and pay-off-biased social learning strategies. Phil. Trans. R. Soc. B 363, 3515–3528 24 Franz, M. and Matthews, L.J. (2010) Social enhancement can create adaptive, arbitrary and maladaptive cultural traditions. Proc. R. Soc. Lond. B 277, 3363–3372 25 Richerson, P.J. and Boyd, R. (2005) Not by Genes Alone, University of Chicago Press 26 Franz, M. and Nunn, C.L. (2009) Rapid evolution of social learning. J. Evol. Biol. 22, 1914–1922 27 Bentley, R.A. et al. (2004) Random drift and culture change. Proc. R. Soc. Lond. B 271, 1443–1450 28 Mesoudi, A. et al. (2006) A bias for social information in human cultural transmission. Brit. J. Psychol. 97, 405–423 29 Bangerter, A. and Heath, C. (2004) The Mozart effect: tracking the evolution of a scientific legend. Brit. J. Soc. Psychol. 43, 605–623 30 Heath, C. et al. (2001) Emotional selection in memes: the case of urban legends. J. Pers. Soc. Psychol. 81, 1028–1041 31 Henrich, J. and Gil-White, F.J. (2001) The evolution of prestige: freely conferred deference as a mechanism for enhancing the benefits of cultural transmission. Evol. Hum. Behav. 22, 165–196 32 Rendell, L. et al.
(2010) Why copy others? Insights from the social learning strategies tournament. Science 328, 208–213 33 Raafat, R.M. et al. (2009) Herding in humans. Trends Cogn. Sci. 13, 420–428 34 Efferson, C. et al. (2008) Conformists and mavericks: the empirics of frequency-dependent cultural transmission. Evol. Hum. Behav. 29, 56–64 35 Mesoudi, A. (2008) An experimental simulation of the ‘copy-successful-individuals’ cultural learning strategy: adaptive landscapes, producer–scrounger dynamics, and informational access costs. Evol. Hum. Behav. 29, 350–363 36 Atkinson, Q.D. et al. (2008) Languages evolve in punctuational bursts. Science 319, 588 37 Bentley, R.A. et al. (2007) Regular rates of popular culture change reflect random copying. Evol. Hum. Behav. 28, 151–158 38 Hamilton, M.J. and Buchanan, B. (2009) The accumulation of stochastic copying errors causes drift in culturally transmitted technologies: quantifying Clovis evolutionary dynamics. J. Anthropol. Archaeol. 28, 55–69 39 Bond, R. (2005) Group size and conformity. Group Processes & Intergroup Relations 8, 331–354 40 Mesoudi, A. (2009) How cultural evolutionary theory can inform social psychology and vice versa. Psychol. Rev. 116, 929–952 41 Baum, W.M. et al. (2004) Cultural evolution in laboratory microsocieties including traditions of rule giving and rule following. Evol. Hum. Behav. 25, 305–326 42 Apesteguia, J. et al. (2007) Imitation – theory and experimental evidence. J. Econ. Theory 136, 217–235 43 Henrich, J. and Broesch, J. (2011) On the nature of cultural transmission networks: evidence from Fijian villages for adaptive learning biases. Phil. Trans. R. Soc. B, in press 44 Reader, S.M. et al. (2008) Social learning of novel route preferences in adult humans. Biol. Lett. 4, 37–40 45 Caldwell, C.A. and Millen, A.E. (2010) Conservatism in laboratory microsocieties: unpredictable payoffs accentuate group-specific traditions. Evol. Hum. Behav. 31, 123–130 46 Toelch, U. et al.
(2009) Decreased environmental variability induces a bias for social information use in humans. Evol. Hum. Behav. 30, 32–40 47 Toelch, U. et al. (2010) Humans copy rapidly increasing choices in a multiarmed bandit problem. Evol. Hum. Behav. 31, 326–333
48 Healy, A. (2009) How effectively do people learn from a variety of different opinions? Exp. Econ. 12, 386–416 49 Schotter, A. and Sopher, B. (2003) Social learning and coordination conventions in intergenerational games: an experimental study. J. Polit. Econ. 111, 498–529 50 Kübler, D. and Weizsäcker, G. (2004) Limited depth of reasoning and failure of cascade formation in the laboratory. Rev. Econ. Stud. 71, 425–441 51 Chaudhuri, A. et al. (2006) Social learning and norms in a public goods experiment with inter-generational advice. Rev. Econ. Stud. 73, 357–380 52 Goeree, J.K. et al. (2007) Self-correcting information cascades. Rev. Econ. Stud. 74, 733–762 53 Campbell-Meiklejohn, D.K. et al. (2010) How the opinion of others affects our valuation of objects. Curr. Biol. 20, 1165–1170 54 Griskevicius, V. et al. (2006) Going along versus going alone: when fundamental motives facilitate strategic (non)conformity. J. Pers. Soc. Psychol. 91, 281–294 55 Tinker, M.T. et al. (2009) Learning to be different: acquired skills, social learning, frequency dependence, and environmental variation can cause behaviourally mediated foraging specializations. Evol. Ecol. Res. 11, 841–869 56 Lyons, D.E. et al. (2007) The hidden structure of overimitation. Proc. Natl. Acad. Sci. U. S. A. 104, 19751–19756 57 Gergely, G. et al. (2005) The social construction of the cultural mind: imitative learning as a mechanism of human pedagogy. Interaction Studies 6, 463–481 58 Harris, P.L. (2007) Trust. Dev. Sci. 10, 135–138 59 Corriveau, K. and Harris, P.L. (2009) Choosing your informant: weighing familiarity and recent accuracy. Dev. Sci. 12, 426–437 60 Pasquini, E.S. et al. (2007) Preschoolers monitor the relative accuracy of informants. Dev. Psychol. 43, 1216–1226 61 Harris, P.L. and Corriveau, K. (2011) Young children’s selective trust in informants. Phil. Trans. R. Soc. B, in press 62 Henrich, J. and Henrich, N.
(2010) The evolution of cultural adaptations: Fijian food taboos protect against dangerous marine toxins. Proc. R. Soc. Lond. B 277, 3715–3724 63 Efferson, C. et al. (2007) Learning, productivity, and noise: an experimental study of cultural transmission on the Bolivian Altiplano. Evol. Hum. Behav. 28, 11–17 64 Tanaka, M.M. et al. (2009) From traditional medicine to witchcraft: why medical treatments are not always efficacious. PLoS ONE 4, e5192 65 Bentley, R.A. and Ormerod, P. (2010) A rapid method for assessing social versus independent interest in health issues: a case study of ‘bird flu’ and ‘swine flu’. Social Science & Medicine 71, 482–485 66 Shennan, S.J. and Wilkinson, J.R. (2001) Ceramic style change and neutral evolution: a case study from Neolithic Europe. Am. Antiq. 66, 577–593 67 Rogers, D.S. et al. (2009) Inferring population histories using cultural data. Proc. R. Soc. Lond. B 276, 3835–3843 68 Sigmund, K. et al. (2010) Social learning promotes institutions for governing the commons. Nature 466, 861–863 69 Boyd, R. and Richerson, P.J. (2009) Culture and the evolution of human cooperation. Phil. Trans. R. Soc. B 364, 3281–3288 70 West, S.A. et al. (2010) Sixteen common misconceptions about the evolution of cooperation in humans. Evol. Hum. Behav. DOI: 10.1016/j.evolhumbehav.2010.08.001 71 Hashimoto, T. et al. (2010) New composite evolutionary computation algorithm using interactions among genetic evolution, individual learning and social learning. Intell. Data Anal. 14, 497–514 72 Curran, D. et al. (2007) Evolving cultural learning parameters in an NK fitness landscape. In Proceedings of the 9th European Conference on Advances in Artificial Life, pp. 304–314, Springer-Verlag
73 Klucharev, V. et al. (2009) Reinforcement learning signal predicts social conformity. Neuron 61, 140–151 74 Heyes, C.M. (1993) Imitation, culture and cognition. Anim. Behav. 46, 999–1010 75 Whiten, A. and Ham, R. (1992) On the nature and evolution of imitation in the animal kingdom: reappraisal of a century of research. Adv. Study Behav. 21, 239–283 76 Zentall, T.R. (1996) An analysis of imitative learning in animals. In Social Learning in Animals: The Roots of Culture (Heyes, C.M. and Galef, B.G., eds), pp. 221–243, Academic Press 77 Heyes, C.M. and Ray, E.D. (2000) What is the significance of imitation in animals? In Advances in the Study of Behavior (Slater, P.J.B. et al., eds), pp. 215–245, Academic Press 78 Whiten, A. et al. (2004) How do apes ape? Learn. Behav. 32, 36–52 79 Huber, L. et al. (2009) The evolution of imitation: what do the capacities of non-human animals tell us about the mechanisms of imitation? Phil. Trans. R. Soc. B 364, 2299–2309 80 Brass, M. and Heyes, C. (2005) Imitation: is cognitive neuroscience solving the correspondence problem? Trends Cogn. Sci. 9, 489–495 81 Heyes, C. (2009) Evolution, development and intentional control of imitation. Phil. Trans. R. Soc. B 364, 2293–2298 82 Hoppitt, W. et al. (2010) Detecting social transmission in networks. J. Theor. Biol. 263, 544–555 83 Kendal, R.L. et al. (2009) Identifying social learning in animal populations: a new ‘option-bias’ method. PLoS ONE 4, e6541 84 Hoppitt, W.J.E. et al. (2008) Lessons from animal teaching. Trends Ecol. Evol. 23, 486–493 85 Kendal, R.L. et al. (2005) Trade-offs in the adaptive use of social and asocial learning. In Advances in the Study of Behavior (Slater, P.J.B. et al., eds), pp. 333–379, Academic Press 86 van Bergen, Y. et al. (2004) Nine-spined sticklebacks exploit the most reliable source when public and private information conflict. Proc. R. Soc. Lond. B 271, 957–962 87 Kendal, J.R. et al.
(2009) Nine-spined sticklebacks deploy a hill-climbing social learning strategy. Behav. Ecol. 20, 238–244 88 Pike, T.W. et al. (2010) Learning by proportional observation in a species of fish. Behav. Ecol. 21, 570–575 89 Webster, M.M. and Laland, K.N. (2008) Social learning strategies and predation risk: minnows copy only when using private information would be costly. Proc. R. Soc. Lond. B 275, 2869–2876 90 Webster, M.M. and Laland, K.N. Reproductive state affects reliance on public information in sticklebacks. Proc. R. Soc. Lond. B, published online September 8, 2010, DOI: 10.1098/rspb.2010.1562 91 Pike, T.W. and Laland, K.N. (2010) Conformist learning in nine-spined sticklebacks’ foraging decisions. Biol. Lett. 6, 466–468 92 Dugatkin, L.A. and Godin, J.-G.J. (1993) Female mate copying in the guppy (Poecilia reticulata): age-dependent effects. Behav. Ecol. 4, 289–292 93 Duffy, G.A. et al. (2009) Size-dependent directed social learning in nine-spined sticklebacks. Anim. Behav. 78, 371–375 94 Godin, J.G. and Dugatkin, L.A. (1996) Female mating preference for bold males in the guppy, Poecilia reticulata. Proc. Natl. Acad. Sci. U. S. A. 93, 10262–10267 95 Swaney, W. et al. (2001) Familiarity facilitates social learning of foraging behaviour in the guppy. Anim. Behav. 62, 591–598 96 Galef, B.G., Jr (2009) Strategies for social learning: testing predictions from formal theory. Adv. Study Behav. 39, 117–151 97 Horner, V. et al. (2010) Prestige affects cultural learning in chimpanzees. PLoS ONE 5, e10625 98 van de Waal, E. et al. (2010) Selective attention to philopatric models causes directed social learning in wild vervet monkeys. Proc. R. Soc. Lond. B 277, 2105–2111
Review
Visual search in scenes involves selective and nonselective pathways

Jeremy M. Wolfe, Melissa L.-H. Võ, Karla K. Evans and Michelle R. Greene

Brigham & Women's Hospital, Harvard Medical School, 64 Sidney St. Suite 170, Cambridge, MA 02139, USA
How does one find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This article argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes might be best explained by a dual-path model: a ‘selective’ path in which candidate objects must be individually selected for recognition and a ‘nonselective’ path in which information can be extracted from global and/or statistical information.

Searching and experiencing a scene

It is an interesting aspect of visual experience that you can look for an object that is, literally, right in front of your eyes, yet not find it for an appreciable period of time. It is clear that you are seeing something at the location of the object before you find it. What is that something and how do you go about finding that desired object? These questions have occupied visual search researchers for decades. Whereas visual search papers have conventionally described search as an important real-world task, the bulk of research had observers looking for targets among some number of distractor items, all presented in random configurations on otherwise blank backgrounds. During the past decade, there has been a surge of work using more naturalistic scenes as stimuli and this has raised the issue of the relationship of search to the structure of the scene. In this article, we briefly summarize some of the models and solutions developed with artificial stimuli and then describe what happens when these ideas confront search in real-world scenes.
Corresponding author: Wolfe, J.M. ([email protected]).

We argue that the process of object recognition, required for most search tasks, involves the selection of individual candidate objects because all objects cannot be recognized at once. At the same time, the experience of a continuous visual field tells you that some aspects of a scene reach awareness without being limited by the selection bottleneck in object recognition. Work in the past decade has revealed how this nonselective processing is put to use when you search in real scenes.

Classic guided search

One approach to search, developed from studies of simple stimuli randomly placed on blank backgrounds, can be
called ‘classic guided search’ [1]. It has roots in Treisman’s Feature Integration Theory [2]. As we briefly review below, it holds that search is necessary because object recognition processes are limited to one or, perhaps, a few objects at one time. The selection of candidate objects for subsequent recognition is guided by preattentively acquired information about a limited set of attributes, such as color, orientation and size.

Object recognition is capacity limited

You need to search because, although you are good at recognizing objects, you cannot recognize multiple objects simultaneously. For example, all of the objects in Figure 1 are simple in construction, but if you are asked to find ‘T’s that are both purple and green, you will find that you need to scrutinize each item until you stumble upon the targets (there are four). It is introspectively obvious that you can see a set of items and could give reasonable estimates for their number, color, and so forth. However, recognition of a specific type of item requires another step of binding the visual features together [3]. That step is capacity limited and, often, attention demanding [4] (however, see [5]). In the case of Figure 1, the ability to recognize one object is also going to be limited by the proximity of other, similar items. These ‘crowding’ phenomena have attracted increasing interest in the past few years [6,7]. However, although it would be a less compelling demonstration, it would still be necessary to attend to item after item to bind their features and recognize them even if there were only a few items and even if those were widely spaced [8].

The selection mechanism is a serial–parallel hybrid

Whereas it is clear that object recognition is capacity limited, the nature of that limitation has been less clear (for an earlier discussion of this issue, see [9]).
The classic debate has been between ‘serial’ models that propose that items are processed one after the other [2] and ‘parallel’ models that hold that multiple objects, perhaps all objects, are processed simultaneously but that the efficiency of processing of any one item decreases as the number of items increases [10,11]. The debate has been complicated by the fact that the classic reaction time data, used in many experiments, are ambiguous in the sense that variants of serial and parallel models can produce the same patterns of data [12]. Neural evidence has been found in support of both types of process (Box 1). Similar to many cognitive science debates, the correct answer to the serial–parallel debate is probably ‘both’. Consider the timing parameters of search. One can estimate the rate at which items are processed from the slopes of the reaction time (RT) by set size functions. Although the estimate depends on assumptions about factors such as memory for rejected distractors (Box 2), it is in the range of 20–50 msec/item for easily identified objects that do not need to be individually fixated [13]. This estimate is significantly faster than any estimate of the total amount of time required to recognize an object [14]. Even on the short end, object recognition seems to require more than 100 msec/item (<10 items/sec). Note that we are speaking about the time required to identify an object, not the minimum time that an observer must be exposed to an object, which can be very short indeed [15].

Figure 1. Find the four purple-and-green Ts. Even though it is easy to identify such targets, this task requires search.

1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.12.001

As a solution to this mismatch of times, Moore and Wolfe [16] proposed a metaphorical ‘carwash’ (also called a ‘pipeline’ in computer science). Items might enter the binding
and recognition carwash one after another every 50 msec or so. Each item might remain in the process of recognition for several hundred milliseconds. As a consequence, if an experimenter looked at the metaphorical front or back of the carwash, serial processing would dominate, but if one looked at the carwash as a whole, one would see multiple items in the process of recognition in parallel. Other recent models also have a serial–parallel hybrid aspect, although they often differ from the carwash in detail [17,18]. Consider, for example, models of search with a primary focus on eye movements [19–21]. Here, the repeated fixations impose a form of serial selection every 250 msec or so. If one proposes that five or six items are processed in parallel at each fixation, one can produce the throughput of 20–30 items/second found in search experiments. Interestingly, with large stimuli that can be resolved in the periphery, the pattern of response time data is similar with and without eye movements [22]. Given the close relationship of eye movements and attention [23], it could be proposed that search is accomplished by selecting successive small groups of items, whether the eyes move or not. Note that all of these versions are hybrids of some serial selection and parallel processing.

Box 1. Neural signatures of parallel and serial processing

What would parallel and serial processing look like at a neuronal level? One type of parallel processing in visual search is the simultaneous enhancement of all items with a preferred feature (e.g. all the red items). Several studies have shown that, for cells demonstrating a preference for a specific feature, the preference is stronger when the task is to find items with that feature [77].

For serial processing, one would like to see the ‘spotlight’ of attention moving around from location to location. Buschman and Miller [78] saw something similar to this when it turned out that monkeys in their experiment liked to search a circular array of items in the same sequence on every trial. As a result, with multiple electrodes in place, the authors could see an attentional enhancement rise at the 3 o’clock position, then fall at 3 and rise at 6, as attention swept around in a serial manner to find a target that might be at the 9 o’clock position in that particular trial. Similar shifts of attention can be seen in human evoked potential recordings [79].

Bichot et al. [80] produced an attractive illustration of both processes at work in visual area V4. When the monkey was searching for ‘red’, a cell that liked red would be more active, no matter where the monkey was looking and/or attending. If the next eye movement was going to take the target item into the receptive field of the cell, the cell showed another burst of activity as serial attention reached it in advance of the eyes.

Box 2. Memory in visual search

There is a body of seemingly contradictory findings about the role of memory in search. First, there is the question of memory during a search. Do observers keep track of where they have been, for example, by inhibiting rejected distractors? There is some evidence for inhibition of return in visual search [81,82], although it seems clear that observers cannot use inhibition to mark every rejected distractor [16,83]. Plausibly, memory during search serves to prevent perseveration on single salient items [82,84].

What about memory for completed searches? If you find a target once, are you more efficient when you search for it again? A body of work on ‘repeated search’ finds that search efficiency does not improve even over hundreds of trials of repetition [85,86]. By contrast, observers can remember objects that have been seen during search [87] and implicit memory for the arbitrary layout of displays can speed their response [88].

How can all of these facts be true? Of course, observers remember some results of search. (Where did I find those scissors last time?) The degree to which these memories aid subsequent search depends on whether it is faster to retrieve the relevant memory or to repeat the visual search. In many simple tasks (e.g. with arrays of letters [86]), memory access is slower than is visual search [85]. In many more commonplace searches (those scissors), memory will serve to speed the search.

A set of basic stimulus attributes guides search

Object recognition might require attention to an object [24], but not every search requires individual scrutiny of random items before the target is attended. For example, in Figure 1, it is trivial to find the one tilted ‘T’. Orientation is one of the basic attributes that can guide the deployment of attention. A limited set of attributes can be used to reduce the number of possible target items in a display. If you are looking for the big, red, moving vertical line, you can guide your attention toward the target size, color, motion and orientation. We label the idea of guidance by a limited set of basic attributes as ‘classic guided search’ [25]. The set of basic attributes is not perfectly defined but there are probably between one and two dozen [26]. In the search for the green-and-purple Ts of Figure 1, guidance fails. Ts and Ls both contain a vertical and a horizontal line, so orientation information is not useful. The nature of the T or L intersection is also not helpful [27]; neither can guidance help by narrowing the search to the items that are both green and purple. When you specify two features (here two colors) of the same attribute, attention is guided to the set of items that contain either purple or green. In Figure 1, this is the set of all items [28] so no useful guidance is possible. The internal representation of guiding attributes is different from the perceptual representation of the same attributes. What you see is not necessarily what guides your search. Consider color as an example.
An item of unique color ‘pops out’. You would have no problem finding the one red thing among yellow things [29]. The red thing looks salient and it attracts attention. It is natural to assume that the ability to guide attention is basically the same as the perceived salience of the item [30,31]. However, look for the desaturated, pale targets in Figure 2 (there are two in each panel). In each case, the target lies halfway between the saturated and white distractors in a perceptual color space. In the lab, although not in this figure, the colors can be precisely controlled so that the perceived difference between red and pale red is the same as the difference between pale green and green or pale blue and blue. Nevertheless, the desaturated red target will be found more quickly [32], a clear dissociation between guidance and perception. Similar effects occur for other guiding attributes, such as orientation [33]. The representation guiding attention should be seen as a control device, managing access to the binding and recognition bottleneck. It does not reveal itself directly in conscious perception.

Visual search in natural(istic) scenes

The failure of classic guided search

To this point, we have described what could be called ‘classic guided search’ [1,25]. Now, suppose that we wanted to apply this classic guided search theory to the real world. Find the bread in Figure 3a. Guided search, and similar models, would say that the one to two dozen guiding attributes define a high-dimensional space in which objects would be quite sparsely represented. That is, ‘bread’ would be defined by some set of features [21]. If attention were guided to objects lying in the portion of the high-dimensional feature space specified by those features, few other objects would be found in the neighborhood [34]. Using a picture of the actual bread would produce better guidance than its abstract label (‘bread’) because more features of the specific target would be precisely described [35]. So in the real world, attention would be efficiently guided to the few bread-like objects. Guidance would reduce the ‘functional set size’ [36]. It is a good story, but it is wrong or, at least, incomplete. The story should be just as applicable to search for the loaf of bread in Figure 3b; maybe more applicable as these objects are clearly defined on a blank background.
However, searches for isolated objects are inefficient [37], whereas searches such as the kitchen search are efficient (given some estimate of ‘set size’ in real scenes) [38]. Models such as guided search, based on bottom-up and top-down processing of a set of ‘preattentive’ attributes, seem to fail when it comes to explaining the apparent efficiency of search in the real world. Guiding attributes do some work [21,39], but not enough.
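To make the ideas of guidance and ‘functional set size’ concrete, here is a toy sketch (not the Guided Search implementation itself): each display item gets a priority that mixes top-down similarity to the target template with bottom-up distinctiveness, and the functional set size is the number of items whose priority clears a threshold. The feature space, weights and threshold are all arbitrary assumptions for illustration.

```python
import math

def priority(items, target, w_top=0.7, w_bottom=0.3):
    """Toy guidance map: each item's priority mixes top-down similarity
    to the target template with bottom-up distinctiveness (mean feature
    distance to the other items). Weights are arbitrary assumptions."""
    scores = []
    for i, it in enumerate(items):
        top_down = 1.0 / (1.0 + math.dist(it, target))
        others = [math.dist(it, o) for j, o in enumerate(items) if j != i]
        bottom_up = sum(others) / len(others)
        scores.append(w_top * top_down + w_bottom * bottom_up)
    return scores

def functional_set_size(items, target, threshold=0.5):
    """Number of items salient or target-like enough to compete for
    selection; the threshold is an illustrative assumption."""
    return sum(s >= threshold for s in priority(items, target))
```

With items represented as (hue, orientation) pairs, a unique item far from homogeneous distractors in feature space ends up with the highest priority, which is the signature of pop-out; when target and distractors share guiding features, priorities converge and the functional set size approaches the full display size.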
Figure 2. Find the desaturated color dots. Colors are only an approximation of the colors that would be used in a carefully calibrated experiment. The empirical result is that it is easier to find the pale-red (pink) targets than to find the pale-green or -blue targets.
Trends in Cognitive Sciences February 2011, Vol. 15, No. 2
Figure 3. Find the loaf of bread in each of (a) and (b).
The way forward: expanding the concept of guidance for search in scenes

Part of the answer is that real scenes are complex, but never random. Elements are arranged in a rule-governed manner: people generally appear on horizontal surfaces [40,41], chimneys appear on roofs [42] and pots on stoves [43]. Those and other regularities of scenes can provide scene-based guidance. Borrowing from the memory literature, we refer to 'semantic' and 'episodic' guidance. Semantic guidance includes knowledge of the probability of the presence of an object in a scene [43] and of its probable location in that scene given the layout of the space [40,44], as well as interobject relations (e.g. knives tend to be near forks [45]). Violations of these expectations impede object recognition [46] and increase allocation of attention [43]. It can take longer to find a target that is semantically misplaced (e.g. searching for the bread in the sink [47]). Episodic guidance, which we will merely mention here, refers to memory for a specific, previously encountered scene that comprises information about specific locations of specific objects [48]. Having looked several times, you know that the bread is on the counter to the left, not in all scenes, but in this one. The role of memory in search is complex (Box 2), but it is the case that you will be faster, on average, to find bread in your own kitchen than in someone else's. When searching for objects in scenes, classic sources of guidance combine with episodic and semantic sources of guidance to direct your attention efficiently to those parts of the scene that have the highest probability of containing targets [40,49–51]. In naturalistic scenes, guidance of eye movements by bottom-up salience seems to play a minor role compared with guidance by more knowledge-based factors [51,52].
A short glimpse of a scene is sufficient to narrow down the search space and efficiently guide gaze [53], as long as enough time is available to apply semantic knowledge to the initial scene representation [44]. However, semantic guidance cannot be too generic. Presenting a word prime (e.g. 'kitchen') instead of a preview of the scene does not produce much guidance [35]. Rather, the combination of semantic scene knowledge (kitchens) with information about the structure of the specific scene (this kitchen) seems to be crucial for effective guidance of search in real-world scenes [44,51].

A problem: where is information about the scene coming from?

It seems reasonable to propose that semantic and episodic information about a scene guides search for objects in the scene, but where does that information come from? For scene information to guide attention to probable locations of 'bread' in Figure 3a, you must know that the figure shows something like a kitchen. One might propose that the information about the scene develops as object after object is identified. A 'kitchen' hypothesis might emerge quickly if you were lucky enough to attend first to the microwave and then to the stove, but if you were less fortunate and attended to a lamp and a window, your kitchen hypothesis might come too late to be useful.

A nonselective pathway to gist processing

Fortunately, there is another route to semantic scene information. Humans are able to categorize a scene as a forest without selecting individual trees for recognition [54]. A single, brief fixation on the kitchen of Figure 3a would be enough to get the 'gist' of that scene. 'Gist' is an imperfectly defined term but, in this context, it includes the basic-level category of the scene, an estimate of the distributions of basic attributes, such as color and texture [55], and the spatial layout [54,56–58]. These statistical and structural cues allow brief exposures to support above-chance categorization of scenes into, for example, natural or urban [54,59,60] or containing an animal [15,61]. Within a single fixation, an observer would know that Figure 3a was a kitchen without the need to segment and identify its component objects. At 20–50 objects/second, that observer will have collected a few object identities as well but, on average, these would not be sufficient to produce categorization [54,62]. How is this possible?
The answer appears to be a two-pathway architecture somewhat different from, but perhaps related to, previous two-pathway proposals [63,64], and somewhat different from classic two-stage, preattentive–attentive models (Box 3). The basic idea is cartooned in Figure 4. Visual input feeds a capacity-limited 'selective pathway'. As described earlier, selection into the bottleneck is mediated by classic guidance and, when possible, by semantic and episodic guidance. In this two-pathway view, the raw material for semantic guidance could be generated in a nonselective pathway that is not subject to the same capacity limits. Episodic guidance would be based on the results of selective and nonselective processing.

What is a 'nonselective pathway'?

It is important not to invest a nonselective pathway with too many capabilities. If all processing could be done without selection and fewer capacity limits, one would not need a selective pathway. Global nonselective image processing allows observers to extract statistical information rapidly from the entire image. Observers can assess the mean and distribution of a variety of basic visual feature dimensions: size [65], orientation [66], some contrast texture descriptors [67], velocity and direction of motion [68], magnitude estimation [69], center of mass for a set of objects [70] and center of area [71]. Furthermore, summary statistics can be calculated for more complex attributes, such as emotion [72] or the presence of classes of objects (e.g. animal) in a scene [73]. Using these image statistics, models and (presumably) humans can categorize scenes [54,56,57] and extract basic
Box 3. Old and new dichotomies in theories of visual search

The dichotomy between selective and nonselective pathways, proposed here, is part of a long tradition of proposing dichotomies between processes with strong capacity limits that restrict their work to one or a few objects or locations and processes that are able to operate across the entire image. It is worth briefly noting the similarities and differences with some earlier formulations.

Preattentive and attentive processing

Preattentive processing is parallel processing over the entire image. Similar to nonselective processing, it is limited in its capabilities. In older formulations such as Feature Integration Theory [2], it handled only basic features, such as color and orientation, but it could be expanded to include the gist and statistical-processing abilities of a nonselective pathway. The crucial difference is embodied in the term 'preattentive'. In its usual sense, preattentive processing refers to processing that occurs before the arrival in time or space of attentive processing [89]. Nonselective processing, by contrast, is proposed to occur in parallel with selective processing, with the outputs of both giving rise to visual experience.

Early and late selection

The nonselective pathway could be seen as a form of late selection, in which processing proceeds to an advanced state before any bottleneck [90]. The selective pathway embodies early selection, with only minimal processing before the bottleneck. Traditionally, these have been seen as competing alternatives; here, they coexist. However, traditional late selection would permit object recognition (e.g. word recognition) before a bottleneck. The nonselective pathway, although able to extract some semantic information from scenes, is not proposed to have the ability to recognize either objects or letters.
[Figure 4 diagram: visual input feeds two routes. A nonselective pathway supplies semantic and episodic information; a selective pathway runs from early vision through features (color, orientation, size, depth, motion, etc.) and guidance to a bottleneck, then binding and recognition.]
Figure 4. A two-pathway architecture for visual processing. A selective pathway can bind features and recognize objects, but it is capacity limited. The limit is shown as a ‘bottleneck’ in the pathway. Access to the bottleneck is controlled by guidance mechanisms that allow items that are more likely to be targets preferential access to feature binding and object recognition. Classic guidance, cartooned in the box above the bottleneck, gives preference to items with basic target features (e.g. color). This article posits scene guidance (semantic and episodic), with semantic guidance derived from a nonselective pathway. This nonselective pathway can extract statistics from the entire scene, enabling a certain amount of semantic processing, but not precise object recognition.
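The caption's division of labor can be caricatured computationally. In the sketch below, which is our illustration rather than a model from the article, a nonselective routine reduces the scene to global statistics that yield a gist and a prior over regions, and a capacity-limited selective routine then examines regions one at a time. Every function name, category, region and number is invented for the example.

```python
# Caricature of the two-pathway idea: nonselective global statistics
# propose a scene gist and a prior over regions; the capacity-limited
# selective pathway then examines one region at a time. All values
# and categories are invented for illustration.
from statistics import mean

def nonselective_gist(region_stats):
    """Guess a scene category from global feature statistics alone."""
    # e.g. a warm average hue suggests an indoor 'kitchen' scene here.
    warm = mean(r["hue"] for r in region_stats.values()) < 0.5
    return "kitchen" if warm else "forest"

def scene_prior(gist):
    """Prior over regions for a target like 'bread', given the gist."""
    if gist == "kitchen":
        return {"counter": 0.7, "floor": 0.1, "ceiling": 0.2}
    return {r: 1 / 3 for r in ("counter", "floor", "ceiling")}

def selective_search(region_stats, contents, target):
    """Visit regions in order of prior; 'recognize' one at a time."""
    gist = nonselective_gist(region_stats)
    prior = scene_prior(gist)
    visited = []
    for region in sorted(prior, key=prior.get, reverse=True):
        visited.append(region)           # one fixation
        if contents[region] == target:   # binding/recognition step
            return region, visited
    return None, visited

stats = {"counter": {"hue": 0.3}, "floor": {"hue": 0.4},
         "ceiling": {"hue": 0.5}}
contents = {"counter": "bread", "floor": "cat", "ceiling": "lamp"}
found, visited = selective_search(stats, contents, "bread")
print(found, len(visited))
```

The design point is that the nonselective statistics never identify the bread; they only reorder the selective pathway's fixations so that likely regions are examined first, which is how the article proposes search in scenes becomes efficient.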
Figure 5. What do you see? How does that change when you are asked to look for an untilted bird or trees with brown trunks and green boughs? It is proposed that a nonselective pathway would ‘see’ image statistics, such as average color or orientation, in a region. It could get the ‘gist’ of forest and, perhaps, the presence of animals. However, it would not know which trees had brown trunks or which birds were tilted.
spatial structure [54,59]. This nonselective information could then provide the basis for scene-based guidance of search. Thus, nonselective categorical information, perhaps combined with the identification of an object or two by the selective pathway, could strongly and rapidly suggest that Figure 3a depicts a kitchen. Nonselective structural information could give the rough layout of surfaces in the space. In principle, these sources of information could be used to direct the resources of the selective pathway intelligently so that attention and the eyes can be deployed to probable locations of bread. Your conscious experience of the visual world comprises the products of both pathways. Returning to the example at the outset of this article, when you have not yet found the object that is 'right in front of your eyes', your visual experience at that location must be derived primarily from the nonselective pathway. You cannot choose to see a nonselective representation in isolation, but you can gain some insight into the contributions of the two pathways from Figure 5. The nonselective pathway would 'see' the forest [54] and could provide some information about the flock of odd birds moving through it. However, identification of a tree with both green and brown boughs or of a bird heading to the right would require the work of the selective pathway [61]. Expert searchers, such as radiologists hunting for signs of cancer or airport security officers searching for threats, might have learned to make specific use of nonselective signals. With some regularity, such experts will tell you
that they sometimes sense the presence of a target before finding it. Indeed, this 'Gestalt process' is a component of a leading theory of search in radiology [74]. Doctors and technicians screening for cancer can detect abnormal cases at above-chance levels in a single fixation [75]. The abilities of a nonselective pathway might underpin this experience. Understanding how nonselective processing guides capacity-limited visual search could lead to improvements in search tasks that are, literally, a matter of life and death.

Concluding remarks

What is next in the study of search in scenes? It is still not understood how scenes are divided up into searchable objects or proto-objects [76]. There is much work to be done to describe fully the capabilities of nonselective processing and even more to document its impact on selective processes. Finally, we would like to know if there is a neurophysiological reality to the two pathways proposed here. Suppose one 'lesioned' the hypothetical selective pathway. The result might be an agnosic who could see something throughout the visual field but could not identify objects. A lesion of the nonselective pathway might produce a simultanagnosic or Bálint's patient, able to identify the current object of attention but otherwise unable to see. This sounds similar to the consequences of lesioning the ventral and dorsal streams, respectively [64], but more research will be required before 'selective' and 'nonselective' can be properly related to 'what' and 'where'.
Acknowledgments

This work was supported by NIH EY017001 and ONR MURI N000141010278 to J.M.W. K.K.E. was supported by NIH/NEI 1F32EY019819-01, M.R.G. by NIH/NEI F32EY019815-01 and M.L.-H.V. by DFG 1683/1-1.
References

1 Wolfe, J.M. (1994) Guided Search 2.0: a revised model of visual search. Psychon. Bull. Rev. 1, 202–238 2 Treisman, A.M. and Gelade, G. (1980) A feature-integration theory of attention. Cognit. Psychol. 12, 97–136 3 Treisman, A. (1996) The binding problem. Curr. Opin. Neurobiol. 6, 171–178 4 Müller-Plath, G. and Elsner, K. (2007) Space-based and object-based capacity limitations in visual search. Vis. Cogn. 15, 599–634 5 Dosher, B.A. et al. (2010) Information-limited parallel processing in difficult heterogeneous covert visual search. J. Exp. Psychol. Hum. Percept. Perform. 36, 1128–11128 6 Pelli, D.G. and Tillman, K.A. (2008) The uncrowded window of object recognition. Nat. Neurosci. 11, 1129–1135 7 Balas, B. et al. (2009) A summary-statistic representation in peripheral vision explains visual crowding. J. Vis. 9, 1–18 8 Wolfe, J.M. and Bennett, S.C. (1997) Preattentive object files: shapeless bundles of basic features. Vision Res. 37, 25–43 9 Wolfe, J.M. (2003) Moving towards solutions to some enduring controversies in visual search. Trends Cogn. Sci. 7, 70–76 10 Eckstein, M.P. (1998) The lower visual search efficiency for conjunctions is due to noise and not serial attentional processing. Psychol. Sci. 9, 111–1111 11 Verghese, P. (2001) Visual search and attention: a signal detection theory approach. Neuron 31, 523–535 12 Townsend, J.T. and Wenger, M.J. (2004) The serial–parallel dilemma: a case study in a linkage of theory and method. Psychon. Bull. Rev. 11, 391–1391 13 Horowitz, T.S. (2006) Revisiting the variable memory model of visual search. Vis. Cogn. 14, 668–684 14 Theeuwes, J. et al. (2004) A new estimation of the duration of attentional dwell time. Psychon. Bull. Rev. 11, 60–160 15 Kirchner, H. and Thorpe, S.J. (2006) Ultra-rapid object detection with saccadic eye movements: visual processing speed revisited. Vision Res. 46, 1762–1776 16 Moore, C.M. and Wolfe, J.M.
(2001) Getting beyond the serial/parallel debate in visual search: a hybrid approach. In The Limits of Attention: Temporal Constraints on Human Information Processing (Shapiro, K., ed.), pp. 178–198, Oxford University Press 17 Thornton, T.L. and Gilden, D.L. (2007) Parallel and serial processes in visual search. Psychol. Rev. 114, 71–103 18 Fazl, A. et al. (2009) View-invariant object category learning, recognition, and search: how spatial and object attention are coordinated using surface-based attentional shrouds. Cognit. Psychol. 58, 1–48 19 Renninger, L.W. et al. (2007) Where to look next? Eye movements reduce local uncertainty. J. Vis. 7, 1–17 20 Geisler, W.S. et al. (2006) Visual search: the role of peripheral information measured using gaze-contingent displays. J. Vis. 6, 858–873 21 Zelinsky, G.J. (2008) A theory of eye movements during target acquisition. Psychol. Rev. 115, 787–1787 22 Zelinsky, G.J. and Sheinberg, D.L. (1997) Eye movements during parallel-serial visual search. J. Exp. Psychol. Hum. Percept. Perform. 23, 244–1244 23 Kowler, E. et al. (1995) The role of attention in the programming of saccades. Vision Res. 35, 1897–1916 24 Huang, L. (2010) What is the unit of visual attention? Object for selection, but Boolean map for access. J. Exp. Psychol. Gen. 139, 162–179 25 Wolfe, J.M. (2007) Guided Search 4.0: current progress with a model of visual search. In Integrated Models of Cognitive Systems (Gray, W., ed.), pp. 99–119, Oxford University Press 26 Wolfe, J.M. and Horowitz, T.S. (2004) What attributes guide the deployment of visual attention and how do they do it? Nat. Rev. Neurosci. 5, 495–501
27 Wolfe, J.M. and DiMase, J.S. (2003) Do intersections serve as basic features in visual search? Perception 32, 645–656 28 Wolfe, J.M. et al. (1990) Limitations on the parallel guidance of visual search: color × color and orientation × orientation conjunctions. J. Exp. Psychol. Hum. Percept. Perform. 16, 879–892 29 Treisman, A. and Gormican, S. (1988) Feature analysis in early vision: evidence from search asymmetries. Psychol. Rev. 95, 15–48 30 Parkhurst, D. et al. (2002) Modeling the role of salience in the allocation of overt visual attention. Vision Res. 42, 107–123 31 Itti, L. et al. (2002) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Machine Intell. 20, 1254–1259 32 Lindsey, D.T. et al. (2010) Color channels, not color appearance or color categories, guide visual search for desaturated color targets. Psychol. Sci. 21, 1208–11208 33 Wolfe, J.M. et al. (1992) The role of categorization in visual search for orientation. J. Exp. Psychol. Hum. Percept. Perform. 18, 34–49 34 DiCarlo, J.J. and Cox, D.D. (2007) Untangling invariant object recognition. Trends Cogn. Sci. 11, 333–341 35 Castelhano, M.S. and Heaven, C. (2010) The relative contribution of scene context and target features to visual search in scenes. Atten. Percept. Psychophys. 72, 1283–1297 36 Neider, M.B. and Zelinsky, G.J. (2008) Exploring set size effects in scenes: identifying the objects of search. Vis. Cogn. 16, 1–10 37 Vickery, T.J. et al. (2005) Setting up the target template in visual search. J. Vis. 5, 81–92 38 Wolfe, J. et al. (2008) Search for arbitrary objects in natural scenes is remarkably efficient. J. Vis. 8, 1103–11103 39 Kanan, C. et al. (2009) SUN: top-down saliency using natural statistics. Vis. Cogn. 17, 979–1003 40 Torralba, A. et al. (2006) Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychol. Rev.
113, 766–786 41 Droll, J. and Eckstein, M. (2008) Expected object position of two hundred fifty observers predicts first fixations of seventy seven separate observers during search. J. Vis. 8, 320–1320 42 Eckstein, M.P. et al. (2006) Attentional cues in real scenes, saccadic targeting, and Bayesian priors. Psychol. Sci. 17, 973–980 43 Võ, M.L.-H. and Henderson, J.M. (2009) Does gravity matter? Effects of semantic and syntactic inconsistencies on the allocation of attention during scene perception. J. Vis. 9, 1–15 44 Võ, M.L.-H. and Henderson, J.M. (2010) The time course of initial scene processing for eye movement guidance in natural scene search. J. Vis. 10, 1–13 45 Bar, M. (2004) Visual objects in context. Nat. Rev. Neurosci. 5, 617–629 46 Biederman, I. et al. (1982) Scene perception: detecting and judging objects undergoing relational violations. Cognit. Psychol. 14, 143–177 47 Malcolm, G.L. and Henderson, J.M. (2009) Combining top-down processes to guide eye movements during real-world scene search. J. Vis. 10, 1–11 48 Hollingworth, A. (2006) Scene and position specificity in visual memory for objects. J. Exp. Psychol. Learn. Mem. Cogn. 32, 58–69 49 Neider, M.B. and Zelinsky, G.J. (2006) Scene context guides eye movements during visual search. Vision Res. 46, 614–621 50 Ehinger, K.A. et al. (2009) Modelling search for people in 900 scenes: a combined source model of eye guidance. Vis. Cogn. 17, 945–978 51 Henderson, J.M. et al. (2009) Searching in the dark: cognitive relevance drives attention in real-world scenes. Psychon. Bull. Rev. 16, 850–856 52 Henderson, J.M. (2007) Regarding scenes. Curr. Dir. Psychol. Sci. 16, 219–222 53 Castelhano, M.S. and Henderson, J.M. (2007) Initial scene representations facilitate eye movement guidance in visual search. J. Exp. Psychol. Hum. Percept. Perform. 33, 753–763 54 Greene, M.R. and Oliva, A. (2009) Recognition of natural scenes from global properties: seeing the forest without representing the trees. Cognit. Psychol.
58, 137–176 55 Rousselet, G. et al. (2005) How long to get to the ‘gist’ of real-world natural scenes? Vis. Cogn. 12, 852–877 56 Sanocki, T. (2003) Representation and perception of scenic layout. Cognit. Psychol. 47, 43–86 57 Oliva, A. and Torralba, A. (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175
Review 58 Biederman, I. et al. (1974) On the information extracted from a glance at a scene. J. Exp. Psychol. Hum. Percept. Perform. 103, 597–600 59 Greene, M.R. and Oliva, A. (2009) The briefest of glances: the time course of natural scene understanding. Psychol. Sci. 20, 464–472 60 Joubert, O.R. et al. (2007) Processing scene context: fast categorization and object interference. Vision Res. 47, 3286–3297 61 Evans, K.K. and Treisman, A. (2005) Perception of objects in natural scenes: is it really attention free? J. Exp. Psychol. Hum. Percept. Perform. 31, 1476–1492 62 Joubert, O.R. et al. (2008) Early interference of context congruence on object processing in rapid visual categorization of natural scenes. J. Vis. 8, 11–111 63 Held, R. (1970) Two modes of processing spatially distributed visual stimulation. In The Neurosciences: Second Study Program (Schmitt, F.O., ed.), pp. 317–324, MIT Press 64 Ungerleider, L.G. and Mishkin, M. (1982) Two cortical visual systems. In Analysis of Visual Behavior (Ingle, D.J. et al., eds), pp. 549–586, MIT Press 65 Chong, S.C. and Treisman, A. (2003) Representation of statistical properties. Vision Res. 43, 393–404 66 Parkes, L. et al. (2001) Compulsory averaging of crowded orientation signals in human vision. Nat. Neurosci. 4, 739–744 67 Chubb, C. et al. (2007) The three dimensions of human visual sensitivity to first-order contrast statistics. Vision Res. 47, 2237–2248 68 Williams, D.W. and Sekuler, R. (1984) Coherent global motion percepts from stochastic local motions. Vision Res. 24, 55–62 69 Demeyere, N. et al. (2008) Automatic statistical processing of visual properties in simultanagnosia. Neuropsychologia 46, 2861–2864 70 Alvarez, G.A. and Oliva, A. (2008) The representation of simple ensemble visual features outside the focus of attention. Psychol. Sci. 19, 392–398 71 Melcher, D. and Kowler, E. (1999) Shapes, surfaces and saccades. Vision Res. 39, 2929–2946 72 Haberman, J. and Whitney, D. 
(2007) Rapid extraction of mean emotion and gender from sets of faces. Curr. Biol. 17, R751–R753 73 Vanrullen, R. (2009) Binding hardwired versus on-demand feature conjunctions. Vis. Cogn. 17, 103–119
74 Krupinski, E.A. (2010) Current perspectives in medical image perception. Atten. Percept. Psychophys. 72, 1205–1217 75 Kundel, H.L. and Nodine, C.F. (1975) Interpreting chest radiographs without visual search. Radiology 116, 527–532 76 Rensink, R.A. (2000) The dynamic representation of scenes. Vis. Cogn. 7, 17–42 77 Chelazzi, L. et al. (1993) A neural basis for visual search in inferior temporal cortex. Nature 363, 345–347 78 Buschman, T.J. and Miller, E.K. (2009) Serial, covert shifts of attention during visual search are reflected by the frontal eye fields and correlated with population oscillations. Neuron 63, 386–396 79 Woodman, G.F. and Luck, S.J. (2003) Serial deployment of attention during visual search. J. Exp. Psychol. Hum. Percept. Perform. 29, 121–138 80 Bichot, N.P. et al. (2005) Parallel and serial neural mechanisms for visual search in macaque area V4. Science 308, 529–1529 81 Takeda, Y. and Yagi, A. (2000) Inhibitory tagging in visual search can be found if search stimuli remain visible. Percept. Psychophys. 62, 927–934 82 Klein, R. (2009) On the control of attention. Can. J. Exp. Psychol. 63, 240–252 83 Horowitz, T.S. and Wolfe, J.M. (1998) Visual search has no memory. Nature 394, 575–577 84 Klein, R.M. and MacInnes, W.J. (1999) Inhibition of return is a foraging facilitator in visual search. Psychol. Sci. 10, 346–1346 85 Kunar, M.A. et al. (2008) The role of memory and restricted context in repeated visual search. Percept. Psychophys. 70, 314–328 86 Wolfe, J.M. et al. (2000) Postattentive vision. J. Exp. Psychol. Hum. Percept. Perform. 26, 693–716 87 Hollingworth, A. and Henderson, J.M. (2002) Accurate visual memory for previously attended objects in natural scenes. J. Exp. Psychol. Hum. Percept. Perform. 28, 113–136 88 Jiang, Y. and Wagner, L.C. (2004) What is learned in spatial contextual cuing: configuration or individual locations? Percept. Psychophys. 66, 454–463 89 Neisser, U.
(1967) Cognitive psychology, Appleton-Century-Crofts 90 Deutsch, J.A. and Deutsch, D. (1963) Attention: some theoretical considerations. Psychol. Rev. 70, 80–90
Review
Emotional processing in anterior cingulate and medial prefrontal cortex

Amit Etkin1,2, Tobias Egner3 and Raffael Kalisch4

1 Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
2 Sierra Pacific Mental Illness Research, Education, and Clinical Center (MIRECC) at the Veterans Affairs Palo Alto Health Care System, Palo Alto, CA, USA
3 Department of Psychology & Neuroscience and Center for Cognitive Neuroscience, Duke University, Durham, NC, USA
4 Institute for Systems Neuroscience and NeuroImage Nord, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
Negative emotional stimuli activate a broad network of brain regions, including the medial prefrontal (mPFC) and anterior cingulate (ACC) cortices. An early influential view dichotomized these regions into dorsal–caudal cognitive and ventral–rostral affective subdivisions. In this review, we examine a wealth of recent research on negative emotions in animals and humans, using the example of fear or anxiety, and conclude that, contrary to the traditional dichotomy, both subdivisions make key contributions to emotional processing. Specifically, dorsal–caudal regions of the ACC and mPFC are involved in appraisal and expression of negative emotion, whereas ventral–rostral portions of the ACC and mPFC have a regulatory role with respect to limbic regions involved in generating emotional responses. Moreover, this new framework is broadly consistent with emerging data on other negative and positive emotions.

Controversies about anterior cingulate and medial prefrontal functions

Although the medial walls of the frontal lobes, comprising the anterior cingulate cortex (ACC) and the medial prefrontal cortex (mPFC), have long been thought to play a critical role in emotional processing [1], it remains uncertain what exactly their functional contributions might be. Some investigators have described evaluative (appraisal) functions of the ACC and mPFC, such as representation of the value of stimuli or actions [2–4] and the monitoring of somatic states [5]. Others hold that the ACC is primarily a generator of physiological or behavioral responses [6,7]. Still others have described a regulatory role for these regions, such as in the top-down modulation of limbic and endocrine systems for the purpose of emotion regulation [3,8–11]. An additional source of uncertainty lies in the way in which any one of these proposed functions might map onto distinct subregions of the ACC or mPFC (Box 1).
Undoubtedly the most influential functional parcellation of this type has been the proposal that there exists a principal dichotomy between caudal–dorsal midline regions that serve a variety of cognitive functions and rostral–ventral midline regions that are involved in some form of emotional processing [12].

Corresponding author: Etkin, A. ([email protected]).

However, even this
broadly and long-held view of basic functional specialization in these regions has been shaken by considerable evidence over the past decade indicating that many types of emotional processes reliably recruit caudal–dorsal ACC and mPFC regions [13,14]. Here, we review recent human neuroimaging, animal electrophysiology, and human and animal lesion studies that have produced a wealth of data on the role of the ACC and mPFC in the processing of anxiety and fear. We chose to focus primarily on the negative emotions of anxiety and fear because they are by far the most experimentally tractable and most heavily studied, and they afford the closest link between animal and human data. We subsequently briefly examine whether a conceptual framework derived from fear and anxiety can be generalized to other emotions. Given the complexity [15] and multidimensional nature [16] of emotional responses, we address the specific functions or processes that constitute an emotional reaction, regardless of whether they are classically seen as emotional (e.g. a withdrawal response or a feeling) or cognitive

Glossary

Appraisal: evaluation of the meaning of an internal or external stimulus to the organism. Only stimuli that are appraised as motivationally significant will induce an emotional reaction, and the magnitude, duration and quality of the emotional reaction are a direct result of the appraisal process. Moreover, appraisal can be automatic and focus on basic affective stimulus dimensions such as novelty, valence or value, or expectation discrepancy, or may be slower and sometimes even require controlled conscious processing, which permits a more sophisticated context-dependent analysis.

Fear conditioning: learning paradigm in which a previously neutral stimulus, termed the conditioned stimulus (CS), is temporally paired with a non-learned aversive stimulus, termed the unconditioned stimulus (US). After pairing, the CS predicts the US and hence elicits a conditioned response (CR).
For example, pairing of a tone with a foot shock results in elicitation of fear behavior during subsequent responses to a non-paired tone.

Extinction: learning process created by repeatedly presenting a CS without pairing it with a US (i.e. teaching the animal that the CS no longer predicts the US) after fear conditioning has been established. This results in formation of an extinction memory, which inhibits expression of, but does not erase, the original fear memory.

Reappraisal: specific method for explicit emotion regulation whereby a conscious, deliberate effort is engaged to alter the meaning (appraisal) of an emotional stimulus. For example, a picture of a woman crying can be reappraised from a negative meaning to a positive one by favoring an interpretation that she is crying tears of joy.

Regulation: general process by which conflicting appraisals and response tendencies are arbitrated to allow selection of a course of action. Typically, regulation is thought to have an element of inhibition and/or enhancement for managing competing appraisals and response tendencies.
1364-6613/$ – see front matter. Published by Elsevier Ltd. doi:10.1016/j.tics.2010.11.004 Trends in Cognitive Sciences, February 2011, Vol. 15, No. 2
Box 1. Anatomy of the ACC and mPFC

Within the ACC, a subdivision can be made between a more ventral portion, comprising areas 24a, 24b, 24c, 25, 32 and 33 [pregenual (pgACC) and subgenual ACC (sgACC) in Figure I] and a more dorsal portion, comprising areas 24a′, 24b′, 24c′, 24d, 32′ and 33 [dorsal ACC (dACC) in Figure I]. This distinction is consistent with that of Vogt and colleagues between an anterior and a midcingulate cortex [63]. In the dACC, a further distinction exists between anterior and posterior portions (adACC and pdACC), similar to the partitioning of the midcingulate into anterior and posterior portions by Vogt et al. [64] and consistent with the partitioning between rostral and caudal cingulate zones [65]. These subdivisions are also reflected in patterns of connectivity. Connectivity with core emotion-processing regions such as the amygdala, PAG and hypothalamus is strong throughout the sgACC, pgACC and adACC, but very limited in the pdACC [46,66–70]. In general, cingulo–amygdalar connectivity is focused on the basolateral complex of the amygdala. ACC subregions can also be distinguished based on connectivity with premotor and lateral prefrontal cortices, which are heaviest in the pdACC and adACC [67,71]. In summary, the pattern of anatomical connectivity supports an important role for the sgACC, pgACC and
adACC in interacting with the limbic system, including its effector regions, and for the adACC and pdACC in communicating with other dorsal and lateral frontal areas that are important for top-down forms of regulation [72]. Like the ACC (Figure I), the mPFC can be divided into several functionally distinct subregions, although borders between these subregions are generally less clear, and differential anatomical connectivity is less well described. Amygdalar, hypothalamic and PAG connectivity with mPFC subregions is considerably lighter than the connectivity of adjacent ACC subregions, with the strongest connections observed for the ventromedial (vmPFC) and dorsomedial PFC (dmPFC) [46,68–70]. Much like the nearby ACC subregions, the supplementary motor area (SMA) is heavily interconnected with primary motor cortex and is the origin for direct corticospinal projections [65,73]. The pre-SMA, by contrast, is connected with lateral prefrontal cortices, but not with primary motor cortex [65,73]. Premotor and lateral prefrontal connections are also present, albeit to a lesser degree, in the dmPFC [71]. Thus, the patterns of connectivity are similar between abutting ACC and mPFC subregions, with the difference being primarily in the density of limbic connectivity, which is substantially greater in the ACC.
Figure I. Parcellation of ACC and mPFC subregions. Abbreviations: sg, subgenual; pg, pregenual; vm, ventromedial; rm, rostromedial; dm, dorsomedial; ad, anterior dorsal; pd, posterior dorsal.
(e.g. attentional focusing on a relevant stimulus). We also distinguish between processes involved in emotional stimulus appraisal and consequential response expression [17] and those involved in emotion regulation. Regulation occurs when stimuli induce conflicting appraisals and hence incompatible response tendencies, or when goal-directed activity requires suppression of interference from a single, emotionally salient, task-irrelevant stimulus source. We found that an appraisal or expression versus regulation contrast provides a robust framework for understanding ACC and mPFC function in negative emotion.

Fear conditioning and extinction in humans
The paradigms used in the acquisition and extinction of learned fear are particularly valuable for isolating the
neural substrates of fear processing because the anticipatory fear or anxiety triggered by the previously neutral conditioned stimulus (CS) can be dissociated from the reaction to the aversive unconditioned stimulus (US) per se. This is not possible in studies that, for example, use aversive images to evoke emotional responses. Furthermore, comparison between fear conditioning and fear extinction facilitates an initial coarse distinction between regions associated with either the appraisal of fear-relevant stimuli and generation of fear responses (fear conditioning), or the inhibitory regulation of these processes (extinction). Several recent quantitative meta-analyses of human neuroimaging studies examined activations associated with fear CS presentation compared to a control CS never paired with the US [13,14,18]. In Figure 1a we present
Figure 1. Activation foci associated with fear and its regulation. Predominantly dorsal ACC and mPFC activations are observed during classical (Pavlovian) fear conditioning (a), as well as during instructed fear paradigms, which circumvent fear learning (b). Likewise, sympathetic nervous system activity correlates positively primarily with dorsal ACC and mPFC regions and negatively primarily with ventral ACC and mPFC regions, which supports a role for the dorsal ACC and mPFC in fear expression (c). During within-session extinction, activation is observed in both the dorsal and ventral ACC and mPFC (d), whereas during subsequent delayed recall and expression of the extinction memory, when the imaging data are less confounded by residual expression of fear responses, activation is primarily in the ventral ACC and mPFC (e). Information on the studies selected for this and all following peak voxel plots can be found in the online supplemental material.
plots of the location of each activation peak reported in the ACC or mPFC in the relevant fear conditioning studies, collapsing across left and right hemispheres. It is readily apparent that activations in fear conditioning studies are not evenly distributed throughout the ACC and mPFC, but rather are clustered heavily within the dorsal ACC (dACC), dorsomedial PFC (dmPFC), supplementary motor area (SMA) and pre-SMA. These activations, however, might reflect a variety of different processes that occur simultaneously or in rapid temporal succession, for example CS appraisal and expression of conditioned responses (CRs). These processes are intermixed with, and supported by, learning processes, namely, acquisition, consolidation and storage of a fear memory (CS–US association), and retrieval of the fear memory on subsequent CS presentations. The acquisition component of fear conditioning can, to some extent, be circumvented by instructing subjects about CS–US contingencies at the beginning of an experiment. Such instructed fear experiments nevertheless also consistently activate the dorsal ACC and mPFC (Figure 1b) [14,19]. Similarly, recalling and generating fear in the absence of reinforcement several days after conditioning activate dorsal midline areas, and are not confounded by fear learning [20]. Rostral parts of the dorsal ACC/mPFC are specifically involved in the (conscious) appraisal, but not direct expression, of fear responses, as shown by reduction of rostral dACC and dmPFC activity to threat by high working memory load in the context of unchanged physiological reactivity [2,14], and correlations of rostral dACC and dmPFC activity with explicit threat evaluations but not physiological threat reactions [21].
Response expression, conversely, seems to involve more caudal dorsal areas in SMA, pre-SMA and pdACC, and caudal parts of dmPFC and adACC, although some of the evidence for this contention is indirect and based on studies of the arousal component inherent to most fear and anxiety responses. For example, Figure 1c shows clusters that correlate with sympathetic nervous system activity, irrespective of whether the context was fear-related or not. Positive correlations are found throughout the mPFC, but are again primarily clustered in mid-to-caudal dorsal mPFC areas. Lesion [22] and electrical stimulation studies [23] confirmed this anatomical distribution. Considering these data in conjunction with observations that dACC activity correlates with fear-conditioned skin conductance responses [24] and with increases in heart rate induced by a socially threatening situation [25], as well as findings that direct electrical stimulation of the dACC can elicit subjective states of fear [26], strongly suggests that the dorsal ACC and mPFC are involved in generating fear responses. Neuroimaging studies of autonomic nervous system activity also indirectly suggest that the same areas do not exclusively function in response expression, but might also support appraisal processes. For example, the dorsal ACC and mPFC are associated with interoceptive awareness of heart beats [27], and, importantly, recruitment of the dorsal ACC and mPFC during interoceptive perception is positively correlated with subjects’ trait anxiety levels [27]. Thus, the dorsal ACC and mPFC seem to function generally in the appraisal and expression of fear or anxiety. These studies leave uncertain the role that the dorsal ACC and mPFC might play in the acquisition of conditioned fear,
Box 2. Studies of fear conditioning and extinction in rodents
A rich literature has examined the role of the rodent medial frontal cortex in the acquisition and extinction of conditioned fear, as well as the expression of conditioned and unconditioned fear [74]. These studies facilitate a greater degree of causal inference than imaging studies. Much like the human dorsal ACC and mPFC, the rodent mPFC is strongly activated during fear conditioning [75,76]. Lesion or acute inactivation studies have revealed a role for the ventrally located infralimbic (IL) and dorsally located prelimbic (PL) subregions in conditioned fear expression when recall tests are performed within a few days after initial conditioning [77–81]. Interestingly, the mPFC does not seem to be required during fear acquisition itself, as evidenced by intact initial fear learning after disruption of IL or PL prior to conditioning [82–85]. As with expression of fear memories, activity in the rodent mPFC is also required for expression of unconditioned fear [86,87]. In terms of extinction, the recall and expression of an extinction memory more than 24 h after learning requires activity in IL [80,82,84,88] and to some degree PL [85,89]. By contrast, within-session extinction of CRs during repeated non-reinforced presentations of the CS does not require activity in IL or PL [80,82,84,88]. Thus, the role of the mPFC during extinction closely follows its role during fear conditioning: it is required for recall or expression, but not for initial acquisition. Electrical microstimulation of the rodent mPFC generally does not directly elicit fear behavior or produce overt anxiolysis, but rather exerts a modulatory function, gating behavioral output elicited by external fear-eliciting stimuli or by direct subcortical stimulation [90–93]. Curiously, given the role of the mPFC in fear expression, it has been found that these effects are generally, but not exclusively, fear-inhibitory and occur with stimulation in all mPFC subregions [90–93]. Of note, however, one recent study found a fear-enhancing effect of PL stimulation, but a fear-inhibiting effect of IL stimulation [92]. Together, these findings suggest that a model of mPFC function in fear or extinction must account for interactions of the mPFC with other elements of the fear circuit, because the mPFC itself functions primarily by modifying activity in other brain areas. With respect to one important interacting partner, the amygdala, it has been reported that stimulation in the IL or PL inhibits the activity of output neurons in the central amygdalar nucleus (CEA) [94], as well as the basolateral amygdalar complex (BLA) [95]. IL and PL stimulation can also directly activate BLA neurons [96]. Thus, the mPFC can promote fear expression through BLA activation and can inhibit amygdala output through CEA inhibition. CEA inhibition, however, is achieved through the action of excitatory glutamatergic mPFC projections onto inhibitory interneurons in the amygdala, probably through the intercalated cell masses [97,98]. Innervation of the intercalated cell masses originates predominantly from IL rather than PL [99,100], which supports a preferential role for IL in inhibitory regulation of the amygdala.

although converging evidence from studies in rodents (Box 2) suggests only a minor role in acquisition. To elucidate how fear is regulated, we next discuss activations associated with extinction of learned fear. In extinction, the CS is repeatedly presented in the absence of reinforcement, leading to the formation of a CS–no US association (or extinction memory) that competes with the original fear memory for control over behavior [28–30]. Hence, extinction induces conflicting appraisals of, and response tendencies to, the CS because it now signals both threat and safety, a situation that requires regulation, as outlined above. We further distinguish between within-session extinction (Figure 1d, day 1) and extinction recall, as tested by CS presentation on a subsequent day (Figure 1e, day 2). Within-session extinction is associated with activation in both the dorsal ACC and mPFC (dACC, dmPFC, SMA and pre-SMA) and the ventral ACC and mPFC (pgACC and vmPFC; Figure 1d). Given the close association of the dorsal ACC and mPFC with fear conditioning responses, it should be noted that the activations observed within these regions during fear extinction might in fact reflect remnants of fear conditioning, because in early extinction trials the CS continues to elicit a residual CR. Activation within the ventral ACC and mPFC is thus a candidate neural correlate of the fear inhibition that occurs during extinction (for convergent rodent data, see Box 2). Accordingly, acute reversal of a fear conditioning contingency, during which a neutral, non-reinforced CS is paired with an aversive stimulus, whereas the previously reinforced CS is not and now inhibits fear, is associated with activation in the pgACC [31]. Likewise, exposure to distant threat is associated with ventral ACC and mPFC activation, presumably acting in a regulatory capacity to facilitate planning of adaptive responses, whereas more imminent threat is associated with dorsal ACC and mPFC activation, which is consistent with greater expression of fear responses [32]. Along with ventral ACC and mPFC activation during extinction, decreases in amygdalar responses have also been reported [33,34], consistent with the idea that amygdalar inhibition is an important component of extinction. In support of this conclusion, recall of extinction more than 24 h after conditioning, a process that is less confounded by residual CRs, yields primarily ventral ACC and mPFC activations (pgACC, sgACC, vmPFC; Figure 1e). It should be stressed, however, that extinction, like conditioning, involves multiple component processes, including acquisition, consolidation, storage and retrieval of the extinction memory, and the related appraisal of the CS as safe, of which CR inhibition is only the endpoint. The limited number of human neuroimaging studies of extinction does not allow a reliable parcellation of these processes, although a rich rodent literature suggests that, as for fear conditioning, the role of the mPFC lies primarily in expression rather than acquisition of inhibitory fear memories (Box 2). Moreover, our conclusions are also supported by findings of negative correlations primarily between ventral areas (pgACC and vmPFC) and sympathetic activity (Figure 1c), and with activation in an area consistent with the periaqueductal gray matter (PAG), which mediates heart rate increases under social threat [25,35]. In summary, neuroimaging studies of the learning and extinction of fear in humans reveal evidence of an important differentiation between dorsal ACC and mPFC subregions, which are implicated in threat appraisal and the expression of fear, and ventral ACC and mPFC subregions, which are involved in the inhibition of conditioned fear through extinction.

Emotional conflict regulation
Convergent evidence of the functional differentiation between dorsal and ventral ACC and mPFC comes from work on emotional conflict. Two recent studies used a task that required subjects to categorize face stimuli according to their emotional expression (fearful vs happy) while
Figure 2. (a) Emotional conflict across a variety of experimental paradigms is associated with activation in the dorsal ACC and mPFC. (b) Decreasing negative emotion through reappraisal is associated with preferential activation of the dorsal ACC and mPFC. Targets of amygdalar connectivity during tasks involving appraisal or expression (c) or regulation (d) of negative emotion. Positive connectivity is observed primarily during appraisal or expression tasks, and most heavily in the dorsal ACC and mPFC. By contrast, negative connectivity is observed primarily in the ventral ACC and mPFC across both appraisal or expression and regulation tasks. These connectivity findings are therefore consistent with the dorsoventral functional–anatomical parcellation of the ACC and mPFC derived from activation analyses.
attempting to ignore emotionally congruent or incongruent word labels (Happy, Fear) superimposed over the faces. Emotional conflict, created by a word label incongruent with the facial expression, substantially slowed reaction times [8,36]. Moreover, when incongruent trials were preceded by an incongruent trial, reaction times were faster than if incongruent trials were preceded by a congruent trial [8,36], an effect that has previously been observed in traditional, non-emotional conflict tasks, such as the Stroop and flanker protocols [37]. According to the conflict-monitoring model [38], this data pattern stems from a conflict-driven regulatory mechanism, whereby conflict from an incongruent trial triggers an upregulation of top-down control, reflected in reduced conflict in the subsequent trial. This model can distinguish brain regions involved in conflict evaluation and those involved in conflict regulation [38,39]. In studies of emotional conflict, regions that activated more to post-congruent incongruent trials than post-incongruent incongruent trials, interpreted as being involved in conflict evaluation, included the amygdala, dACC, dmPFC and dorsolateral PFC [8,36]. The role of dorsal ACC and mPFC areas in detecting emotional conflict is further echoed by other studies of various forms of emotional conflict or interference, the findings of which we plot in Figure 2a. By contrast, regions more active in post-incongruent incongruent trials are interpreted as being involved in conflict regulation, and prominently include the pgACC [8,36]. Regulation-related activation in the pgACC was accompanied by a simultaneous and correlated reduction in conflict-related amygdalar activity and does not seem to involve biasing of early sensory processing streams [39],
but rather the regulation of affective processing itself [36]. These data echo the dorsal–ventral dissociation discussed above with respect to fear expression and extinction in the ACC and mPFC. The circuitry we find to be specific for regulation of emotional conflict (ventral ACC and mPFC and amygdala) is very similar to that involved in extinction. Although these two processes are unlikely to be isomorphic, and each can be understood without reference to the other, we consider the striking similarity between extinction and emotional conflict regulation to be potentially important. Much like the relationship between improved emotional conflict regulation and decreased conflict evaluation-related activation in the dorsal ACC and mPFC, more successful extinction is associated with decreased CS-driven activation in the dorsal ACC and mPFC of humans and rodents [40,41]. Thus, the most parsimonious explanation for these data is that emotional conflict evaluation-related functions involve overlapping neural mechanisms with appraisal and expression of fear, and that regulation of emotional conflict also involves circuitry that overlaps with fear extinction. These conceptual and functional–anatomical similarities between evaluation and regulation of emotional conflict and fear also support the generalizability of our account of ACC and mPFC functional subdivisions beyond simply fear-related processing, but more generally to negative emotional processing. Of note, although the intensities of the negative emotions elicited during fear conditioning and evoked by emotional conflict differ significantly, they nonetheless engage a similar neural circuitry, probably because both fear and emotional conflict reflect biologically salient events.
Top-down control of emotion
During emotional conflict regulation, emotional processing is spontaneously modulated in the absence of an explicit instruction to regulate emotion. Emotional processing can also be modulated through deliberate and conscious application of top-down executive control over processing of an emotional stimulus. The best-studied strategy for the latter type of regulation is reappraisal, a cognitive technique whereby appraisal of a stimulus is modified to change its ability to elicit an emotional reaction [42]. Reappraisal involves both the initial emotional appraisal process and the reappraisal process proper, whereby an additional positive appraisal is created that competes with the initial negative emotional appraisal. Thus, we would expect reappraisal to involve the dorsal ACC and mPFC regions that we observed to be important for emotional conflict detection (Figure 2a). Consistent with this prediction, a meta-analysis found that reappraisal was reliably associated with activation in the dorsal ACC and mPFC (Figure 2b) [43]. This reappraisal meta-analysis, interestingly, did not implicate a consistent role for the ventral ACC and mPFC [43], which suggests that reappraisal does not primarily work by suppressing the processing of an undesired emotional stimulus. Nevertheless, activity in the ventral ACC and mPFC in some instances is negatively correlated with activity in the amygdala in paradigms in which reappraisal resulted in downregulation of amygdalar activity in response to negative pictures [44,45]. Thus, the ventral ACC and mPFC might be mediators between activation in dorsal medial and lateral prefrontal areas, involved in reappraisal [43], and the amygdala, with which lateral prefrontal structures in particular have little or no direct connectivity [46].
Consistent with this idea, the ventral ACC and mPFC are also engaged when subjects perform affect labeling of emotional faces [47] or when they self-distract from a fear-conditioned stimulus [48], two other emotion regulation strategies that result in downregulation of amygdalar activity. These data suggest that controlled top-down regulation, like emotional conflict regulation, uses ventral ACC and mPFC areas to inhibit negative emotional processing in the amygdala, thus dampening task interference. The ventral ACC and mPFC might thus perform a generic negative emotion inhibitory function that can be recruited by other regions (e.g. dorsal ACC and mPFC and lateral PFC) when there is a need to suppress limbic reactivity [10]. This would be a prime example of parsimonious use of a basic emotional circuitry, conserved between rodents and humans (Box 2), for the purpose of higher-level cognitive functions possible only in humans.

Amygdala–ACC and –mPFC functional connectivity
Our analysis of the neuroimaging data has emphasized task-based activation studies. Complementary evidence can be found in analyses of functional connectivity, because ACC and mPFC subregions can be distinguished through their differential anatomical connectivity (Box 1). In some ways, psychological context-specific temporal covariation (i.e. task-dependent connectivity) between regions might provide an even stronger test of the nature of inter-regional
relationships than consistency with regions that simply coactivate in a task. Figure 2c,d shows the ACC and mPFC connectivity peaks for all such connectivity studies, irrespective of the specific paradigm or instructions used (primarily general negative stimuli), as long as the task facilitated discrimination between appraisal or expression (Figure 2c) and regulation (Figure 2d). The spatial distribution of peaks during appraisal/expression tasks shows a relative preponderance of positive connectivity peaks in the dorsal ACC and mPFC and of negative connectivity peaks in the ventral ACC and mPFC. In addition, during regulation tasks, connectivity was restricted to the ventral ACC and mPFC and was primarily negative (Figure 2d). These data thus lend further support to our proposal of a dorso–ventral separation in terms of negative emotion generation (appraisal and expression) and inhibition (regulation).

Integration with other perspectives on ACC and mPFC function and other emotions
Although less developed than the literature on fear and anxiety, studies on other emotions are broadly consistent with our formulation of ACC and mPFC function. On the negative emotion appraisal and expression side, direct experience of pain, or empathy for others experiencing pain, activates the dorsal ACC and mPFC [49], and lesions of the dACC also serve as a treatment for chronic pain [50]. Similarly, increased sensitivity to a range of negative emotions is associated with greater engagement of the dorsal ACC and mPFC, including disgust [51] and rejection [52], and transcranial-magnetic-stimulation-induced disruption of the dmPFC interferes with anger processing [53]. Uncertainty or ambiguity, which can induce anxiety and relates to emotional conflict, leads to activation in the dACC and dmPFC [54].
On the regulation side, endogenously driven analgesia by means of the placebo effect has been closely tied to the pgACC, which is thought to engage in top-down modulation of regions that generate opioid-mediated anti-nociceptive responses, such as the amygdala and PAG [55,56]. It remains unclear how sadness is evaluated and regulated, and what role the sgACC plays in these processes, even though it is a common activation site in response to sad stimuli [57]. Positive emotion, which can serve to regulate and diminish negative emotion, has been associated in a meta-analysis with activation in the sgACC, vmPFC and pgACC [58]. Extinction of appetitive learning activates the vmPFC [59], much as extinction of learned fear does. The evaluation of positive stimuli and reward is more complicated. For instance, Rushworth and co-workers proposed that the processes carried out by the adACC are mirrored by similar contributions to reinforcement-guided decision-making from the orbitofrontal cortex, with the distinction that the adACC is concerned with computing the reinforcement value of actions whereas the orbitofrontal cortex is concerned with gauging the reinforcement values of stimuli [60]. Taken together, these data broadly support our dorsal–ventral distinction along appraisal–expression versus regulation lines, with respect specifically to negative emotion. Conversely, it is not obvious how to accommodate our
Figure 3. Graphical depiction of the ACC and mPFC functional model aligned across an appraisal or expression versus the regulation dimension for negative emotion. The imperfect separation of these functions across the dorsal and ventral ACC and mPFC noted in the reviewed studies is represented schematically as an intermixing of red (appraisal or expression) and blue (regulation) circles.
Box 3. Future directions
Further work is needed, in particular in exploring the neurophysiological basis for appraisal and expression versus regulation-related signaling in the ACC and mPFC of experimental animals. Specifically, how does this coding take place at the single-cell level and how do these effects result in the dorsal–ventral division in ACC and mPFC functions observed in human imaging studies? We have left out discussion of other regions, such as the insula and brainstem, that are probably important partners of the ACC and mPFC, although far less is known about these interactions. Additional work is required to bring to these interactions the depth of understanding currently available for interactions with the amygdala. Moreover, a better systems-level understanding of how ACC and mPFC activity is shaped by its input regions, such as the amygdala, hippocampus and thalamus, is necessary. Although we hint at levels of similarity between our model of ACC and mPFC in negative emotion and other models of the roles of this region in other functions, additional work is required to directly contrast and harmonize other conceptualizations of ACC and mPFC functions to create a more comprehensive framework capable of making predictions about a wide range of task contexts. With a few notable exceptions [40,61], the sophisticated cognitive neuroscience models described above have not been extended to populations with anxiety-related disorders. A great deal of work will be needed to translate our increasingly nuanced descriptions of ACC and mPFC functions into a better understanding of psychopathology.
analysis with the suggestion that the vmPFC specifically assesses stimulus values [10], but not action values, with the opposite being the case for the dACC [60]. Thus, our account should be seen as an early attempt to integrate these and other models of ACC and mPFC function and can serve to stimulate further research in this area. It is also worth examining why the conceptualization proposed in this review differs significantly from the earlier view of a cognitive–affective division [12]. Although the meta-analysis reported in the earlier paper did not indicate which specific studies were included, it seems that much of the support for this scheme comes from studies of patients
with affective disorders, in whom ventral ACC and mPFC dysfunction can be more readily observed in the context of deficits in regulation [40,61]. Moreover, the dorsal–ventral dissociation between dACC activation in a counting Stroop and pgACC activation in an emotional counting Stroop [12] has not held up to subsequent evidence (Figure 2a) or direct contrasts between emotional and non-emotional conflict processing [36], nor does the emotional counting Stroop involve a true Stroop conflict effect in the way that the counting Stroop does [62].

Concluding remarks
This review has highlighted several important themes. First, the empirical data do not support the long-held popular view that dorsal ACC and mPFC regions are involved in cognitive but not emotional functions, whereas ventral regions do the reverse [12]. Rather, the key functional distinction between these regions relates to evaluative function on the one hand and regulatory function on the other, for the dorsal and ventral ACC and mPFC, respectively (Figure 3). This new framework can also be broadly generalized to other negative and positive emotions, and points to multiple exciting lines of future research (Box 3).

Disclosure statement
Amit Etkin receives consulting fees from NeoStim. The other authors report no financial conflicts.

Acknowledgements
We would like to thank Gregory Quirk, Kevin LaBar, James Gross and Carsten Wotjak for their helpful comments and criticisms of this manuscript. This work was supported by NIH grants P30MH089888 and R01MH091860, and the Sierra-Pacific Mental Illness Research Education and Clinical Center at the Palo Alto VA Health Care System.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.tics.2010.11.004.

References
1 Papez, J.W. (1937) A proposed mechanism of emotion. Arch. Neurol. Psychiatry 38, 725–743
2 Kalisch, R. et al. (2006) Levels of appraisal: a medial prefrontal role in high-level appraisal of emotional material. Neuroimage 30, 1458–1466
3 Ochsner, K.N. and Gross, J.J. (2005) The cognitive control of emotion. Trends Cogn. Sci. 9, 242–249
4 Rushworth, M.F. et al. (2007) Functional organization of the medial frontal cortex. Curr. Opin. Neurobiol. 17, 220–227
5 Bechara, A. et al. (2000) Emotion, decision making and the orbitofrontal cortex. Cereb. Cortex 10, 295–307
6 Craig, A.D. (2009) How do you feel – now? The anterior insula and human awareness. Nat. Rev. Neurosci. 10, 59–70
7 Critchley, H.D. (2005) Neural mechanisms of autonomic, affective, and cognitive integration. J. Comp. Neurol. 493, 154–166
8 Etkin, A. et al. (2006) Resolving emotional conflict: a role for the rostral anterior cingulate cortex in modulating activity in the amygdala. Neuron 51, 871–882
9 Quirk, G.J. and Beer, J.S. (2006) Prefrontal involvement in the regulation of emotion: convergence of rat and human studies. Curr. Opin. Neurobiol. 16, 723–727
10 Schiller, D. and Delgado, M.R. (2010) Overlapping neural systems mediating extinction, reversal and regulation of fear. Trends Cogn. Sci. 14, 268–276
11 Vogt, B.A. et al. (1992) Functional heterogeneity in cingulate cortex: the anterior executive and posterior evaluative regions. Cereb. Cortex 2, 435–443
12 Bush, G. et al. (2000) Cognitive and emotional influences in anterior cingulate cortex. Trends Cogn. Sci. 4, 215–222
13 Etkin, A. and Wager, T.D. (2007) Functional neuroimaging of anxiety: a meta-analysis of emotional processing in PTSD, social anxiety disorder, and specific phobia. Am. J. Psychiatry 164, 1476–1488
14 Mechias, M.L. et al. (2010) A meta-analysis of instructed fear studies: implications for conscious appraisal of threat. Neuroimage 49, 1760–1768
15 Levenson, R.W. (2003) Blood, sweat, and fears: the autonomic architecture of emotion. Ann. N. Y. Acad. Sci. 1000, 348–366
16 Pessoa, L. (2008) On the relationship between emotion and cognition. Nat. Rev. Neurosci. 9, 148–158
17 Roseman, I.J. and Smith, C.A. (2001) Appraisal theory: overview, assumptions, varieties, controversies. In Appraisal Processes in Emotion: Theory, Methods, Research (Scherer, K.R. et al., eds), pp. 3–19, Oxford University Press
18 LaBar, K.S. and Cabeza, R. (2006) Cognitive neuroscience of emotional memory. Nat. Rev. Neurosci. 7, 54–64
19 Klucken, T. et al. (2009) Neural, electrodermal and behavioral response patterns in contingency aware and unaware subjects during a picture–picture conditioning paradigm. Neuroscience 158, 721–731
20 Kalisch, R. et al. (2009) The NMDA agonist D-cycloserine facilitates fear memory consolidation in humans. Cereb. Cortex 19, 187–196
21 Raczka, K.A. et al. (2010) A neuropeptide S receptor variant associated with overinterpretation of fear reactions: a potential neurogenetic basis for catastrophizing. Mol. Psychiatry 15, 1067–1074
22 Critchley, H.D. et al. (2003) Human cingulate cortex and autonomic control: converging neuroimaging and clinical evidence. Brain 126, 2139–2152
23 Gentil, A.F. et al. (2009) Physiological responses to brain stimulation during limbic surgery: further evidence of anterior cingulate modulation of autonomic arousal. Biol. Psychiatry 66, 695–701
24 Milad, M.R. et al. (2007) A role for the human dorsal anterior cingulate cortex in fear expression. Biol. Psychiatry 62, 1191–1194
25 Wager, T.D. et al. (2009) Brain mediators of cardiovascular responses to social threat: part I: Reciprocal dorsal and ventral sub-regions of the medial prefrontal cortex and heart-rate reactivity. Neuroimage 47, 821–835
26 Meyer, G. et al. (1973) Stereotactic cingulotomy with results of acute stimulation and serial psychological testing. In Surgical Approaches in Psychiatry (Laitinen, L.V. and Livingston, K.E., eds), pp. 39–58, University Park Press
27 Critchley, H.D. et al. (2004) Neural systems supporting interoceptive awareness. Nat. Neurosci. 7, 189–195
28 Bouton, M.E. (2004) Context and behavioral processes in extinction. Learn. Mem. 11, 485–494
29 Delamater, A.R. (2004) Experimental extinction in Pavlovian conditioning: behavioural and neuroscience perspectives. Q. J. Exp. Psychol. B 57, 97–132
30 Myers, K.M. and Davis, M. (2002) Behavioral and neural analysis of extinction. Neuron 36, 567–584
31 Schiller, D. et al. (2008) From fear to safety and back: reversal of fear in the human brain. J. Neurosci. 28, 11517–11525
32 Mobbs, D. et al. (2009) From threat to fear: the neural organization of defensive fear systems in humans. J. Neurosci. 29, 12236–12243
33 Milad, M.R. et al. (2008) Presence and acquired origin of reduced recall for fear extinction in PTSD: results of a twin study. J. Psychiatr. Res. 42, 515–520
34 Phelps, E.A. et al. (2004) Extinction learning in humans: role of the amygdala and vmPFC. Neuron 43, 897–905
35 Wager, T.D. et al. (2009) Brain mediators of cardiovascular responses to social threat, part II: Prefrontal-subcortical pathways and relationship with anxiety. Neuroimage 47, 836–851
36 Egner, T. et al. (2008) Dissociable neural systems resolve conflict from emotional versus nonemotional distracters. Cereb. Cortex 18, 1475–1484
37 Egner, T. (2007) Congruency sequence effects and cognitive control. Cogn. Affect. Behav. Neurosci. 7, 380–390
Trends in Cognitive Sciences February 2011, Vol. 15, No. 2
38 Botvinick, M.M. et al. (2001) Conflict monitoring and cognitive control. Psychol. Rev. 108, 624–652
39 Egner, T. and Hirsch, J. (2005) Cognitive control mechanisms resolve conflict through cortical amplification of task-relevant information. Nat. Neurosci. 8, 1784–1790
40 Milad, M.R. et al. (2009) Neurobiological basis of failure to recall extinction memory in posttraumatic stress disorder. Biol. Psychiatry 66, 1075–1082
41 Burgos-Robles, A. et al. (2007) Consolidation of fear extinction requires NMDA receptor-dependent bursting in the ventromedial prefrontal cortex. Neuron 53, 871–880
42 Gross, J.J. (2002) Emotion regulation: affective, cognitive, and social consequences. Psychophysiology 39, 281–291
43 Kalisch, R. (2009) The functional neuroanatomy of reappraisal: time matters. Neurosci. Biobehav. Rev. 33, 1215–1226
44 Johnstone, T. et al. (2007) Failure to regulate: counterproductive recruitment of top-down prefrontal-subcortical circuitry in major depression. J. Neurosci. 27, 8877–8884
45 Urry, H.L. et al. (2006) Amygdala and ventromedial prefrontal cortex are inversely coupled during regulation of negative affect and predict the diurnal pattern of cortisol secretion among older adults. J. Neurosci. 26, 4415–4425
46 Amaral, D.G. et al. (1992) Anatomical organization of the primate amygdaloid complex. In The Amygdala: Neurobiological Aspects of Emotion, Memory and Mental Dysfunction (Aggleton, J.P., ed.), pp. 1–66, Wiley-Liss
47 Lieberman, M.D. et al. (2007) Putting feelings into words: affect labeling disrupts amygdala activity in response to affective stimuli. Psychol. Sci. 18, 421–428
48 Delgado, M.R. et al. (2008) Neural circuitry underlying the regulation of conditioned fear and its relation to extinction. Neuron 59, 829–838
49 Lamm, C. et al. (2010) Meta-analytic evidence for common and distinct neural networks associated with directly experienced pain and empathy for pain. Neuroimage DOI: 10.1016/j.neuroimage.2010.10.014
50 Wilkinson, H.A. et al. (1999) Bilateral anterior cingulotomy for chronic noncancer pain. Neurosurgery 45, 1129–1134
51 Mataix-Cols, D. et al. (2008) Individual differences in disgust sensitivity modulate neural responses to aversive/disgusting stimuli. Eur. J. Neurosci. 27, 3050–3058
52 Eisenberger, N.I. et al. (2003) Does rejection hurt? An fMRI study of social exclusion. Science 302, 290–292
53 Harmer, C.J. et al. (2001) Transcranial magnetic stimulation of medial-frontal cortex impairs the processing of angry facial expressions. Nat. Neurosci. 4, 17–18
54 Nomura, M. et al. (2003) Frontal lobe networks for effective processing of ambiguously expressed emotions in humans. Neurosci. Lett. 348, 113–116
55 Eippert, F. et al. (2009) Activation of the opioidergic descending pain control system underlies placebo analgesia. Neuron 63, 533–543
56 Petrovic, P. et al. (2002) Placebo and opioid analgesia – imaging a shared neuronal network. Science 295, 1737–1740
57 Phan, K.L. et al. (2002) Functional neuroanatomy of emotion: a meta-analysis of emotion activation studies in PET and fMRI. Neuroimage 16, 331–348
58 Wager, T.D. et al. (2008) The neuroimaging of emotion. In Handbook of Emotions (3rd edn) (Lewis, M., ed.), pp. 249–271, The Guilford Press
59 Finger, E.C. et al. (2008) Dissociable roles of medial orbitofrontal cortex in human operant extinction learning. Neuroimage 43, 748–755
60 Rushworth, M.F. et al. (2007) Contrasting roles for cingulate and orbitofrontal cortex in decisions and social behaviour. Trends Cogn. Sci. 11, 168–176
61 Etkin, A. et al. (2010) Failure of anterior cingulate activation and connectivity with the amygdala during implicit regulation of emotional processing in generalized anxiety disorder. Am. J. Psychiatry 167, 545–554
62 Algom, D. et al. (2004) A rational look at the emotional Stroop phenomenon: a generic slowdown, not a Stroop effect. J. Exp. Psychol. Gen. 133, 323–338
63 Vogt, B.A. (2004) Cingulate gyrus. In The Human Nervous System (2nd edn) (Paxinos, G. and Mai, J.K., eds), pp. 915–949, Elsevier
64 Vogt, B.A. et al. (2003) Structural and functional dichotomy of human midcingulate cortex. Eur. J. Neurosci. 18, 3134–3144
65 Picard, N. and Strick, P.L. (2001) Imaging the premotor areas. Curr. Opin. Neurobiol. 11, 663–672
66 Ghashghaei, H.T. et al. (2007) Sequence of information processing for emotions based on the anatomic dialogue between prefrontal cortex and amygdala. Neuroimage 34, 905–923
67 Beckmann, M. et al. (2009) Connectivity-based parcellation of human cingulate cortex and its relation to functional specialization. J. Neurosci. 29, 1175–1190
68 Chiba, T. et al. (2001) Efferent projections of infralimbic and prelimbic areas of the medial prefrontal cortex in the Japanese monkey, Macaca fuscata. Brain Res. 888, 83–101
69 Rempel-Clower, N.L. and Barbas, H. (1998) Topographic organization of connections between the hypothalamus and prefrontal cortex in the rhesus monkey. J. Comp. Neurol. 398, 393–419
70 An, X. et al. (1998) Prefrontal cortical projections to longitudinal columns in the midbrain periaqueductal gray in macaque monkeys. J. Comp. Neurol. 401, 455–479
71 Bates, J.F. and Goldman-Rakic, P.S. (1993) Prefrontal connections of medial motor areas in the rhesus monkey. J. Comp. Neurol. 336, 211–228
72 Mansouri, F.A. et al. (2009) Conflict-induced behavioural adjustment: a clue to the executive functions of the prefrontal cortex. Nat. Rev. Neurosci. 10, 141–152
73 Nachev, P. et al. (2008) Functional role of the supplementary and presupplementary motor areas. Nat. Rev. Neurosci. 9, 856–869
74 Sotres-Bayon, F. and Quirk, G.J. (2010) Prefrontal control of fear: more than just extinction. Curr. Opin. Neurobiol. 20, 231–235
75 Burgos-Robles, A. et al. (2009) Sustained conditioned responses in prelimbic prefrontal neurons are correlated with fear expression and extinction failure. J. Neurosci. 29, 8474–8482
76 Herry, C. et al. (1999) Plasticity in the mediodorsal thalamo–prefrontal cortical transmission in behaving mice. J. Neurophysiol. 82, 2827–2832
77 Resstel, L.B. et al. (2008) The expression of contextual fear conditioning involves activation of an NMDA receptor–nitric oxide pathway in the medial prefrontal cortex. Cereb. Cortex 18, 2027–2035
78 Blum, S. et al. (2006) A role for the prefrontal cortex in recall of recent and remote memories. Neuroreport 17, 341–344
79 Corcoran, K.A. and Quirk, G.J. (2007) Activity in prelimbic cortex is necessary for the expression of learned, but not innate, fears. J. Neurosci. 27, 840–844
80 Laurent, V. and Westbrook, R.F. (2009) Inactivation of the infralimbic but not the prelimbic cortex impairs consolidation and retrieval of fear extinction. Learn. Mem. 16, 520–529
81 Laviolette, S.R. et al. (2005) A subpopulation of neurons in the medial prefrontal cortex encodes emotional learning with burst and frequency codes through a dopamine D4 receptor-dependent basolateral amygdala input. J. Neurosci. 25, 6066–6075
82 Quirk, G.J. et al. (2000) The role of ventromedial prefrontal cortex in the recovery of extinguished fear. J. Neurosci. 20, 6225–6231
83 Runyan, J.D. et al. (2004) A role for prefrontal cortex in memory storage for trace fear conditioning. J. Neurosci. 24, 1288–1295
84 Sierra-Mercado, D., Jr et al. (2006) Inactivation of the ventromedial prefrontal cortex reduces expression of conditioned fear and impairs subsequent recall of extinction. Eur. J. Neurosci. 24, 1751–1758
85 Morgan, M.A. and LeDoux, J.E. (1995) Differential contribution of dorsal and ventral medial prefrontal cortex to the acquisition and extinction of conditioned fear in rats. Behav. Neurosci. 109, 681–688
86 Maaswinkel, H. et al. (1996) Effects of an electrolytic lesion of the prelimbic area on anxiety-related and cognitive tasks in the rat. Behav. Brain Res. 79, 51–59
87 Shah, A.A. and Treit, D. (2003) Excitotoxic lesions of the medial prefrontal cortex attenuate fear responses in the elevated-plus maze, social interaction and shock probe burying tests. Brain Res. 969, 183–194
88 Lebron, K. et al. (2004) Delayed recall of fear extinction in rats with lesions of ventral medial prefrontal cortex. Learn. Mem. 11, 544–548
89 Hugues, S. et al. (2004) Postextinction infusion of a mitogen-activated protein kinase inhibitor into the medial prefrontal cortex impairs memory of the extinction of conditioned fear. Learn. Mem. 11, 540–543
90 al Maskati, H.A. and Zbrozyna, A.W. (1989) Stimulation in prefrontal cortex area inhibits cardiovascular and motor components of the defence reaction in rats. J. Autonom. Nerv. Syst. 28, 117
91 Milad, M.R. and Quirk, G.J. (2002) Neurons in medial prefrontal cortex signal memory for fear extinction. Nature 420, 70–74
92 Vidal-Gonzalez, I. et al. (2006) Microstimulation reveals opposing influences of prelimbic and infralimbic cortex on the expression of conditioned fear. Learn. Mem. 13, 728–733
93 Zbrozyna, A.W. and Westwood, D.M. (1991) Stimulation in prefrontal cortex inhibits conditioned increase in blood pressure and avoidance bar pressing in rats. Physiol. Behav. 49, 705–708
94 Quirk, G.J. et al. (2003) Stimulation of medial prefrontal cortex decreases the responsiveness of central amygdala output neurons. J. Neurosci. 23, 8800–8807
95 Rosenkranz, J.A. et al. (2003) The prefrontal cortex regulates lateral amygdala neuronal plasticity and responses to previously conditioned stimuli. J. Neurosci. 23, 11054–11064
96 Likhtik, E. et al. (2005) Prefrontal control of the amygdala. J. Neurosci. 25, 7429–7437
97 Amano, T. et al. (2010) Synaptic correlates of fear extinction in the amygdala. Nat. Neurosci. 13, 489–494
98 Ehrlich, I. et al. (2009) Amygdala inhibitory circuits and the control of fear memory. Neuron 62, 757–771
99 McDonald, A.J. et al. (1996) Projections of the medial and lateral prefrontal cortices to the amygdala: a Phaseolus vulgaris leucoagglutinin study in the rat. Neuroscience 71, 55–75
100 Vertes, R.P. (2004) Differential projections of the infralimbic and prelimbic cortex in the rat. Synapse 51, 32–58