January 2011 Volume 15, Number 1 pp. 1–46 Editor Stavroula Kousta Executive Editor, Neuroscience Katja Brose
Editorial
1
Journal Manager Rolf van der Sanden Journal Administrator Myarca Bonsink Advisory Editorial Board R. Adolphs, Caltech, CA, USA R. Baillargeon, U. Illinois, IL, USA N. Chater, University College, London, UK P. Dayan, University College London, UK S. Dehaene, INSERM, France D. Dennett, Tufts U., MA, USA J. Driver, University College, London, UK Y. Dudai, Weizmann Institute, Israel A.K. Engel, Hamburg University, Germany M. Farah, U. Pennsylvania, PA, USA S. Fiske, Princeton U., NJ, USA A.D. Friederici, MPI, Leipzig, Germany O. Hikosaka, NIH, MD, USA R. Jackendoff, Tufts U., MA, USA P. Johnson-Laird, Princeton U., NJ, USA N. Kanwisher, MIT, MA, USA C. Koch, Caltech, CA, USA M. Kutas, UCSD, CA, USA N.K. Logothetis, MPI, Tübingen, Germany J.L. McClelland, Stanford U., CA, USA E.K. Miller, MIT, MA, USA E. Phelps, New York U., NY, USA R. Poldrack, U. Texas Austin, TX, USA M.E. Raichle, Washington U., MO, USA T.W. Robbins, U. Cambridge, UK A. Wagner, Stanford U., CA, USA V. Walsh, University College, London, UK
TiCS evolution
Stavroula Kousta
Update Letters
2
What causes dyslexia?: comment on Goswami
Mark S. Seidenberg
Opinion
3
A temporal sampling framework for developmental dyslexia
Usha Goswami
Review
11
Mind the gap: bridging economic and naturalistic risk-taking with cognitive neuroscience
Tom Schonberg, Craig R. Fox and Russell A. Poldrack
20
The critical role of retrieval practice in long-term retention
Henry L. Roediger III and Andrew C. Butler
28
Cognitive enhancement by drugs in health and disease
Masud Husain and Mitul A. Mehta
37
Reward, dopamine and the control of food intake: implications for obesity
Nora D. Volkow, Gene-Jack Wang and Ruben D. Baler
Editorial Enquiries Trends in Cognitive Sciences
Cell Press 600 Technology Square Cambridge, MA 02139, USA Tel: +1 617 397 2817 Fax: +1 617 397 2810 E-mail:
[email protected]
Forthcoming articles Emotional processing in anterior cingulate and medial prefrontal cortex Amit Etkin, Tobias Egner and Raffael Kalisch
Visual search in scenes involves selective and non-selective pathways Jeremy Wolfe, Melissa L Vo, Karla K Evans and Michelle R. Greene
Cognitive Culture: Theoretical and empirical insights into social learning strategies Luke Rendell, Laurel Fogarty, William JE Hoppitt, Thomas JH Morgan, Mike M Webster and Kevin N Laland
Cover: Obesity has reached epidemic proportions in several countries, having profound societal and health care implications. On pages 37–46, Nora D. Volkow, Gene-Jack Wang and Ruben D. Baler overview the neural bases of obesity and the failure to resist the urge to eat, and conclude that, much as in addiction, obesity is associated with enhanced sensitivity of reward-related circuits to conditioned stimuli linked to energy-dense food, coupled with impaired function of the executive control network that regulates the urge to eat. Understanding the neural (as well as genetic and environmental) bases of obesity undoubtedly holds the key to curbing the current obesity epidemic. Cover image: Tooga/Digital Vision/Getty Images.
Editorial
TiCS evolution
Stavroula Kousta
Editor, Trends in Cognitive Sciences
This month there is an additional reason why you should take a look at the masthead of the print issue of the journal. Or, if you are accessing the issue online through the TiCS website, I encourage you to take a look at the journal’s home page. With the new year, the Advisory Editorial Board of the journal has been extensively revised and expanded to reflect more fully the continuously evolving landscape of the cognitive sciences. I would like to take this opportunity to thank the retiring members for their invaluable contributions to the journal over the years. Their input to, feedback on and ambassadorship for TiCS in the scientific community helped to shape the content and scope of the journal, as well as the way it is perceived in the community. At the same time, I extend a very warm welcome to the members who are now joining the Board and thank them for their enthusiastic undertaking of this new role and their commitment to help TiCS to keep abreast of novel exciting developments in the field. In keeping with the interdisciplinary mission of the journal, the expertise of these 13 new members spans diverse fields (from cognitive neuroscience and neurology to philosophy and linguistics), and thus will strengthen and extend the scope of the journal in areas that have seen rapid growth in recent years and have contributed significantly to the developing landscape of the cognitive sciences. When TiCS was first launched in 1997, its mission was to provide a platform for interaction of the disciplines making up the cognitive sciences and to feature engaging and timely overviews of the most exciting research and insights for scientists, students and teachers who want to keep up with the latest developments in the field. This remains the mission of the journal today: to be inclusive of any discipline that contributes to and furthers our understanding of the way in which the mind/brain achieves the
amazing cognitive feats that it does; to encourage interdisciplinary contributions and perspectives; to publish authoritative, insightful and thought-provoking review and opinion pieces that are accessible to a broad audience; and to provide a forum for discussion, debate and commentary. At the same time, and as a direct reflection of the growing effort in the scientific community to raise awareness, explore and discuss the links between scientific research and the world outside the laboratory, we launched a new article type last year devoted to issues at the interface between Science and Society. We also introduced longer, more comprehensive Feature Reviews that address broad topic areas, covering them in more depth, to complement our more focused standard mini-reviews. The online presence of TiCS (www.cell.com/trends/cognitive-sciences) is also stronger than ever before: the journal pages feature not only the current issue, archive and online-first articles, but also thematically organized collections of articles, free access to a review or opinion piece from the current issue on a monthly basis, articles of interest in other Cell Press journals and much, much more. As part of the Neuroscience portfolio in the Cell Press family of journals, the TiCS pages are hosted alongside the pages of its sister journals, Neuron and Trends in Neurosciences. And all other Cell Press titles that carry content relevant to our readership, such as Current Biology and Trends in Ecology and Evolution, are also just a click away. With its Impact Factor reaching 11.667 in 2009 (Journal Citation Reports, Thomson Reuters) and a growing readership, TiCS is the premier monthly review journal for cognitive science. The journal is committed to catering in the best possible way for the needs of the community it serves, so this editorial is also a call to you, the readers, to contact us at
[email protected] with your thoughts, comments and suggestions on how we can make TiCS better.
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.11.003 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Update Letters
What causes dyslexia?: comment on Goswami
Mark S. Seidenberg
Department of Psychology, University of Wisconsin-Madison, 1202 West Johnson Street, Madison, WI 53706, USA
Finding an underlying deficit that links the disparate impairments associated with dyslexia would be a major breakthrough. In a recent article in TiCS, Goswami [1] offers a viable candidate for such a deficit – and has done a remarkable job of finding links that are plausible if still mostly circumstantial. Her stimulating article raises numerous directions for future research.
The epidemiology of dyslexia is poorly documented. Although many deficits have been reported [2], how often they occur and co-occur at different ages in different languages is still unknown. Goswami’s descriptions of what is found in dyslexia should be read as what was found in some studies. More facts are needed; for example, with respect to the greater sensitivity of dyslexics to allophonic variation (which is crucial to Goswami’s account of phoneme-level difficulties), many have sought this effect, and some have found it. The null results, however, get filed away.
Studies testing the Temporal Sampling Framework (TSF) need to be conducted more widely and with a variety of methods, together with tests of other putative deficits. The challenge is not only to establish closer mechanistic, causal connections between the hypothesized deficit and diverse behavioral impairments, but also to explain the distribution of impairments. The theory also needs to accommodate strong evidence for mainly left-hemisphere subcortical anomalies in dyslexia [3,4].
Short of a large-scale international epidemiological study, researchers (both of brain and behavior) need to test the same subjects using each other’s measures, and post all results, both positive and negative. An online archive in which researchers could deposit their stimulus materials would make this possible. Perhaps a major funding agency could see the value in this undertaking.
Because reading is a complex task drawing on numerous capacities, it is unsurprising that multiple genetic polymorphisms are apparently involved. 
The TSF would be stronger if there were a reason why various genetic anomalies converge on low-frequency oscillation in the right hemisphere. I would be inclined to search for anomalies in brain development that have, as one highly salient consequence, the deficit that Goswami has identified. I present the following speculative sketch to illustrate the type of multilevel theory to which we might aspire.
(i) Prominent candidate genes for dyslexia are implicated in cell migration [5].
(ii) Disorders of brain development often involve disturbances of interneuron migration and integration [6].
(iii) Anomalies in the migration of GABAergic (inhibitory) interneurons may underlie a variety of developmental disorders. Such anomalies can be regional rather than global [7].
(iv) GABAergic interneuron pathology impairs lateral inhibition, affecting discrimination of competing types of sensory information [8].
(v) Auditory processing at multiple time and frequency scales in parallel requires resolution of such competing information, and similarly for vision.
(vi) For some unknown reason, the processing of lower temporal frequency auditory information is particularly vulnerable,
(vii) from which deficits on tasks that rely on this information follow.
Therefore, rather than the auditory-processing deficit causing associated impairments in vision, motor performance, attention, learning, memory, and so on, all these impairments arise from a common source: an attested type of neurodevelopmental anomaly, caused by multiple genes, creating the observed variability in the phenotypic outcome. Processing of low temporal frequency auditory signals might be especially affected, with multiple downstream consequences.
Goswami’s article is an interesting addition to the literature and her theory will stimulate much valuable research. However, much remains to be learned.
References
1 Goswami, U. (2010) A temporal sampling framework for developmental dyslexia. Trends Cogn. Sci. 15, 3–10
2 Pennington, B.F. and Bishop, D.V.M. (2009) Relations among speech, language, and reading disorders. Annu. Rev. Psychol. 60, 283–306
3 Deutsch, G.K. et al. (2005) Children’s reading performance is correlated with white matter structure as measured by diffusion tensor imaging. Cortex 41, 354–363
4 Preston, J.L. et al. (2010) Early and late talkers: school-age language, literacy and neurolinguistic differences. Brain 133, 2185–2195
5 Harold, D. et al. (2006) Further evidence that the KIAA0319 gene confers susceptibility to developmental dyslexia. Mol. Psychiatry 11, 1085–1091
6 Haydar, T.F. (2005) Advanced microscopic imaging methods to investigate cortical development and the etiology of mental retardation. Ment. Retard. Dev. Disabil. Res. Rev. 11, 303–316
7 Levitt, P. et al. (2004) Regulation of neocortical interneuron development and the implications for neurodevelopmental disorders. Trends Neurosci. 27, 400–406
8 Casanova, M.F. et al. (2002) Minicolumnar pathology in autism. Neurology 58, 428–432
Corresponding author: Seidenberg, M.S. ([email protected]).
1364-6613/$ – see front matter © 2010 Published by Elsevier Ltd. doi:10.1016/j.tics.2010.10.003 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Opinion
A temporal sampling framework for developmental dyslexia
Usha Goswami
Centre for Neuroscience in Education, University of Cambridge, Downing St, Cambridge, UK, CB2 3EB
Neural coding by brain oscillations is a major focus in neuroscience, with important implications for dyslexia research. Here, I argue that an oscillatory ‘temporal sampling’ framework enables diverse data from developmental dyslexia to be drawn into an integrated theoretical framework. The core deficit in dyslexia is phonological. Temporal sampling of speech by neuroelectric oscillations that encode incoming information at different frequencies could explain the perceptual and phonological difficulties with syllables, rhymes and phonemes found in individuals with dyslexia. A conceptual framework based on oscillations that entrain to sensory input also has implications for other sensory theories of dyslexia, offering opportunities for integrating a diverse and confusing experimental literature.
Dyslexia and auditory neuroscience
Developmental dyslexia affects 7% of children and is defined as a specific learning difficulty affecting reading and spelling that is not due to low intelligence, poor educational opportunities or overt sensory or neurological damage [1]. Across languages, children with dyslexia have poor phonological processing skills, leading to the dominant phonological core deficit [2] model of this heterogeneous disorder. Here, I propose a novel causal framework for developmental dyslexia, the temporal sampling framework (TSF), which has this phonological model as its focus. Temporal coding is an important aspect of information coding in the brain [3,4], and temporal coding via the synchronous activity of oscillating networks of neurons at different frequency bands (e.g. Delta, 1.5–4 Hz; Theta, 4–10 Hz; and Gamma, 30–80 Hz [3]) is crucial in the perceptual processing of speech [5]. For example, stimulus-induced modulation (‘phase locking’) of inherent neural oscillations at specific frequencies is important for syllabic perception (Theta) [5,6] and for prosodic perception (Delta) [7]. 
The acoustic speech signal can be considered as a summation of several frequency bands fluctuating in intensity (amplitude) over time (the ‘amplitude envelope’, AE). Neurally, the auditory system codes amplitude modulation in natural sounds both across different frequency channels and on different timescales [8]. The AE can be analysed in terms of its constituent temporal modulation frequencies. The dominant modulation frequencies are 4–6 Hz, irrespective of the audio frequency band, type of speech or the speaker, reflecting the sequential rate of words and syllables [9]. In auditory cognitive neuroscience, these insights are exploited in multi-time resolution speech-processing models (Box 1) [5,6]. The framework proposed here integrates difficulties in processing the rate of change of amplitude (rise time) at AE onset (found in dyslexia across languages [10–15]) with impaired temporal sampling of input by low-frequency Theta and Delta oscillatory mechanisms. Rise time difficulties suggest impairments in distinguishing the different modulation frequency ranges in speech, perhaps arising
Corresponding author: Goswami, U. ([email protected]).
Glossary
Allophone: acoustically different forms of the same phoneme; for example, the sound corresponding to the letter P in the spoken syllables ‘spin’ and ‘pin’ is acoustically different, the sound in ‘spin’ being more like /b/, but both sounds are treated in English as the phoneme /p/.
Amplitude: volume of sound (intensity).
Amplitude envelope (AE): the summation over time of the intensity fluctuations (amplitude modulations) in the different frequency channels in the speech signal.
Formant: a concentration of acoustic energy within a narrow frequency band in the speech signal.
Formant transition duration: the time taken from the mouth obstruction that forms a consonant to the steady position marking the succeeding vowel (usually rapid, <50 ms).
Magnocellular: visual processing pathway containing neurons with large cell bodies, involved in motion perception (also called the dorsal pathway).
Onset/Rime: phonological units created by dividing any syllable at the vowel (s-eat sw-eet str-eet).
Phase: the fraction of a wave cycle at a certain frequency that has elapsed relative to an arbitrary point. Phase is a circular measure.
Phase coherence: whether phase is temporally correlated at particular time points at a certain oscillatory frequency. If inter-trial phase coherence is high then phase is correlated at these frequencies.
Phase locking: when the phase of an oscillator signal is tied to the phase or timing of a reference signal. Thought to indicate that neural oscillations are being driven by aspect/s of an external stimulus.
Phoneme: a theoretical unit of sound that distinguishes word meanings, such as CAT–HAT, CAT–COT or CAT–CAP. In practice, phonemes usually correspond to several phonetically distinct sounds so a shorthand for these abstract units is that they correspond to the sounds associated with alphabetic letters.
Phoneme awareness: ability to reflect on the sound units in words that correspond to alphabetic letters, usually measured by oral phoneme elision or Spoonerism tasks (Bob Dylan to Dob Bylan).
Phonetic features: aspects of sound production, such as degree of voicing and nasalization, which distinguish sound elements from each other.
Phonology: the sound system of a particular language.
Rhyme awareness: ability to reflect on rhyme units within words, usually measured by tasks such as same-different judgement (sign–wine) or spotting the non-rhyming word in a triple of words (fat pit cat).
Syllable: a unit of speech comprising a vowel sound (nucleus) and usually some consonant sound/s preceding the vowel (onset) and/or following it (coda). The most frequent syllable type across languages is a consonant–vowel (e.g. BA).
Syllable awareness: ability to reflect on the syllabic structure of words, usually measured by syllable-counting tasks or same–different judgement tasks.
Syllable stress: the more prominent syllables in the speech stream, which require greater muscular effort by the speaker and usually have greater amplitude, duration and pitch as well as larger rise times.
Temporal fine structure: the rapid oscillations in the speech envelope over time, which contribute primarily to changes in fundamental frequency (F0), harmonics and formant transition.
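The Glossary entries for phase, phase coherence and phase locking can be made concrete with a small numerical sketch. The code below is an illustration only, not a method taken from the studies cited here: it uses one common way of quantifying inter-trial phase coherence, namely the length of the average unit vector of the phase angles across trials. The function name, the simulated phase values and the trial count are all assumptions made for the example.

```python
import cmath
import math
import random


def inter_trial_phase_coherence(phases):
    """Length of the mean unit vector across trials.

    Returns a value near 1 when phases cluster at one angle and
    a value near 0 when phases are spread around the circle.
    """
    if not phases:
        raise ValueError("need at least one phase value")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical data: the phase (in radians) of one oscillation at one
# time point after stimulus onset, one value per trial.
rng = random.Random(0)
locked = [0.3 + rng.gauss(0.0, 0.2) for _ in range(200)]        # clustered phases
unlocked = [rng.uniform(-math.pi, math.pi) for _ in range(200)]  # scattered phases

itpc_locked = inter_trial_phase_coherence(locked)
itpc_unlocked = inter_trial_phase_coherence(unlocked)
```

Because phase is a circular measure, the phases are averaged as unit vectors rather than as raw angles; averaging raw angles would treat values just below and just above the wrap-around point as far apart.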
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.10.001 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Box 1. The multi-time resolution model of cortical speech processing (MTRM)
Just as visual input is processed on multiple spatial scales, auditory input is processed on multiple temporal scales. The auditory signal can be fractionated on the basis of both frequency and time, and cochlear mechanisms with respect to coding frequency are well documented. However, the ear is more than a frequency analyser, and energy (amplitude) changes over time provide crucial information, particularly for syllabic segmentation. In the auditory cortex, there is spontaneous neural activity at oscillatory frequencies of 3–6 Hz (Theta) and 28–40 Hz (Gamma) [25] and, by MTRM [6], stimulus-induced modulation of these inherent cortical rhythms is important for speech analysis. In MTRM, a right-lateralized ‘theta sampling network’ is preferentially driven by slower temporal rates and codes the lower modulation frequencies in the speech signal [6], enabling temporal integration at the syllable scale. The phase pattern of the Theta band tracks and discriminates spoken sentences, segmenting the incoming speech signal into syllable-sized packets and resetting and sliding to track speech dynamics (e.g. as speech rate varies) [5]. This phase resetting mechanism is thought to be driven by the onsets or edges of sounds (e.g. syllable rise time), and to reflect neural coding of the AE. Other neurofunctional models include low-frequency temporal sampling by Delta oscillators [7]. In MTRM, higher frequency modulations in the signal are coded by a ‘Gamma sampling network’ that is bilateral and enables temporal integration at the phonetic (‘phoneme’) scale. The different temporal integration windows used by the different oscillatory networks yield packets of information at different grain sizes (syllable and phoneme) that would (by hypothesis) be one source of phonological learning and that can be matched during lexical access to stored representations in the mental lexicon. 
Application of this model to dyslexia suggests that impaired syllable-level processing (less efficient Theta phase locking) is accompanied by unimpaired Gamma sampling, resulting in greater weighting of phonetic-feature information in phonological development. Therefore, children with dyslexia might be sensitive to all the phonetic contrasts that are used in human languages as are typically developing infants [31]. Whereas impaired syllable-level processing could explain the impaired development of phonological awareness in developmental dyslexia, enhanced phonetic-level coding could amplify the pervasive difficulty in mapping sounds to letters. For many phonetic continua (e.g. da-ta), there would be more candidate phonemes (e.g. the distinction between /d/, /th/ and /t/) than letters (D/T) [52].
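The notions of an amplitude envelope and its modulation frequencies, on which Box 1 relies, can be illustrated with a toy signal. The sketch below is a minimal example under stated assumptions and does not reproduce the analysis used in any study cited here: a 200 Hz tone whose loudness fluctuates at 5 Hz stands in for speech, the envelope is estimated by simple rectify-and-smooth, and the strength of each candidate modulation frequency is read off with a direct Fourier sum. All names, the sample rate and the signal parameters are hypothetical choices made for illustration.

```python
import math


def amplitude_envelope(signal, window):
    """Rectify the signal, then smooth it with a trailing moving average."""
    rectified = [abs(x) for x in signal]
    envelope = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        envelope.append(running / min(i + 1, window))
    return envelope


def modulation_magnitude(envelope, freq_hz, sample_rate):
    """Magnitude of the mean-removed envelope's Fourier component at freq_hz."""
    mean = sum(envelope) / len(envelope)
    re = im = 0.0
    for n, value in enumerate(envelope):
        angle = 2.0 * math.pi * freq_hz * n / sample_rate
        re += (value - mean) * math.cos(angle)
        im -= (value - mean) * math.sin(angle)
    return math.hypot(re, im) / len(envelope)


# Assumed toy parameters: 2 s of a 200 Hz carrier, amplitude-modulated at 5 Hz.
SAMPLE_RATE = 1000
CARRIER_HZ = 200.0
MODULATION_HZ = 5.0
n_samples = 2 * SAMPLE_RATE

signal = [
    (1.0 + 0.8 * math.sin(2.0 * math.pi * MODULATION_HZ * n / SAMPLE_RATE))
    * math.sin(2.0 * math.pi * CARRIER_HZ * n / SAMPLE_RATE)
    for n in range(n_samples)
]

# A 20-sample (20 ms) window spans several carrier cycles but only a
# fraction of one modulation cycle, so it removes the carrier and keeps
# the slow fluctuation.
envelope = amplitude_envelope(signal, window=20)
spectrum = {f: modulation_magnitude(envelope, f, SAMPLE_RATE) for f in range(1, 21)}
peak_hz = max(spectrum, key=spectrum.get)
```

In this toy case the modulation spectrum peaks at the 5 Hz rate that was built into the signal, which falls within the low-frequency range that Box 1 assigns to the Theta sampling network.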
from inefficient phase locking to these frequency ranges by neuronal oscillations. An important modulation frequency range for speech is 3–10 Hz, where the modulation spectrum peaks, reflecting the underlying syllabic structure of the AE [16]. I argue that an oscillatory perspective can explain why auditory sensory difficulties lead to phonological impairments in dyslexia [and, by extension, specific language impairment (SLI)] and might also enable a systematic approach to integrating other sensory impairments in dyslexia (Table 1). In particular, I propose that a difficulty with slower temporal modulations [in the Theta and Delta range (1.5–10 Hz) [3]] explains difficulties in dyslexia with syllable parsing and perceiving both syllable stress and the phonetic constituents of the syllable. By proposing a specific difficulty with Theta and Delta oscillators, I suggest that auditory rhythmic entrainment is also likely to be impaired. When oscillations entrain to an input rhythm [4], their high excitability phases coincide with events in the stimulus stream, such as syllable onsets in speech. Impairments in auditory entrainment are likely to have consequences for attention and also auditory–visual integration. This novel framework links the sensory and phonological deficits found in dyslexia to recent auditory neuroscience and neurobiological models of speech processing (Box 1).
The temporal sampling framework
Although current phonological models of dyslexia are based on deficits in subsyllabic phonology (e.g. awareness of onset-rimes and phonemes, see Glossary), developmental dyslexia also involves impaired syllabic and prosodic perception [17] (Table 1). A general difficulty in distinguishing different modulation frequency ranges, which particularly affects the slower temporal rate in speech processing and tracking of the AE, would affect the efficiency of syllabic segmentation. 
Rise times are crucial events in the speech signal, as they reflect the patterns of amplitude modulation that facilitate the temporal
Table 1. Cognitive and sensory features of developmental dyslexia (a)
Phonological deficits in dyslexia | Examples of tasks | Explained by TSF?
Phonological awareness [30] | Count syllables: three syllables in oasis; Judge rhyme: fit, cat, pit; Manipulate sounds: Bob Dylan to Dob Bylan | Yes
Phonological memory [30] | Digit span; Non-word repetition | Yes
Rapid phonological output [30] | Rapid automatised naming (RAN): RAN colours and RAN objects; RAN digits and RAN letters | Yes
Syllable stress [22] | Matching multi-syllabic words for stress pattern | Yes
Prosodic perception [17] | DeeDee tasks | Yes
Other reported deficits | Consistent with TSF?
Magnocellular function [35,38,39] | Yes, via AV integration
Noise exclusion [36] | Yes, via impaired modulations
Sluggish attention shifting [37] | Yes, via impaired phase locking
Cerebellar function (balance) [40] | Yes, via Kotz’s cerebellar model [71]
Rhythmic entrainment [19,20] | Yes, via impaired phase locking
a Classically, the core cognitive features of developmental dyslexia cluster around word-level phonology (phonological awareness of syllables, onset-rimes and phonemes), phonological memory for sequences of digits or monosyllabic words, and RAN of digits, letters, colours and objects. However, there are also suprasegmental deficits in the perception of prosody and syllable stress. Moreover, deficits in rhythmic entrainment, coherent motion detection (via the Magnocellular pathway), spatial attention, balance and noise exclusion are also reported. The TSF can explain how the difficulties in phonology arise. The TSF is also consistent with several current sensory theories of dyslexia, if some extra assumptions derived from current work on multisensory integration and the role of low frequency modulations are allowed.
[Figure 1 diagram. Box labels, top to bottom: Acoustic input signal (modelled as envelope and fine structure); Rise time (helps determine modulation rates); ‘Gamma Sampling Network’: HF modulations (bilateral?), leading to segment-level representations (too specific in dyslexia?); ‘Theta Sampling Network’: LF modulations (also Delta?; right lateralised?), leading to syllable-level representations and prosodic structure (deficient in dyslexia); Phonological representations of words; Reading and spelling acquisition.]
Figure 1. The temporal sampling framework. The TSF assumes a specific dyslexic difficulty with slower temporal modulations, as rise-time perception difficulties in dyslexia involve more extended rise times. As slower modulations are preferentially processed by the right hemisphere, the TSF assumes a right-lateralised impairment in Theta and Delta oscillators. The proposed range of low-frequency modulations processed by the Theta oscillators varies across published studies, but is proposed by Poeppel et al. [6] to yield a temporal integration window of 100–300 ms (approximating the syllable rate). The proposed range for high-frequency modulations also varies across studies, but in [6] yields a temporal integration window of 20–50 ms, approximating the phonetic rate. According to the TSF, the proposed temporal integration window for syllabic parsing in the right hemisphere might function atypically in dyslexia, yielding the auditory basis of the associated phonological and language deficits (deficits indicated by blue shading).
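The correspondence between the temporal integration windows in the caption and the oscillatory frequency ranges is simple arithmetic, sketched below. The example assumes one oscillatory cycle per integration window, which is a simplification introduced here rather than a claim of the model, and the second helper assumes an idealised sinusoidal modulation whose rise from trough to peak takes half a cycle. Both function names are hypothetical.

```python
def window_ms_to_hz(window_ms):
    """Rate in Hz if one cycle fills a window of the given length (assumption)."""
    if window_ms <= 0:
        raise ValueError("window must be positive")
    return 1000.0 / window_ms


def sinusoid_rise_time_ms(freq_hz):
    """Trough-to-peak time of an idealised sinusoidal modulation: half a period."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / (2.0 * freq_hz)


# Windows quoted in the caption of Figure 1.
theta_hz = sorted(window_ms_to_hz(w) for w in (100, 300))  # about 3.3 to 10 Hz
gamma_hz = sorted(window_ms_to_hz(w) for w in (20, 50))    # 20 to 50 Hz

# Intervals quoted in the main text: a syllable roughly every 200 ms,
# a stressed syllable roughly every 500 ms.
syllable_rate_hz = window_ms_to_hz(200)  # 5 Hz, within the Theta range
stress_rate_hz = window_ms_to_hz(500)    # 2 Hz, within the Delta range

# Slower modulations imply longer rises under the sinusoidal assumption.
rise_at_2_hz_ms = sinusoid_rise_time_ms(2.0)  # 250 ms
rise_at_5_hz_ms = sinusoid_rise_time_ms(5.0)  # 100 ms
```

Under these assumptions the 100–300 ms window maps onto roughly 3.3–10 Hz and the 20–50 ms window onto 20–50 Hz, which is why more extended rise times are associated with the lower-frequency range.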
segmentation of the acoustic signal into syllables. Rise-time discrimination is impaired in dyslexia in English [10], French [11], Hungarian [12], Spanish [13], Chinese [13] and Finnish [14], with conflicting data only for Greek [15] (where control children performed unusually poorly, however). Rise time is a significant predictor of phonological awareness as measured in these languages [e.g. awareness of rhyme (‘wine-sign’) in English [10], awareness of lexical tone (rising or falling syllable pitch) in Chinese [13]], is almost a significant predictor of phonological awareness in Greek (p = .06 [15]) and predicts novel word learning in English [18]. Furthermore, there is evidence of impaired rhythmic entrainment (tapping to a beat) in dyslexia, particularly at 2 Hz [19,20]. Although syllables occur approximately every 200 ms across languages (within the Theta band of 3–10 Hz), linguistic analyses suggest that stressed syllables occur approximately every 500 ms (2 Hz) [21], implicating also Delta band processing. Impairments in perceiving syllable stress have recently been found in dyslexia [22]. Furthermore, children with dyslexia appear to have difficulties in discriminating more extended rise times [23] (e.g. a syllable such as WA has a more extended rise time than does a syllable such as BA). Extended rise times are related mathematically to lower frequency modulations and slower temporal rates. Children with dyslexia also appear to be impaired relative to controls in perceiving speech
presented as low frequency modulations (<4 Hz) but not as high frequency modulations (22–40 Hz; U. Goswami et al., unpublished data). The TSF therefore adapts the multi-time resolution model of Poeppel and colleagues [6] (MTRM) to a syllabic rather than phonemic perspective on phonological development (Figure 1). According to this adaptation of MTRM, the primary neural deficit in dyslexia should be impaired phase locking by (rightward lateralized) Theta (and possibly Delta) oscillatory networks in auditory cortex (impaired temporal processing could also arise from problems lower in the auditory pathway; however, supporting evidence is not yet available). Theta networks enable temporal integration at the syllable rate [5,6] and Delta networks should be important for perceiving prosody [7] (strong versus weak syllabic beats [24]). Impaired Theta mechanisms would also have consequences for phoneme perception. Impaired phase locking by Theta generators might hamper the integration between different acoustic features contributing to the perception of the same phoneme. Alternatively, impaired Theta mechanisms could lead developmentally to a phonological system that is weighted towards the information coded bilaterally by Gamma oscillations, which are analysed by MTRM independently and then bound perceptually with the output from Theta oscillators. Accordingly, phoneme perception could be different in dyslexia. Dyslexic difficulties with rise time could be a neural marker for the postulated dyslexic difficulty in distinguishing different modulation frequency ranges. If difficulties with lower frequency modulations arise from impaired phase locking by Theta oscillatory networks in the right hemisphere [5,6,7,25,26], this could throw light on atypical right hemisphere activity found in both developmental dyslexia and SLI [27,28].
A developmental perspective
As both dyslexia and SLI are developmental disorders of learning, the phonological deficits in these disorders, according to the TSF, must arise because basic auditory processing is atypical from birth. Indeed, human infants show syllabic sensitivity as neonates [29], using rhythmic cues to segment syllables and words from the acoustic signal to build a lexicon of spoken word forms. Deficiencies in processing low-frequency modulations in infancy would reduce rhythmic sensitivity and impair phonological development. Syllable awareness is also primary in early childhood, as phonological awareness (the ability to recognise and manipulate phonological units in words) follows a developmental sequence across languages, from syllable to onset-rime to (once reading is taught) phoneme [30]. Audiovisual learning in infancy also supports a syllabic focus [31] as, similar to syllabic rhythms, natural mandibular cycles occur at 4 Hz [32]. Hence, low-frequency information in visual input supports syllable perception [26]. Impaired processing of low-frequency modulations in infancy would affect auditory–visual integration (‘speech reading’), further affecting phonological development. The experience of making sounds (talking) is also important for the quality of the phonological representations developed by the child. This would implicate motor processes [33] and, by the MTRM, an important developmental role for right superior temporal gyrus (STG, thought to be crucial in prosodic analysis) [6]. 
Interestingly, a recent report implicates right STG as a key neural structure related to the genetics of developmental dyslexia (Black et al., unpublished data), and a study of two- and three-year-olds at genetic risk for dyslexia showed that those children who later had reading difficulties also had speech-timing difficulties, producing significantly fewer syllables per second (4.8 at age 3 compared to 7.1 for non-risk children) and pausing for longer between articulations [34]. Furthermore, the spontaneous oscillatory neural activity at both Theta and Gamma frequencies demonstrated in auditory cortex by Giraud and colleagues (Box 1) is correlated with spontaneous activity in visual and premotor regions [25]. It is thus possible that inefficient phase locking in auditory cortex has associated effects on the development of visual and motor processing and could also be the source of some of the observed visual, motor and attentional difficulties in developmental dyslexia [35–40]. For example, according to dynamic attending theory [41], stimulus discrimination is enhanced when an auditory event is anticipated at a regular and predictable rhythmic rate, narrowing the window of attention. As rhythmic entrainment is impaired in dyslexia (and, by hypothesis, is governed by neural phase locking), difficulties in forming an internal representation of rhythmic timing might have knock-on effects
Trends in Cognitive Sciences January 2011, Vol. 15, No. 1
for the development of visual attention [38], auditory–visual integration, sluggish attention shifting [37] and cerebellar function [40].

Predictions from the TSF
The TSF makes several novel predictions about sensory, cognitive and behavioural deficits in dyslexia, some of which have previously been explored and some of which can be evaluated using data from other research perspectives [22]. In particular, the postulated rise-time deficits can explain a host of seemingly disparate perceptual and linguistic deficits. For example, rise time is a crucial cue to the perception of syllable stress and also to rhythmic timing [42]; therefore, both should be impaired in dyslexia. In speech, rhythm depends on the motor constraints inherent in producing syllables, with long and short, or stressed and unstressed, syllables following each other (metrical structure: alternating strong and weak beats). These intensity fluctuations vary in rise time, hence rise-time deficits might also cause difficulties in perceiving metrical structure. Here, I discuss recent evidence for prosodic, syllable stress and metrical perceptual deficits in dyslexia, as well as evidence of difficulties in rhythmic entrainment at syllable-relevant rates. I also mention preliminary evidence for impaired neural phase locking.

Language development: prosody and syllable stress
With respect to prosodic perception, dyslexic impairment is found in reiterative speech tasks, in which strong syllables are replaced by the syllable ‘DEE’ and weak syllables by ‘dee’ (hence Casablanca becomes DEEdeeDEEdee; Kitzen, unpublished data). In a version for children [17], significant dyslexic impairments were found in recognising pictures of famous people spoken in ‘DeeDees’ (Harry Potter as DEEdeeDEEdee). Individual differences were predicted by rise-time discrimination. The perception of syllable stress has also been measured directly [22].
Adults with dyslexia were asked to judge whether two four-syllable words were stressed in the same way (e.g. maternity, ridiculous). They showed significant impairments, and individual differences were predicted by rise-time discrimination rather than by other auditory measures (frequency or intensity discrimination). The imitation of weak–strong or strong–weak syllable stress is also impaired in children with dyslexia (Huss and Goswami, unpublished data).

Rhythmic timing: auditory entrainment and musical meter
With respect to rhythmic timing, Wolff has long argued for a difficulty in dyslexia in finger-tapping tasks [43]. In a recent study, it was demonstrated that keeping time with a metronome beat by finger tapping (rhythmic entrainment) was impaired at syllable-relevant rates in developmental dyslexia (particularly 500 ms) [19,20]. Both adults and children with dyslexia were impaired at keeping time with a beat at 2 Hz (500 ms); children were also impaired at 2.5 Hz (400 ms) but not at 1.5 Hz (666 ms) [19]. By contrast, adults were impaired compared with controls at 1.5 Hz but not at 2.5 Hz [20]. For both children and adults, individual differences in rhythmic entrainment
were predicted by rise-time discrimination. A metrical musical task has also been developed in which short ‘tunes’ are presented to children, each tune comprising three repetitions of phrases of two to five notes, with downbeat on the first, second or third note [44]. For half of the trials, the meter was disrupted by making the note carrying the downbeat slightly longer in a second repetition of the tune. Children with dyslexia were impaired in detecting whether metrical structure was the same or different, and rise-time discrimination, rather than duration discrimination, predicted individual differences.

Phase locking and low-frequency modulations
Regarding sensory and neural deficits, the TSF predicts reduced perceptual sensitivity to amplitude modulation and to frequency modulation at lower rates, and possibly (via MTRM) atypical right-hemisphere processing of low-frequency modulations [5–7]. Envelope-following difficulties dependent on impaired phase locking should also be found. Temporal modulation transfer functions to AM noise are indeed impaired in both adults [45] and children [46] with dyslexia, with particular insensitivity for children at 4 Hz (the lowest frequency measured in adults was 10 Hz). Frequency modulation detection is also impaired in dyslexic children at slower rates [39] (2 Hz but not 240 Hz) and, in typically developing children, the threshold for detecting 2-Hz FM predicts reading and spelling skills [47]. EEG recordings with adults with dyslexia have shown significantly smaller amplitude modulation following auditory evoked responses compared with controls [48]. For children, a recent EEG study assessed phase locking to speech by cross-correlating the response in temporal electrodes with the broadband envelope of a sentence [49]. Impaired phase locking was found in poor readers. The timing of phase locking for each hemisphere also differed by reading skill.
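The idea of cross-correlating a recorded response with a speech envelope can be illustrated with a toy calculation. The sketch below is not the analysis pipeline of the cited EEG study; the sampling rate, the synthetic envelope, the 80 ms delay and all function names are assumptions chosen purely for illustration. It correlates a synthetic envelope with a delayed copy of itself at a range of lags and reports the lag at which the two are best aligned.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(envelope, response, max_lag):
    """Correlate envelope[t] with response[t + lag] for lag = 0..max_lag samples."""
    n = len(envelope)
    return {lag: pearson(envelope[: n - lag], response[lag:])
            for lag in range(max_lag + 1)}


# Synthetic data (assumed values): 2 s sampled at 100 Hz, with a 4 Hz
# "syllable-rate" modulation plus a slower 1.3 Hz component.
FS = 100
envelope = [1.0
            + 0.6 * math.sin(2 * math.pi * 4.0 * t / FS)
            + 0.3 * math.sin(2 * math.pi * 1.3 * t / FS)
            for t in range(2 * FS)]

# Toy "neural response": the envelope delayed by 8 samples (80 ms).
DELAY = 8
response = [0.0] * DELAY + envelope[:-DELAY]

correlations = lagged_correlations(envelope, response, max_lag=20)
best_lag = max(correlations, key=correlations.get)
print(f"peak correlation {correlations[best_lag]:.3f} at {1000 * best_lag / FS:.0f} ms")
```

In this toy case the peak falls at the built-in 80 ms delay. With real recordings the peak height would index how closely the response follows the envelope, and the peak lag would index response timing, which is the kind of measure that could differ between hemispheres.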
Whereas typically developing readers had earlier right-hemisphere responses, poor readers had earlier left-hemisphere responses, and cortical asymmetry predicted 50% of unique variance in phonological skills. In studies where intelligibility of syllables presented as envelope speech has been assessed, children with dyslexia show clear impairments [50]. These findings are all consistent with the TSF. Finally, in a recent EEG study exploring auditory rhythmic entrainment in dyslexia (Soltész et al., unpublished data), atypical phase locking was found at lower frequencies (0.5–4 Hz, within Delta and Theta bands) to a 2-Hz rhythmic stimulus stream in dyslexia. This implicates Delta and Theta mechanisms in dyslexia; in addition, whereas Theta oscillations track syllable-level information [5], Delta oscillations have been proposed to encode metric foot and phrase-level information [7], also areas of linguistic difficulty in dyslexia [51].

Phonetic representation: different in dyslexia?
One attractive and provocative feature of the TSF is that it places perceptual deficits at the level of the syllable rather than of the phoneme. For phonemes, the closest acoustic correlates are phonetic features, and children with dyslexia can be more sensitive to allophonic variation (phonetic variation within phoneme categories) than are typically developing children [52]. Atypical phonetic-level perception would have serious consequences for establishing grapheme–phoneme correspondences [52]. Consistent with this view, neural integration between letters and sounds (e.g. indexed by neural activation when letters and sounds mismatch, as in E and /a/) is reduced in dyslexia [53]. Making a phonetic discrimination between the syllables BA and WA on the basis of formant transition duration is significantly enhanced in dyslexia [23]; discriminating BA from WA on the basis of rise time is significantly impaired in the same children [23]. The TSF proposes that fast rate processing (Gamma sampling of the speech signal) is preserved in dyslexia, possibly resulting in greater weight being accorded to sensory feature-level cues and enhancing perception of formant transitions (Figure 1). There are also relevant data from auditory brainstem studies. When a syllable is spoken, certain harmonics are boosted depending on the vowel. The brainstem frequency following response (FFR) reflects neural phase locking to these harmonics, and brainstem timing is poorer in poor readers [54]. Additionally, the FFR to repetition of the same syllable (DA) versus variants of the syllable created by varying different features [e.g. DA with high pitch; DA with dipping pitch; long DA; and different voice onset time (hence TA)] differs between typically developing and dyslexic readers [55]. Typical readers show larger amplitudes to repetition of the same syllable in the formant transition region of the FFR (7–60 ms). By contrast, dyslexic readers show larger FFR amplitudes to the variants of the syllable over this range, suggestive of increased sensitivity to fine-grained (allophonic) information.

The temporal sampling framework and other sensory theories
As noted earlier, there are many sensory deficits in dyslexia (see Table 1 for examples). Theories of attention difficulties in dyslexia [37] fit the TSF, as attention is enhanced when stimuli arrive in phase with neural oscillations [26].
Impaired phase locking in dyslexia could explain the atypical visual and auditory cueing effects that underpin sluggish attention-shifting theory [37]. Theories based on magnocellular dysfunction [35] have suffered from inconsistent data [56]. Researchers now propose either broader dorsal stream deficits affecting visuo-spatial attention [38], or difficulties in the detection of dynamic visual (and auditory) events [39]. Regarding the former, deficits that are observed in visual crowding (when visual features such as letters are jumbled together perceptually) are insufficient to explain the degree of reading impairment shown by children with dyslexia [57]. Regarding the latter, difficulties in dynamic sensory sensitivity are consistent with the TSF if the neural coding of lower frequency information (perhaps spatial as well as temporal) is primarily impaired. However, neither visual dorsal stream deficits nor attention deficits are specific to developmental dyslexia, weakening recent claims that they represent the primary impairment [38]. Dorsal stream deficits occur in many other developmental learning difficulties, including autism [58] (where recoding print to sound can be enhanced, as in hyperlexia) and dyscalculia [59] (where reading can be normal), as do attention deficits (notably attention deficit hyperactivity disorder, where reading can
be intact [60]). By contrast, rise-time deficits are found wherever there are phonological difficulties (e.g. in SLI [61,62]), and are also found in compensated adult dyslexics [63], suggesting a specific sensory deficit that does not resolve itself over developmental time. Worse sensory function in the presence of noise is also consistent [36,50] (noise will affect temporal modulations). It was recently proposed that children with dyslexia have difficulties in forming perceptual anchors [64] (i.e. in benefitting from repetition of the same sensory reference stimulus, the anchor, across trials). However, such a broad deficit would have implications for cognitive development outside the language and/or reading sphere and has failed to be supported by specific tests [65]. Similarly, rapid auditory processing (RAP) deficit theory [66] (which proposed specific difficulties in processing brief, rapidly successive acoustic changes) has also failed to be supported by several studies [67]. RAP theory proposed that impaired discrimination of formant transitions affected phonetic perception. However, stretching the formant transitions in syllables (or stretching the syllables in time) does not improve syllable perception in dyslexia, a necessary corollary of RAP [68]. Furthermore, Fast ForWord®, an auditory training programme based on slowing down speech by 50%, also amplified the amplitude modulations between 3 and 30 Hz in the narrowband filtered signal [68]. Therefore, the amplitude modulation amplification might be more potent in contributing to training effects. Finally, several of these theories have been proposed as alternatives to the phonological core deficit model, treating the pervasive and well-documented phonological difficulties found across languages as incidental rather than causal in dyslexia [38]. An advantage of the TSF is that it places impaired phonology at the heart of this specific learning difficulty.

Concluding remarks
The TSF proposes a specific deficit in dyslexia with low-frequency phase locking mechanisms in auditory cortex, which is argued to have an impact on phonological development. The proposed auditory phase locking deficit might also have implications for the efficient functioning of other sensory systems. Being able to define the core neural deficit(s) underlying dyslexia will improve the efficacy of educational interventions. The TSF suggests a novel focus on the syllable in educational interventions, incorporating direct tuition about speech prosody. Interventions based on rhythm and music might also offer benefits for children with developmental dyslexia, as the dyslexic brain is ‘in tune but out of time’ [62]. Perceiving melody is not dependent on rhythm, as demonstrated in a model of acoustic analysis [69] derived from amusic (tone-deaf) patients (Figure 2). The acoustic parameters that are preserved in amusia are impaired in dyslexia, whereas sensitivity to lower frequency modulations is preserved in amusia [70] and impaired in dyslexia. As rhythm and meter are more overt in music than in language, it might be that remediation based on music and rhythm (ideally multi-modal), such as matching syllable patterning to metrical structure in music (singing), and playing instruments or moving in time with rhythms or rhythmic language (e.g. metrical poetry), will impact phonology and language development, for example via subcortical structures such as the cerebellum [71]. Traditional educational practices, such as learning metrical poetry and singing nursery rhymes, might also entrain the Theta and Delta oscillatory networks that are (by hypothesis) impaired in dyslexia. Such interventions could begin very young, long before literacy tuition [72]. Furthermore, exploring how neuronal oscillations code key sensory parameters might be of utility for educational neuroscience beyond dyslexia; for example, in explaining the co-morbidities that are characteristic of developmental disorders of learning [73] (see also Box 2 for a list of outstanding questions).

[Figure 2: diagram not reproduced. Box labels recoverable from the diagram: Acoustic analysis; Pitch organisation; Temporal organisation; Contour analysis; Interval analysis; Tonal encoding; Rhythm analysis; Meter analysis; Acoustic-to-phonological conversion; Musical lexicon; Phonological lexicon; Vocal plan formation; Singing; Tapping; Speaking; Reading.]

Figure 2. Components of music and language processing. A model of how acoustic analysis of pitch versus rhythm might contribute to language versus musical processing, based on the modular framework of Peretz and Coltheart [69]. The skills preserved in amusia yet impaired in dyslexia are shown in blue. The skills impaired in amusia are shown in yellow. Although developmental disorders of learning are not modular, the framework is useful for supporting the view that musical remediation of dyslexia via training in rhythm and meter might improve phonological development and processing of low-frequency modulations.

Box 2. Questions for future research
The TSF makes several predictions about acoustic processing in dyslexia that can be tested empirically:
• If the key integration windows in speech do correspond to the time frames of Gamma, Theta and Delta oscillations, is it possible to show that the dyslexic brain is unimpaired at sampling auditory signals at faster oscillatory rates such as Gamma, while simultaneously showing impairments at slower oscillatory rates such as Theta and Delta?
• Is it possible to find empirical evidence that perception of temporal fine structure is unimpaired in dyslexia, complemented by evidence that envelope perception is impaired, using the same stimuli (e.g. auditory chimera)?
• Is there a consistent right-hemisphere impairment in key language (temporal) areas, such as STS and STG, in dyslexia and, if so, how does the left hemisphere process stimuli for which the right hemisphere shows processing inefficiencies? Is there left-hemisphere compensation or atypical processing in both hemispheres?
• Is it possible to devise experimental techniques to investigate whether children with dyslexia perceive all the phonetic contrasts in human languages, as young infants do? Is it possible to predict the phonetic contrasts that are likely to be most affected in dyslexic perception based on current evidence for the role of low-frequency modulations in phoneme recognition?
• Is it possible to make predictions about which aspects of language processing should be preserved in dyslexia? For example, as TSF does not accord a key role to frequency perception, is perception of emotional prosody (which relies on F0) preserved whereas perception of intonational patterning (linked to rise time) is impaired?

Acknowledgements
I thank David Poeppel, Steven Greenberg and Ian Winter for many helpful discussions during the development of this framework, and Vicky Leong, Denes Szücs and Martina Huss for their comments. U.G. is supported by a Major Research Fellowship from the Leverhulme Trust and funding from the Medical Research Council (G0400574).
References
1 Goswami, U. (2008) Foresight Mental Capital and Wellbeing Project. Learning Difficulties: Future Challenges, Government Office for Science
2 Stanovich, K.E. (1988) Explaining the differences between the dyslexic and the garden-variety poor reader: the phonological-core variable-difference model. J. Learn. Disabil. 21, 590–604
3 Buzsaki, G. and Draguhn, A. (2004) Neuronal oscillations in cortical networks. Science 304, 1926–1929
4 Schroeder, C.E. et al. (2008) Neuronal oscillations and visual amplification of speech. Trends Cogn. Sci. 12, 106–113
5 Luo, H. and Poeppel, D. (2007) Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex. Neuron 54, 1001–1010
6 Poeppel, D. et al. (2008) Speech perception at the interface of neurobiology and linguistics. Philos. Trans. R. Soc. Lond. B Biol. Sci. 363, 1071–1086
7 Ghitza, O. and Greenberg, S. (2009) On the possible role of brain rhythms in speech perception: Intelligibility of time-compressed speech with periodic and aperiodic insertions of silence. Phonetica 66, 113–126
8 Joris, P.X. et al. (2004) Neural processing of amplitude-modulated sounds. Physiol. Rev. 84, 541–577
9 Drullman, R. (2006) The significance of temporal modulation frequencies for speech intelligibility. In Listening to Speech: An Auditory Perspective (Greenberg, S. and Ainsworth, W.A., eds), pp. 39–47, Lawrence Erlbaum Associates
10 Goswami, U. et al. (2002) Amplitude envelope onsets and developmental dyslexia: a new hypothesis. Proc. Natl. Acad. Sci. U. S. A. 99, 10911–10916
11 Muneaux, M. et al. (2004) Deficits in beat perception and dyslexia: evidence from French. Neuroreport 15, 1255–1259
12 Surányi, Z. et al. (2009) Sensitivity to rhythmic parameters in dyslexic children: a comparison of Hungarian and English. Read. Writ. 22, 41–56
13 Goswami, U. et al. (2010) Language-universal deficits in developmental dyslexia: English, Spanish and Chinese. J. Cogn. Neurosci. DOI: 10.1162/jocn.2010.21453
14 Hämäläinen, J. et al. (2005) Detection of sound rise time by adults with dyslexia. Brain Lang. 94, 32–42
15 Georgiou, G.K. et al. (2010) Auditory temporal processing and dyslexia in an orthographically consistent language. Cortex DOI: 10.1016/j.cortex.2010.06.006
16 Drullman, R. et al. (1994) Effect of temporal envelope smearing on speech perception. J. Acoust. Soc. Am. 95, 1053–1064
17 Goswami, U. et al. (2010) Amplitude envelope perception, phonology and prosodic sensitivity in children with developmental dyslexia. Read. Writ. 23, 995–1019
18 Thomson, J.M. and Goswami, U. (2010) Learning novel phonological representations in developmental dyslexia: associations with basic auditory processing of rise time and phonological awareness. Read. Writ. 23, 453–469
19 Thomson, J.M. and Goswami, U. (2008) Rhythmic processing in children with developmental dyslexia: auditory and motor rhythms link to reading and spelling. J. Physiol. 102, 120–129
20 Thomson, J.M. et al. (2006) Auditory and motor rhythm awareness in adults with dyslexia. J. Res. Read. 29, 334–348
21 Arvaniti, A. (2009) Rhythm, timing and the timing of rhythm. Phonetica 66, 46–63
22 Leong, V. et al. (2010) Amplitude envelope perception and sensitivity to prosodic stress in developmental dyslexia. J. Mem. Lang. DOI: 10.1016/j.jml.2010.09.003
23 Goswami, U. et al. (2010) Rise time and formant transition duration in the discrimination of speech sounds: the Ba-Wa distinction in developmental dyslexia. Dev. Sci. DOI: 10.1111/j.1467-7687.2010.00955.x
24 Cutler, A. (2005) Lexical stress. In The Handbook of Speech Perception (Pisoni, D.B. and Remez, R.E., eds), pp. 264–289, Blackwell
25 Giraud, A.L. et al. (2007) Endogenous cortical rhythms determine cerebral specialization for speech perception and production. Neuron 56, 1127–1134
26 Luo, H. et al. (2010) Auditory cortex tracks both auditory and visual stimulus dynamics using low-frequency neuronal phase modulation. PLoS Biol. 8, e1000445
27 Gauger, L.M. et al. (1997) Brain morphology in children with SLI. J. Speech Lang. Hear. Res. 40, 1272–1284
28 Heim, S. et al. (2003) Altered hemispheric asymmetry of auditory P100m in dyslexia. Eur. J. Neurosci. 17, 1715–1722
29 Mehler, J. et al. (1988) A precursor of language acquisition in young infants. Cognition 29, 143–178
30 Ziegler, J.C. and Goswami, U. (2005) Reading acquisition, developmental dyslexia, and skilled reading across languages: a psycholinguistic grain size theory. Psychol. Bull. 131, 3–29
31 Kuhl, P.K. (2004) Early language acquisition: cracking the speech code. Nat. Rev. Neurosci. 5, 831–843
32 Chandrasekaran, C. et al. (2009) The natural statistics of audiovisual speech. PLoS Comput. Biol. 5, DOI: 10.1371/journal.pcbi.1000436
33 Devlin, J.T. and Aydelott, J. (2009) Speech perception: motoric contributions versus the motor theory. Curr. Biol. 19, R198–R200
34 Smith, A.B. et al. (2008) A longitudinal study of speech timing in young children later found to have reading disability. J. Speech Lang. Hear. Res. 51, 1300–1314
35 Stein, J. and Walsh, V. (1997) To see but not to read: the magnocellular theory of dyslexia. Trends Neurosci. 20, 147–152
36 Sperling, A.J. et al. (2005) Deficits in perceptual noise exclusion in developmental dyslexia. Nat. Neurosci. 8, 862–863
37 Facoetti, A. et al. (2010) Multisensory spatial attention deficits are predictive of phonological decoding skills in developmental dyslexia. J. Cogn. Neurosci. 22, 1011–1025
38 Vidyasagar, T.R. and Pammer, K. (2010) Dyslexia: a deficit in visuospatial attention, not in phonological processing. Trends Cogn. Sci. 14, 57–63
39 Witton, C. et al. (1998) Sensitivity to dynamic auditory and visual stimuli predicts nonword reading ability in both dyslexic and normal readers. Curr. Biol. 8, 791–797
40 Nicolson, R.I. et al. (2001) Developmental dyslexia: the cerebellar deficit hypothesis. Trends Neurosci. 24, 508–511
41 Jones, M.R. et al. (2002) Temporal aspects of stimulus-driven attending in dynamic arrays. Psychol. Sci. 13, 313–319
42 Scott, S.K. (1998) The point of P-centres. Psychol. Res. 61, 4–11
43 Wolff, P.H. (2002) Timing precision and rhythm in developmental dyslexia. Read. Writ. 15, 179–206
44 Huss, M. et al. (2010) Music, rhythm, rise time perception and developmental dyslexia: perception of musical meter predicts reading and phonology. Cortex DOI: 10.1016/j.cortex.2010.07.010
45 Menell, P. et al. (1999) Psychophysical sensitivity and physiological response to amplitude modulation in adult dyslexic listeners. J. Speech Lang. Hear. Res. 42, 797–803
46 Lorenzi, C. et al. (2000) Use of temporal envelope cues by children with developmental dyslexia. J. Speech Lang. Hear. Res. 43, 1367–1379
47 Talcott, J.B. et al. (2000) Dynamic sensory sensitivity and children’s word decoding skills. Proc. Natl. Acad. Sci. U. S. A. 97, 2952–2957
48 McAnally, K.I. and Stein, J.F. (1997) Scalp potentials evoked by amplitude modulated tones in dyslexia. J. Speech Lang. Hear. Res. 40, 939–945
49 Abrams, D.A. et al. (2009) Abnormal cortical processing of the syllable rate of speech in poor readers. J. Neurosci. 29, 7686–7693
50 Ziegler, J.C. et al. (2009) Speech-perception-in-noise deficits in dyslexia. Dev. Sci. 12, 732–745
51 Penolazzi, B. et al. (2008) Delta EEG as a marker of dysfunctional linguistic processing in developmental dyslexia. Psychophysiology 45, 1025–1033
52 Bogliotti, C. et al. (2008) Discrimination of speech sounds by children with dyslexia. J. Exp. Child Psychol. 101, 137–155
53 Blau, V. et al. (2009) Reduced neural integration of letters and speech sounds links phonological and reading deficits in adult dyslexia. Curr. Biol. 19, 503–508
54 Banai, K. et al. (2009) Reading and subcortical auditory function. Cereb. Cortex 19, 2699–2707
55 Chandrasekaran, B. et al. (2009) Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: implications for developmental dyslexia. Neuron 64, 311–319
56 Skottun, B.C. (2000) The magnocellular deficit theory of dyslexia: the evidence from contrast sensitivity. Vision Res. 40, 111–127
57 Pelli, D.G. and Tillman, K.A. (2008) The uncrowded window of object recognition. Nat. Neurosci. 11, 1129–1135
58 Braddick, O. et al. (2003) Normal and anomalous development of visual motion processing: motion coherence and ‘dorsal stream vulnerability’. Neuropsychologia 41, 1769–1784
59 Sigmundsson, H. et al. (2010) Are poor mathematics skills associated with visual deficits in temporal processing? Neurosci. Lett. 469, 248–250
60 Raberger, T. and Wimmer, H. (2003) On the automaticity/cerebellar deficit hypothesis of dyslexia: balancing and continuous rapid naming in dyslexic and ADHD children. Neuropsychologia 41, 1493–1497
61 Corriveau, K. et al. (2007) Basic auditory processing skills and specific language impairment: a new look at an old hypothesis. J. Speech Lang. Hear. Res. 50, 1–20
62 Corriveau, K. and Goswami, U. (2009) Rhythmic motor entrainment in children with speech and language impairment: tapping to the beat. Cortex 45, 119–130
63 Pasquini, E. et al. (2007) Auditory processing of amplitude envelope rise time in adults diagnosed with developmental dyslexia. Sci. Stud. Read. 11, 259–286
64 Ahissar, M. (2007) Dyslexia and the anchoring-deficit hypothesis. Trends Cogn. Sci. 11, 458–465
65 Ziegler, J.C. (2008) Better to lose the anchor than the whole ship. Trends Cogn. Sci. 12, 244–245
66 Tallal, P. (2004) Opinion – Improving language and literacy is a matter of time. Nat. Rev. Neurosci. 5, 721–728
67 McArthur, G.M. and Bishop, D.V.M. (2001) Auditory perceptual processing in people with reading and oral language impairments: current issues and recommendations. Dyslexia 7, 150–170
68 McAnally, K.I. et al. (1997) Effect of time and frequency manipulation on syllable perception in developmental dyslexics. J. Speech Lang. Hear. Res. 40, 912–924
69 Peretz, I. and Coltheart, M. (2003) Modularity of music processing. Nat. Neurosci. 6, 688–691
70 Griffiths, T.D. et al. (1997) Spatial and temporal auditory processing deficits following right hemisphere infarction: a psychophysical study. Brain 120, 785–794
71 Kotz, S.A. and Schwartze, M. (2010) Cortical speech processing unplugged: a timely subcortico-cortical framework. Trends Cogn. Sci. 14, 392–399
72 Trehub, S.E. and Hannon, E.E. (2006) Infant music perception: domain-general or domain-specific mechanisms? Cognition 100, 73–99
73 Goswami, U. and Szücs, D. (2010) Educational neuroscience, developmental mechanisms: towards a conceptual framework. Neuroimage DOI: 10.1016/j.neuroimage.2010.08.072
Review
Mind the gap: bridging economic and naturalistic risk-taking with cognitive neuroscience
Tom Schonberg1,2, Craig R. Fox2,3 and Russell A. Poldrack1,4,5
1 Imaging Research Center, University of Texas at Austin, Austin, TX 78759, USA
2 Department of Psychology, University of California Los Angeles, Los Angeles, CA 90095, USA
3 Anderson School of Management, University of California Los Angeles, Los Angeles, CA 90095, USA
4 Department of Psychology, University of Texas at Austin, Austin, TX 78712, USA
5 Section of Neurobiology, University of Texas at Austin, Austin, TX 78712, USA
Economists define risk in terms of the variability of possible outcomes, whereas clinicians and laypeople generally view risk as exposure to possible loss or harm. Neuroeconomic studies using relatively simple behavioral tasks have identified a network of brain regions that respond to economic risk, but these studies have had limited success predicting naturalistic risk-taking. By contrast, more complex behavioral tasks developed by clinicians (e.g. Balloon Analogue Risk Task and Iowa Gambling Task) correlate with naturalistic risk-taking but resist decomposition into distinct cognitive constructs. We propose here that to bridge this gap and better understand neural substrates of naturalistic risk-taking, new tasks are needed that: are decomposable into basic cognitive and/or economic constructs; predict naturalistic risk-taking; and engender dynamic, affective engagement.

Defining risk
When economists and clinical psychologists characterize behavior as ‘risky’, they use the same word but mean different things. Risk in the economics and finance literatures (e.g. [1]) is usually defined in terms of the variance of possible monetary outcomes, and risk seeking is defined as a preference for a higher variance payoff, holding expected value (EV) constant. By contrast, when clinicians and laypeople identify behaviors as risky (e.g. drug use, unprotected sex, or mountain climbing) they invoke a broader meaning of the term. Clinicians typically define risky behavior as behavior that can harm oneself or others [2]. Interviews with experienced managers suggest that they also tend to see risk in terms of possible negative outcomes, rather than conceiving it in terms of chance probabilities or some quantifiable construct [3].
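The economic definition of risk can be made concrete with a small numerical sketch. The two gambles below are hypothetical and are not taken from any cited study; the function names are likewise only illustrative. Both gambles have the same expected value, but the second has a higher variance, so a preference for it would count as risk seeking in the economic sense.

```python
def expected_value(gamble):
    """Expected value of a gamble given as (probability, payoff) pairs."""
    return sum(p * x for p, x in gamble)


def variance(gamble):
    """Variance of the payoffs around the gamble's expected value."""
    ev = expected_value(gamble)
    return sum(p * (x - ev) ** 2 for p, x in gamble)


# Hypothetical gambles, chosen only for illustration.
safe_gamble = [(0.5, 40.0), (0.5, 60.0)]    # win 40 or 60 with equal chance
risky_gamble = [(0.5, 0.0), (0.5, 100.0)]   # win 0 or 100 with equal chance

for name, gamble in [("safe", safe_gamble), ("risky", risky_gamble)]:
    print(name, "EV =", expected_value(gamble), "variance =", variance(gamble))
```

Both gambles have an expected value of 50, whereas their variances are 100 and 2500. Note that neither gamble involves a possible loss, which is why the economic and the clinical or lay senses of ‘risky’ can come apart.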
Psychometric studies have found that the lay conception of riskiness encompasses a ‘dread’ dimension that is characterized by lack of control and/or potential catastrophic consequences, and an ‘unknown’ dimension that is characterized by unobservable, unfamiliar, and/or delayed consequences [4]. This gap in definitions is reflected in distinct approaches to studying risk. Neuroeconomics is a field aimed at understanding the neural basis of decision-making by drawing on models from behavioral economics and methods used in cognitive neuroscience [5]. The bulk of the neuroeconomics literature has focused (with substantial success) on disentangling the role of specific brain regions in coding economic variables implicated in traditional expectation-based models of risk-taking (Box 1), or mean–variance models of risk-taking used in financial decision theories (Box 2). However, economic paradigms have had limited success in predicting individual differences in naturalistic risk-taking, even in the monetary domain. Meanwhile, clinical psychologists and clinical neuroscientists have advanced behavioral paradigms that better predict real-world risk-taking behaviors and resonate more closely with the lay conception of risk. However, they cannot readily be decomposed to identify separate underlying cognitive and neural mechanisms involved in naturalistic risk-taking. In this review, we propose a research approach that combines the conceptual rigor of neuroeconomics with the predictive validity of clinical neuroscience, thus bridging these disciplines. We believe that such an approach will yield a better understanding of the neural mechanisms involved in risky decision making in both healthy and clinical populations.

Corresponding author: Poldrack, R.A. ([email protected]).

Neuroeconomics of risk perception and risk-taking
Since Knight [6], economists have distinguished decision under risk, in which the decision maker knows the objective probability distribution over possible outcomes, from decision under uncertainty, in which this information is assessed with some degree of vagueness (Box 3). Early neuroimaging studies of risk relied largely on task paradigms (Tables 1 and 2) that manipulate variance in the probability distribution of reward, enabling the identification of neural responses associated with objective risk defined in economic terms.
This work has identified risk-related responses in several regions, mainly the anterior cingulate cortex (ACC), lateral orbitofrontal cortex (OFC) and insula, all of which are also responsive to monetary gains and/or losses. The lateral OFC and ACC were implicated in a positron emission tomography (PET) study coding risk in terms of increased variance owing to differences in probabilities of points lost or gained [7]. These regions, as well as the insula, also responded to different levels of risk in
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.10.002 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Review
Trends in Cognitive Sciences January 2011, Vol. 15, No. 1
Box 1. Expectation-based models of risk-taking

Expectation-based models posit that preferences are a function of the magnitudes and probabilities of possible outcomes. Consider a prospect (x, p) that offers $x with probability p (and nothing otherwise). A basic decision rule is to choose the outcome that maximizes expected value (EV; Equation I):

\[ \mathrm{EV} = p\,x. \tag{I} \]

EV maximization implies risk neutrality (e.g. indifference between receiving: (i) $50 for sure, or (ii) a 50% chance to win $100). To accommodate risk aversion, expected utility theory [66] allows the subjective value of money to decrease as wealth increases. This gives rise to a concave utility function, u(.) over states of wealth, W. Decision makers choose the option that maximizes expected utility (EU; Equation II):

\[ \mathrm{EU} = p\,u(x), \tag{II} \]

where u(x) represents the utility of outcome x. For example, a concave utility function [u''(x) < 0] implies that gaining $50 (in addition to one’s current state of wealth) adds more than half the utility of gaining $100 (Figure Ia). Therefore, such a utility function implies that a sure $50 is preferred to a 50% chance of $100. A utility function over states of wealth cannot readily accommodate pronounced risk aversion for gambles involving possible losses [67]; neither can it accommodate the commonly observed fourfold pattern of risk preferences: risk aversion for high-probability gains and low-probability losses, coupled with risk seeking for low-probability gains and high-probability losses. Prospect theory [68,69] accommodates these patterns by proposing that decision makers maximize the value V of a prospect (Equation III):

\[ V(x, p) = w(p)\,v(x), \tag{III} \]

where v(x) measures the subjective value of the consequence x, and w(p) measures the impact of probability p on the attractiveness of the prospect.
A typical value function v(.), displayed in Figure Ib, is characterized by: (i) reference dependence: it is a function of changes in wealth relative to a reference point, such as the status quo; (ii) diminishing sensitivity: it is concave for gains but convex for losses; and (iii) loss aversion: the loss limb is much steeper than the gain limb. Loss aversion accommodates pronounced risk aversion for mixed (gain– loss) gambles; for example, rejection of a gamble that offers a 50% chance of winning $150 and a 50% chance of losing $100. Tom et al. [70] and De Martino et al. [71] identified neural correlates of loss aversion in humans. Diminishing sensitivity explains a general tendency toward risk aversion for gains (as in expected utility theory) but risk seeking for losses. Reference dependence allows risk preferences to differ depending on whether prospects are described (framed) in terms of gains or losses relative to different reference points. De Martino et al. [72] studied framing susceptibility in humans using fMRI. The weighting function w(.), depicted in Figure Ic, captures diminishing sensitivity to probabilities away from natural boundaries of impossibility ( p = 0) and certainty ( p = 1). The weighting function is characterized by: (i) overweighting of probabilities near zero; (ii) underweighting of probabilities otherwise, especially near 1; and (iii) reduced sensitivity to differences between intermediate probabilities. Overweighting low-probability events can supersede the impact of nonlinearities of the value function, leading to risk seeking for low-probability gains (e.g. the attraction of lottery tickets) and risk aversion for low-probability losses (e.g. the attraction of insurance). Underweighting moderate to high probabilities reinforces the impact of nonlinearities of the value function, leading to risk aversion for high-probability gains and risk seeking for high-probability losses. 
The weighting function was recently studied using fMRI by Hsu et al. [73], Paulus and Frank [74] and Berns et al. [75].
Figure I. Representative utility, value and weighting functions. (a) An illustration of how expected utility theory explains risk aversion: utility (u) as a function of increasing wealth (W), starting at an initial level (W0). The utility of gaining $50, u(W0+$50), is more than half the utility of gaining $100, 1/2 u(W0+$100). Thus, according to this function, the individual would rather receive $50 for sure than face a 50% chance of gaining $100 (and nothing otherwise). (b) A representative prospect theory value function depicts subjective value (v) of losing or gaining a particular amount of money relative to the reference point. (c) A representative prospect theory probability weighting function depicts the decision weight (w) as a function of objective probability (p).
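The three decision rules in Box 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the square-root utility, the power-form value function, the one-parameter inverse-S weighting function and the parameter values (`alpha`, `lam`, `gamma`) are all assumptions chosen for illustration and do not come from this article.

```python
import math


def expected_value(x, p):
    """Equation I: EV of a prospect (x, p) paying $x with probability p."""
    return p * x


def expected_utility(x, p, u):
    """Equation II: EU = p * u(x); the 'nothing otherwise' branch adds u(0) = 0."""
    return p * u(x)


def sqrt_utility(x):
    """Hypothetical concave utility for gains (second derivative is negative)."""
    return math.sqrt(x)


def pt_value(x, alpha=0.88, lam=2.25):
    """Assumed value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def pt_weight(p, gamma=0.61):
    """Assumed inverse-S weighting function: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def pt_prospect_value(x, p):
    """Equation III: V(x, p) = w(p) * v(x)."""
    return pt_weight(p) * pt_value(x)
```

Under these assumed forms, a sure $50 has higher utility than a 50% chance of $100 (risk aversion under Equation II), the summed values of winning $150 and losing $100 are negative (loss aversion), and a 1% chance of $100 is valued above a sure $1 while a 90% chance of $100 is valued below a sure $90 (two cells of the fourfold pattern).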
a gambling task, as measured using functional magnetic resonance imaging (fMRI) [8]. The posterior parietal cortex, dorsolateral prefrontal cortex (DLPFC) and anterior insula were found to be more active during a choice of risky versus safe options [9]; in addition, fMRI activity levels in the right insula following a negative outcome were negatively correlated with subsequent risky choices. Similarly, in a study using a financial decision-making paradigm (involving uncertainty and learning), increased activity in anterior insula was associated with subsequent switching by participants from a risky to a safe option [10].
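To make concrete how a task can manipulate outcome variance, the card task of [8] (described in Table 1) can be sketched as follows. The deck of ten cards and the higher/lower bet come from the task description; the payoffs of +1 for a win and -1 for a loss are hypothetical values chosen only for illustration.

```python
def win_probability(first_card, bet_higher, deck_size=10):
    """Chance that the second card, drawn without replacement, wins the bet."""
    higher = deck_size - first_card   # remaining cards above the first card
    lower = first_card - 1            # remaining cards below the first card
    favorable = higher if bet_higher else lower
    return favorable / (deck_size - 1)


def outcome_variance(p_win, win=1.0, loss=-1.0):
    """Variance of a two-outcome gamble (hypothetical payoffs win/loss)."""
    mean = p_win * win + (1 - p_win) * loss
    return p_win * (win - mean) ** 2 + (1 - p_win) * (loss - mean) ** 2


# Risk (variance) as a function of the first card, for a 'higher' bet.
for card in range(1, 11):
    p = win_probability(card, bet_higher=True)
    print(card, round(p, 3), round(outcome_variance(p), 3))
```

In this sketch the variance is zero when the first card is one or ten and largest when it is five or six, matching the task description, and it is the same whichever direction the participant bets.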
Preuschoff et al. [11] segregated risk (defined as variance of possible outcomes) from expected reward in modeling a similar paradigm to [8]. They found that risk was coded in the ventral striatum, but on a more delayed timescale than the phasic response to the reward prediction error signal that is usually observed in this region. This fMRI signal resembled sustained activity of dopamine neurons from electrophysiological recordings in non-human primates [12] (although see [13]), suggesting that dopamine neurons encode both reward and its variance on different timescales. The authors further used a model-driven approach to study
Box 2. Risk-value models of risk-taking

The risk-value approach to risk-taking, advanced in financial decision theory [1], assumes that preferences are a function of two parameters: risk, operationalized as the variance (or standard deviation) in the probability distribution over possible outcomes, σ; and expected value, the mean of that distribution, μ. Functions of these two variables define indifference curves reflecting portfolios that a person considers equally attractive (Figure I). A steeper indifference curve represents greater risk aversion because it suggests that a given increase in risk of a portfolio must be accompanied by a greater increase in expected value to maintain its attractiveness. The risk-value approach is appealing from a modeling standpoint because it segregates an objective measure of riskiness from expected reward. Unfortunately, behavioral studies show that perceived riskiness is a function of more than merely variance. For instance, holding variance constant, perceived riskiness can vary with: (i) the absolute magnitude of payoffs; (ii) whether they are perceived as gains or losses; and (iii) skewness of the probability distribution over outcomes. An alternative approach that can accommodate such behavioral tendencies includes a measure of perceived riskiness that can diverge from objective measures (e.g. [76–78]).
Figure I. Indifference curves for a relatively risk-averse individual and a risk-seeking individual in a mean-variance model. Lines are indifference curves that depict the mean (μ) and standard deviation (σ) of portfolio returns that an individual finds equally attractive. The dashed line represents a relatively risk-averse individual and the solid line a relatively risk-seeking individual. Axes: expected return (μ) and riskiness (σ).
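The risk-value idea in Box 2 can be illustrated with a short Python sketch. Box 2 does not specify a functional form, so the linear trade-off used here (attractiveness = μ minus a risk-aversion coefficient times σ) and the names `risk_value_score` and `required_mean` are assumptions made purely for illustration.

```python
import math


def mean_and_sd(outcomes, probs):
    """Mean (mu) and standard deviation (sigma) of a discrete payoff distribution."""
    mu = sum(p * x for x, p in zip(outcomes, probs))
    var = sum(p * (x - mu) ** 2 for x, p in zip(outcomes, probs))
    return mu, math.sqrt(var)


def risk_value_score(mu, sigma, risk_aversion):
    """Hypothetical linear risk-value rule: attractiveness = mu - b * sigma."""
    return mu - risk_aversion * sigma


def required_mean(sigma, score, risk_aversion):
    """Mean return needed at risk level sigma to stay on one indifference curve."""
    return score + risk_aversion * sigma


gamble = mean_and_sd([100, 0], [0.5, 0.5])   # 50% chance of $100
sure = mean_and_sd([50], [1.0])              # $50 for sure
```

With this rule, a larger `risk_aversion` coefficient produces a steeper indifference curve: the same increase in σ must be offset by a larger increase in μ, which is the sense in which Box 2 equates steepness with risk aversion.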
the concepts of risk prediction and risk-prediction errors [14], suggesting that both are encoded by the anterior insula, again on different timescales. In sum, neuroeconomic studies of risk have implicated many of the same brain regions involved in the processing of monetary gains and/or losses, putatively related to the midbrain dopamine system and its targets, although potentially using different coding schemes and timescales within those same systems.

Individual risk attitudes

A first step towards linking economic models to naturalistic risk-taking is to identify neural systems in which activity is correlated with individual differences in economic risk attitudes. Recent work has shown that many (but not all) of the brain areas that exhibit sensitivity to economic risk (i.e. variance in the probability distribution over possible outcomes) also reveal individual differences that co-vary with risk preferences. Tobler et al. [15] found positive associations between risk aversion and fMRI signals coding variance of outcomes in lateral OFC, and positive associations with risk-seeking in more medial OFC regions. The same authors [16] also found an
EV-related fMRI signal in lateral OFC that was positively correlated with risk aversion and negatively correlated with risk-seeking. Another study found risk-seeking to be negatively correlated with the fMRI signal in dorsomedial prefrontal cortex (DMPFC), whereas positive correlations were found with reward magnitude signals in ventromedial prefrontal cortex (VMPFC) [17]. A fourth study reported that the fMRI signal in inferior frontal gyrus (IFG) increased during low-risk gambles and this increase was positively correlated with individual risk aversion [18]. Collectively, these studies suggest that individual economic risk preferences modulate brain activity in the regions implicated in risk processing: risk aversion was correlated with lateral PFC regions in OFC, DMPFC and IFG (adjacent to DLPFC), whereas risk seeking was positively correlated with activity in more medial prefrontal cortex regions. Interestingly, the insula was not found to code individual risk attitudes (for more on insula involvement in risk-taking, see [19]). The correlation of risk attitudes with areas in inferior prefrontal cortex accords with previous studies implicating this region in cognitive control and inhibition (see [20]). The DLPFC, a region previously implicated in self-control during decision making (e.g. [21]), has also been implicated in modulation of risk attitudes. Knoch et al. [22] used repetitive Transcranial Magnetic Stimulation (rTMS) to suppress activity in the DLPFC, which led to increased risk-seeking in the Cambridge Gambling Task [7]. Conversely, when excitability of the same regions was increased using transcranial Direct Current Stimulation (tDCS), subjects exhibited increased risk aversion [23]. Thus, the DLPFC might have a key role in the modulation of risk attitudes, even though it has not been implicated in representation of risk per se. 
Despite this success in mapping neural building blocks of economic risk-taking, such studies have seldom, if ever, attempted to examine the association between individual differences in neural response to economic risk and naturalistic risk-taking behavior. In fact, laboratory measures of economic risk attitudes have rarely been used to predict naturalistic risk-taking (or perhaps they have just rarely succeeded). A few studies have had modest success predicting naturalistic financial risk-taking from laboratory measures (e.g. hog farmers who were more risk averse for lotteries were also more likely to hedge on the hog futures market [24]). In other studies, researchers have predicted naturalistic risk-taking behaviors from psychometric measures of risk tolerance (e.g. citizens who said they were more risk tolerant were more likely to move from one part of Germany to another [25]) or the association between distinct real-world manifestations of risk-taking (e.g. choice of labor contracts with different levels of income risk could be predicted from other naturalistic behaviors, such as expenditures on gambling and insurance [26]).

It bears mentioning that there might be inherent limits to the proportion of variance in naturalistic risk-taking behavior that can be explained using any measure of risk preference. First, there is substantial variation in individual risk preferences across life domains, although these probably reflect differences in perceived risks and/or benefits of such activities [27,28]. Second, several situational
Box 3. From risk to uncertainty

Most naturalistic decisions, other than simple games of chance, must be made with incomplete knowledge of the probability distribution over possible outcomes. Subjective expected utility theory (SEU) [79] accommodates uncertainty by replacing objective probabilities with subjective probabilities, inferred from choices, which are assumed to accord with standard axioms of probability theory. However, empirical studies of decision under uncertainty raise challenges to this model that can be accommodated by an extension of prospect theory from risk to uncertainty [69]. In particular:

Subjective probabilities are not additive. If one asks a bettor how much she is willing to pay to bet on each of several horses entered in a race, her prices would typically sum to more than the total prize paid for picking the winning horse. Under SEU with concave utility, this implies subjective probabilities that sum to more than one. This is because the tendency to overweight unlikely events and underweight probable events (captured by the inverse S-shaped weighting function under risk; Box 1), is amplified by similar bias in the subjective assessment of probabilities [80–82].
variables can influence risk perception and risk preferences. These range from the way in which prospects are framed (e.g. in terms of gains and losses [29]), depicted (e.g. as a bar graph or density function [30]) or labeled (e.g. Republicans reminded of their political affiliation were subsequently more attracted to options labeled ‘conservative’ [31]) to the way in which preferences are elicited (e.g. by pricing risky prospects versus choosing between them [32,33]). Third, economic risk-preferences co-vary with state variables, including specific emotions (e.g. people are apparently more risk seeking when angry than when fearful [34]) and motivational state (e.g. whether one is in an aspirational or protective mode [35]).

Characterizing the components of naturalistic risk-taking behavior

The neuroeconomic perspective on risk-taking has begun to lay a foundation for understanding how the brain responds to risky monetary payoffs, but the question remains how to bridge the gap with risk-taking in situ. To do so, one first needs to characterize risk-taking in naturalistic environments. A popular inventory of such behaviors, the domain-specific risk-attitude scale (DOSPERT; [28]) identifies five domains of risk-taking (recreational, financial, health, social and ethical) that differ across individuals according to their self-reports. Such behaviors (e.g. extreme sports, investing in stocks, smoking, taking the unpopular stand in a social discussion or cheating in a tax return) all entail a potential negative outcome and variance of possible outcomes. However, we argue that willingness to accept variance in outcomes or negative outcomes does not fully capture what drives participation in such ‘risky’ behaviors. In fact, several factors distinct from the economic conception of risk preference might contribute to what has been called ‘risky’ behavior in the field. Consider, for example, the choice to engage in unprotected sex.
This decision could stem from: (i) underestimating the likelihood of negative consequences; (ii) discounting possible negative consequences because they are in the future; or (iii) bowing to social pressure or perceived norms. Only after one controls for such factors, and also related
People generally find uncertainty aversive. Ellsberg [83] devised a problem involving an urn with 50 red balls and 50 black balls, and an urn with 100 red and black balls in unknown proportion. He asserted that most people would rather bet that they would blindly draw a red (black) ball from the urn with known probabilities than a red (black) ball from the urn with unknown probabilities. This aversion to betting on events with vague probabilities (‘ambiguity aversion’) has since been validated and modeled in numerous studies (reviewed in [84]). It appears to be driven by an aversion to betting in situations in which one feels relatively ignorant or incompetent [85–87]. Neuroimaging studies of ambiguity aversion have aimed to identify brain mechanisms that code risk and ambiguity. Hsu et al. [88] and Levy et al. [89] conclude that the same regions code both, only to a different degree. However, Huettel et al. [90] and Bach et al. [91] conclude that distinct regions code risk and ambiguity. This disagreement might be due to differences in empirical paradigms, and further studies are needed.
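The tension between Ellsberg's example and SEU can be checked with a small Python sketch. It assumes equal stakes on every bet and a single additive subjective probability for drawing red from the unknown urn; the function name and the probability grid are hypothetical and serve only as an illustration.

```python
def seu_prefers_known_urn_for_both(p_red_unknown, p_known=0.5):
    """True if SEU strictly prefers the known urn for the red AND the black bet.

    With equal stakes, SEU prefers whichever urn gives the higher subjective
    probability of winning; additivity forces P(black) = 1 - P(red).
    """
    prefers_known_for_red = p_known > p_red_unknown
    prefers_known_for_black = p_known > 1.0 - p_red_unknown
    return prefers_known_for_red and prefers_known_for_black


# Scan candidate beliefs about the unknown urn (hypothetical grid of 101 values).
grid = [i / 100 for i in range(101)]
consistent_beliefs = [p for p in grid if seu_prefers_known_urn_for_both(p)]
print(consistent_beliefs)
```

No candidate belief survives the scan: preferring the known urn for red requires a subjective probability below one half, whereas preferring it for black requires one above one half, so the commonly observed pattern cannot be rationalized by additive subjective probabilities.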
constructs such as sensation-seeking and impulsivity, can one distill what might be properly deemed individual ‘risk preference’ and identify economic factors contributing to naturalistic risk-taking behavior. Even if one is successful in mapping distilled measures of naturalistic risk-taking onto economic variables, these ‘cold’ cognitive constructs still fail to capture fully what are largely emotional decisions. In an influential survey, Loewenstein et al. [36] observed that risky decisions are driven not just by anticipated emotions that a decision maker associates with possible consequences, but also ‘anticipatory’ emotions experienced at the time of the decision. Although these researchers emphasized negative emotions, such as fear and anxiety, we suggest that positive emotions also have an important role in risk-taking behavior: for example, the exhilaration of waiting for a roulette ball to land in its slot or driving a car beyond the speed limit (see also [37] on ‘need for arousal’).

Decomposing current naturalistic risk-taking tasks

Well-designed neuroeconomic tasks have been relatively decomposable (Table 2), but as discussed above, they often lack external validity. Two prominent behavioral paradigms have had unique success predicting naturalistic risk-taking behaviors. The first is the Iowa Gambling Task (IGT), described in Table 1. The original study using this task showed that patients with VMPFC lesions who exhibited ‘real-life’ risky behaviors were impaired on the task [38] (for a recent fMRI study with healthy subjects showing differences in this region, see [39]). Patients with lesions in the amygdala, DLPFC, OFC or DMPFC, and other clinical populations, such as drug abusers, alcoholics and pathological gamblers, were also found to be impaired on the IGT (for a critical review, see [40]). Whereas the ‘bad’ decks are indeed ‘riskier’ in an economic sense, increased variance in this case is confounded with lower expected value.
Moreover, risk preferences are confounded with the need to learn the long-term EV of the decks (for critiques, see [41,42]). Thus, it is almost impossible to determine the degree to which individual differences in behavior in the IGT reflect differences in learning, risk attitudes, and/or sensitivity to gain and/or loss magnitude (however, a
Table 1. Risk tasks used in studies cited in the main text

| Task name [Original author] | Study with task cited in main text | Brief task description | Used by other studies cited in main text |
|---|---|---|---|
| Cambridge Gambling Task [7] | [7] | A token is hidden under one of six boxes that are each one of two colors. Different trials have different ratios between box colors (3:3, 4:2, 5:1). On each trial, participants choose a color on which to bet. The color with the higher probability (more boxes) is associated with lower potential gains and lower potential losses of points than is the color with lower probability. | [22,23] |
| [8] | [8] | Two cards are drawn without replacement from a deck containing cards numbered from one to ten (one of each). After the first card is presented, participants bet whether the next card will be higher or lower than the first card. Thus, there is maximal risk when the first card is five or six, zero risk when it is ten or one. | [11,14] use a similar task described in the corresponding row below |
| [9] | [9] | On each trial, participants must respond quickly to receive a small sure gain of 20 points. A longer wait involves potential higher gain or loss of either 40 points (longer wait) or 80 points (longest wait). All choices have the same expected value. | |
| Behavioral Investment Allocation Strategy (BIAS) [10] | [10] | On each trial, participants choose between two stocks (gain/loss gambles, one stochastically dominating the other) and one bond (a sure gain of $1). They must learn through trial-and-error the characteristics of the stocks, which change over blocks of trials. Feedback on payoffs of the forgone options is presented on each trial. | |
| [11] | [11,14] | Similar to [8] but participants bet on whether the second card will be higher or lower before seeing the first card. | |
| [15] | [15] | Each of 12 stimuli (circles of different colors, numbers and sizes) is associated with a different reward magnitude and probability. These include all combinations of (100 and 200) point rewards with (0, 0.25, 0.5, 0.75 and 1) probabilities, plus 300 and 400 rewards with 0.5 probability. Participants are first trained to learn the probabilities and outcomes associated with each stimulus. Next, on each trial, a stimulus appears in one of four quadrants of the screen, and participants indicate which quadrant using a button press. | [16] |
| [18] | [18] | Experiment 1: on each trial, participants choose between a risky and safe option. The risky option is a lottery that offers a 50–50 chance of different outcomes (£10, £90 or £40, £60) and the safe option offers the participants’ own certainty equivalent for the corresponding risky lottery, as determined in a previous phase of the experiment. Experiment 2: as in Experiment 1, on each trial, participants choose between a risky and safe option. This time, possible outcomes of the risky option include (£10, £50), (£15, £45), (£40, £80) and (£30, £90), and the safe options offer a range of semi-random values. | [16] |
| The Cups Task [95] | [17] | On each trial, participants choose between a risky and safe option. Each trial involves either gains or losses. The options are presented as a choice of cups. The risky option involves two to five cups, one containing a gain (loss) of $2, $3 or $5, and the others containing $0. If the risky option is selected, the payoff from one cup is selected at random. The safe cup offers a sure gain (loss) of $1. | |
| Iowa Gambling Task [38] | [38] | On each trial, participants select a card from one of four decks; two ‘bad’ decks offer a higher reward on most trials but also higher possible loss and lower overall expected value, whereas two ‘good’ decks offer a lower reward on most trials but lower possible loss and higher expected value. Participants learn the nature of the decks through trial-and-error. In some versions of the task, the probabilities are not stationary. | [39] |
| Balloon Analogue Risk Task [44] | [44] | On each trial, participants pump a simulated balloon without knowing when it will explode. Each pump increases the potential reward to be gained but also the probability of explosion, which wipes out all potential gains for that trial. In most studies, balloon explosion probabilities are drawn from a uniform distribution, and participants must learn explosion probabilities through trial-and-error. | [52,53] |
| Devil’s Task [55] | [54] | This task is a forerunner to the BART: on each trial, participants decide how many of seven treasure chests to open. They are informed that six boxes contain a prize and one box contains a ‘devil’ that will cause them to lose all their potential gains on that trial. Similar to the BART, participants make sequential choices and, after opening each chest, decide whether to continue to the next chest or cash in their earnings to that point. | |

computational model of distinct components of the task is presented in [43]).

A second task that has successfully predicted naturalistic risk-taking is the Balloon Analogue Risk Task (the ‘BART’) [44], described in Table 1. The average number of pumps a person tolerates in the task was found to correlate with self-reported drinking, smoking, stealing and substance use in healthy adults and adolescents [44–50],
but interestingly, not with performance on the IGT [44] (but see [51]). Recent neural research on the BART implicates the DLPFC in risk-taking. Using fMRI, Rao et al. [52] compared active risk-taking/pumping versus passive pumping on the task, and found that DLPFC activity was higher during active risk-taking. Further evidence for the role of the lateral PFC in risk-taking in the BART
Table 2. Decomposition of specific constructs that are isolated by tasks listed in Table 1 (a)

Construct columns: variance of outcomes; probability of gain; probability of loss; expected value; magnitude of gain; magnitude of loss; uncertainty.

| Studies with task cited in main text | Contrast used in study |
|---|---|
| [7,22,23] | Risk conditions versus a control task |
| [8] | Different risk levels during anticipation of second card |
| [9] | Risky options versus safe option |
| [10] | Compared to a rational choice determined by a computational learning model |
| [11,14] | Contrast 1: variance of outcomes; Contrast 2: EV |
| [15] | Contrast 1: variance of outcomes; Contrast 2: EV |
| [18] | Contrast 1: risky option versus safe; Contrast 2: EV |
| [17] | Risky cups versus safe cup |
| [38] | Low EV decks versus high EV decks |
| [44,52,53] | Increasing number of pumps |
| [54] | Average number of chests open |

[Cell entries (+, ?) that mark which constructs each contrast engages are not recoverable in this copy.]

(a) Although tasks are often described as identifying a single cognitive or economic construct of interest, many tasks also engage additional potentially confounding processes. This table presents a decomposition of the specific constructs that are engaged by the tasks listed in Table 1. For each task, the contrast of interest that was used to measure risk (or expected value) is analyzed to identify the cognitive or economic constructs it also manipulated (listed in the top row of the table). Some of the studies listed in the table accounted for these confounds using parametric statistical modeling.
(b) + indicates that the relevant construct (column) is engaged by that contrast (row).
(c) ? indicates unclear involvement of the relevant construct in the task.
(d) In one condition the expected value is equal between the two options.
was provided in a study [53] that used bilateral tDCS putatively to enhance excitability in DLPFC, resulting in decreased risk-taking/pumping behavior in the BART. Gianotti et al. [54] used the similar Devil’s task [55], which requires no learning (Table 1). They reported that greater risk-taking was positively correlated with lower tonic EEG activity (delta and theta bands) in right-lateral prefrontal cortex, consistent with a negative association between lateral prefrontal cortex engagement and risk-taking. Jentsch et al. developed a version of the BART for rodents [56] and found that temporary inactivation of a region homologous to the human DLPFC resulted in increased variability in behavior and sub-optimal performance, whereas inactivation of the OFC homolog resulted in overall decreased risk-taking. Together, these results suggest a convergence in the neural basis of risky choice between neuroeconomic paradigms and more naturalistic tasks: increased activity in the DLPFC (primarily in the right hemisphere) underlies risk avoidance and self-control, whereas increased activity in the OFC underlies risk-taking.

Although the BART is attractive owing to its predictive validity, it does not lend itself well to decomposition. In
particular, a task analysis reveals that every pump increases the probability of explosion and the variance of possible outcomes, but (similar to the IGT) this increased risk is confounded with varying expected value. Moreover, because the probability distribution of explosions is unknown to subjects, this task also involves learning under uncertainty (see [57] for a computational model of behavior in the BART and [51] for comparison of models of BART and IGT). A modified version of this task in which ‘explosion’ probabilities are transparent remains correlated with self-reported naturalistic risk-taking [58], suggesting that these associations do not necessarily reflect the learning component, but decomposition of this task remains challenging.

Exhilaration and tension in naturalistic risk-taking

Despite the limitations of BART, it has appealing features. First, as discussed above, it predicts self-reported measures of naturalistic risk-taking reasonably well and distinguishes clinical populations. Second, it uses a familiar naturalistic metaphor that engenders a strong affective response (a sense of escalating tension and exhilaration) that mimics the affective phenomenological experience of
Review risk-taking in naturalistic environments, which could partially explain its capacity to predict naturalistic risk-taking behaviors. Another task that appears to tap directly into the affective dimension of risk-taking is a variation of the ‘near-miss’ paradigm (see [59]) developed by Clark et al. [60]. The task imitates a slot machine with two reels, each with six icons: the icon on the first reel is fixed either by the participant or a computer, and the second reel spins on each trial. Participants rated ‘near-miss’ losses in which the second reel stopped one position away from a ‘match’ as more unpleasant than ‘far-miss’ losses in which the second reel was farther from matching. Interestingly, they also rated near misses as more motivating for continued play than were far miss losses. This was only the case for trials in which participants had personal control by fixing the position of the first reel themselves. Areas in both anterior insula and ventral striatum were found to be more active during near misses versus full misses (although both reflect the same objective loss, they entail varying degrees of subjective regret for one’s choice, cf. [61]). Moreover, Chase and Clark [62] found that among gamblers, fMRI activity in dopaminergic midbrain regions during near-miss events correlated positively with gambling severity. These results suggest that individual differences in risk attitudes (at least in the case of gambling) may be driven by individual differences in dopaminergic response (see [63]), in this case to events coding loss but that might simultaneously be experienced as exhilarating and motivating for further action. It is worth noting that reward prediction error signals in the striatum reach their peak during adolescence [64], a time of heightened risk-taking, consistent with a role for dopamine in risk-taking. 
Bridging the gap
To bridge the gap between economic models and naturalistic risk-taking behaviors, we suggest that the former models must incorporate both the positive and negative affective dimensions of risk-taking, through empirical paradigms that can capture them in more compelling ways. We thus propose three criteria for such new laboratory paradigms:
(i) Decomposable: the tasks must allow for decomposition and analysis in terms of cognitive and economic primitives (e.g. magnitude of gains and losses, and probabilities), both for the sake of conceptual clarity and as a prerequisite for identifying neural mechanisms using functional imaging and other tools of behavioral neuroscience.
(ii) Externally valid: the tasks must exhibit empirical associations with naturalistic risk-taking behaviors in healthy or clinical populations and/or enable one to distinguish between them. Naturally, a requirement for validity is reliability of such measures (on reliability of fMRI, see [65]).
(iii) Emotionally engaging: the tasks must not only capture static and cognitive dimensions of risk-taking (e.g. an evaluation of the probability distribution over possible outcomes), but also engage dynamic and affective dimensions (e.g. the hope, exhilaration, tension, and/or fear that might accompany risky behaviors).
From our reading, no single task yet conforms to all three criteria. We argue that new tasks that do conform will offer greater promise in helping identify behavioral and neural factors that predict naturalistic risk-taking.
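To make the decomposability criterion concrete, the confound noted earlier for the BART (explosion probability, outcome variance and expected value all changing together with each pump) can be worked through numerically. The following is a minimal sketch under stated assumptions, not a description of any published implementation: the explosion point is assumed to be drawn uniformly from 1 to `max_pumps`, each successful pump is assumed to add a fixed `reward_per_pump`, and an explosion is assumed to forfeit everything accumulated on that balloon. The function name and default values are hypothetical.

```python
def bart_pump_profile(max_pumps=128, reward_per_pump=0.05):
    """Describe the gamble faced at each pump of a BART-like balloon.

    Assumptions (illustrative only): the explosion point is uniform on
    1..max_pumps, each safe pump adds reward_per_pump, and an explosion
    loses all earnings banked so far on the balloon.

    Returns a list of (pump, p_explode, expected_value, variance) rows,
    where expected_value and variance refer to the change in earnings
    produced by taking that one additional pump.
    """
    rows = []
    # The final pump (pump == max_pumps) explodes with certainty, so it is omitted.
    for pump in range(1, max_pumps):
        remaining = max_pumps - pump + 1       # explosion points still possible
        p_explode = 1.0 / remaining            # conditional on having survived so far
        gain = reward_per_pump                 # earned if the balloon holds
        loss = (pump - 1) * reward_per_pump    # earnings already banked, lost on explosion
        expected_value = (1.0 - p_explode) * gain - p_explode * loss
        # Two-outcome gamble: variance = p * (1 - p) * (spread between outcomes)^2
        variance = p_explode * (1.0 - p_explode) * (gain + loss) ** 2
        rows.append((pump, p_explode, expected_value, variance))
    return rows


if __name__ == "__main__":
    for pump, p, ev, var in bart_pump_profile()[15::16]:
        print(f"pump={pump:3d}  P(explode)={p:.4f}  EV={ev:+.4f}  Var={var:.4f}")
```

Under these assumptions, explosion probability and variance rise with every pump while the expected value of the next pump falls and eventually turns negative, so a participant who stops early could be responding to any of the three quantities. A decomposable task would vary them independently.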
Trends in Cognitive Sciences January 2011, Vol. 15, No. 1
Box 4. Questions for future research
How do neural representations of risk differ between static or description-based tasks and dynamic or experience-based tasks [92]?
What are the neural correlates of alternative, non-compensatory strategies for risky choice, such as choosing the option that minimizes overall probability of losing (for an early attempt, see [93])?
To what extent do neural representations of risk differ across different domains of real-world naturalistic risk-taking, and to what extent is there a ‘common pathway’ or set of regions for risk processing in the brain?
To what degree are representations of risk coded by patterns of activity across relevant regions (on this method see [94]) rather than their overall activation?
For instance, the recently developed Columbia Card Task (CCT) [37] is dynamic and affective, and appears to be decomposable. It remains to be seen whether cognitive primitives of the CCT can be isolated using current modeling techniques in a neuroimaging study, and its predictive validity is yet to be formally established. As noted above, behavior in any task might vary systematically with state variables, such as arousal or motivation of participants at the time of elicitation, just as naturalistic risk-taking does. This presents both a challenge to establishing predictive validity and an opportunity to determine moderators of emotional engagement.
Concluding remarks
There is still a great distance to cover in bridging the gap between economic and naturalistic risk-taking, which we suggest will require development of new empirical paradigms. Many existing paradigms exhibit one or two of the three criteria suggested above. For instance, most tasks in the neuroeconomics literature are decomposable but are not especially predictively valid or emotionally engaging. By contrast, tasks on the naturalistic side of the divide, such as the BART and IGT, tend to be emotionally engaging and predictively valid, but not particularly decomposable. The ‘near-miss’ paradigm [60,62] provides another example of an emotionally engaging and externally valid task that is decomposable; however, it does not entail a risky decision and thus is not designed to decompose performance into economic variables related to risk-taking. We propose that progress in understanding the neural systems underlying naturalistic (including clinical and abnormal) risk-taking awaits development of tasks that fulfill all of these criteria (see also Box 4).
Acknowledgments
We thank Eliza Congdon, Adriana Galvan, Liat Hadar, Brian Knutson, Elke Weber and an anonymous reviewer for their helpful comments on an earlier version of this article.
This work was supported by the National Institutes of Health (NIH RO1MH082795 to R.P.). T.S. would like to thank the United States-Israel Educational Foundation (Fulbright post-doctoral fellowship) for financial support.
References
1 Markowitz, H. (1952) Portfolio selection. J. Finance 7, 77–91
2 Steinberg, L. (2008) A social neuroscience perspective on adolescent risk-taking. Dev. Rev. 28, 78–106
3 March, J.G. and Shapira, Z. (1987) Managerial perspectives on risk and risk taking. Manage. Sci. 33, 1404–1418
4 Slovic, P. (1987) Perception of risk. Science 236, 280–285
5 Glimcher, P.W. et al. (2008) Neuroeconomics: Decision Making and the Brain, Academic Press
6 Knight, F. (1921) Risk, Uncertainty and Profit, Houghton-Mifflin
7 Rogers, R.D. et al. (1999) Choosing between small, likely rewards and large, unlikely rewards activates inferior and orbital prefrontal cortex. J. Neurosci. 19, 9029–9038
8 Critchley, H.D. et al. (2001) Neural activity in the human brain relating to uncertainty and arousal during anticipation. Neuron 29, 537–545
9 Paulus, M.P. et al. (2003) Increased activation in the right insula during risk-taking decision making is related to harm avoidance and neuroticism. Neuroimage 19, 1439–1448
10 Kuhnen, C.M. and Knutson, B. (2005) The neural basis of financial risk taking. Neuron 47, 763–770
11 Preuschoff, K. et al. (2006) Neural differentiation of expected reward and risk in human subcortical structures. Neuron 51, 381–390
12 Fiorillo, C.D. et al. (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299, 1898–1902
13 Niv, Y. et al. (2005) Dopamine, uncertainty and TD learning. Behav. Brain Funct. 1:6
14 Preuschoff, K. et al. (2008) Human insula activation reflects risk prediction errors as well as risk. J. Neurosci. 28, 2745–2752
15 Tobler, P.N. et al. (2007) Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J. Neurophysiol. 97, 1621–1632
16 Tobler, P.N. et al. (2009) Risk-dependent reward value signal in human prefrontal cortex. Proc. Natl. Acad. Sci. U. S. A. 106, 7185–7190
17 Xue, G. et al. (2009) Functional dissociations of risk and reward processing in the medial prefrontal cortex. Cereb. Cortex 19, 1019–1027
18 Christopoulos, G.I. et al. (2009) Neural correlates of value, risk, and risk aversion contributing to decision making under risk. J. Neurosci. 29, 12574–12583
19 Mohr, P.N. et al. (2010) Neural processing of risk. J. Neurosci. 30, 6613–6619
20 Aron, A.R. et al. (2004) Inhibition and the right inferior frontal cortex. Trends Cogn. Sci. 8, 170–177
21 Hare, T.A. et al. (2009) Self-control in decision-making involves modulation of the vmPFC valuation system. Science 324, 646–648
22 Knoch, D. et al. (2006) Disruption of right prefrontal cortex by low-frequency repetitive transcranial magnetic stimulation induces risk-taking behavior. J. Neurosci. 26, 6469–6472
23 Fecteau, S. et al. (2007) Diminishing risk-taking behavior by modulating activity in the prefrontal cortex: a direct current stimulation study. J. Neurosci. 27, 12500–12505
24 Pennings, J.M.E. and Smidts, A. (2000) Assessing the construct validity of risk attitude. Manage. Sci. 46, 1337–1348
25 Jaeger, D.A. et al. (2009) Direct evidence on risk attitudes and migration. Rev. Econ. Stat. 92, 684–689
26 Brown, S. et al. (2006) Risk preference and employment contract type. J. R. Stat. Soc. A 169, 849–863
27 Hanoch, Y. et al. (2006) Domain specificity in experimental measures and participant recruitment. Psychol. Sci. 17, 300–304
28 Weber, E.U. et al. (2002) A domain-specific risk-attitude scale: measuring risk perceptions and risk behaviors. J. Behav. Decis. Making 15, 263–290
29 Tversky, A. and Kahneman, D. (1986) Rational choice and the framing of decisions. J. Bus. 59, S251–S278
30 Weber, E.U. et al. (2005) Communicating asset risk: how name recognition and the format of historic volatility information affect risk perception and investment decisions. Risk Anal. 25, 597–609
31 Morris, M.W. et al. (2008) Mistaken identity. Psychol. Sci. 19, 1154–1160
32 Tversky, A. et al. (1990) The causes of preference reversal. Am. Econ. Rev. 80, 204–217
33 Harbaugh, W.T. et al. (2010) The fourfold pattern of risk attitudes in choice and pricing tasks. Econ. J. 120, 595–611
34 Lerner, J.S. and Keltner, D. (2001) Fear, anger, and risk. J. Pers. Soc. Psychol. 81, 146–159
35 Scholer, A.A. et al. (2010) When risk seeking becomes a motivational necessity. J. Pers. Soc. Psychol. 99, 215–231
36 Loewenstein, G.F. et al. (2001) Risk as feelings. Psychol. Bull. 127, 267–286
37 Figner, B. et al. (2009) Affective and deliberative processes in risky choice: age differences in risk taking in the Columbia Card Task. J. Exp. Psychol. Learn. Mem. Cogn. 35, 709–730
38 Bechara, A. et al. (1994) Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 50, 7–15
39 Lawrence, N.S. et al. (2009) Distinct roles of prefrontal cortical subregions in the Iowa Gambling Task. Cereb. Cortex 19, 1134–1143
40 Buelow, M.T. et al. (2009) Construct validity of the Iowa Gambling Task. Neuropsychol. Rev. 19, 102–114
41 Maia, T.V. and McClelland, J.L. (2004) A reexamination of the evidence for the somatic marker hypothesis: what participants really know in the Iowa gambling task. Proc. Natl. Acad. Sci. U. S. A. 101, 16075–16080
42 Dunn, B.D. et al. (2006) The somatic marker hypothesis: a critical evaluation. Neurosci. Biobehav. Rev. 30, 239–271
43 Busemeyer, J.R. and Stout, J.C. (2002) A contribution of cognitive decision models to clinical assessment: decomposing performance on the Bechara gambling task. Psychol. Assess. 14, 253–262
44 Lejuez, C.W. et al. (2002) Evaluation of a behavioral measure of risk taking: the Balloon Analogue Risk Task (BART). J. Exp. Psychol. Appl. 8, 75–84
45 Lejuez, C.W. et al. (2003) The Balloon Analogue Risk Task (BART) differentiates smokers and nonsmokers. Exp. Clin. Psychopharmacol. 11, 26–33
46 Lejuez, C.W. et al. (2003) Evaluation of the Balloon Analogue Risk Task (BART) as a predictor of adolescent real-world risk-taking behaviours. J. Adolesc. 26, 475–479
47 Lejuez, C.W. et al. (2004) Risk-taking propensity and risky sexual behavior of individuals in residential substance use treatment. Addict. Behav. 29, 1643–1647
48 Bornovalova, M.A. et al. (2005) Differences in impulsivity and risk-taking propensity between primary users of crack cocaine and primary users of heroin in a residential substance-use program. Exp. Clin. Psychopharmacol. 13, 311–318
49 Aklin, W.M. et al. (2005) Evaluation of behavioral measures of risk taking propensity with inner city adolescents. Behav. Res. Ther. 43, 215–228
50 Lejuez, C.W. et al. (2007) Reliability and validity of the youth version of the Balloon Analogue Risk Task (BART-Y) in the assessment of risk-taking behavior among inner-city adolescents. J. Clin. Child Adolesc. Psychol. 36, 106–111
51 Bishara, A.J. et al. (2009) Similar processes despite divergent behavior in two commonly used measures of risky decision making. J. Behav. Decis. Making 22, 435–454
52 Rao, H. et al. (2008) Neural correlates of voluntary and involuntary risk taking in the human brain: an fMRI Study of the Balloon Analog Risk Task (BART). Neuroimage 42, 902–910
53 Fecteau, S. et al. (2007) Activation of prefrontal cortex by transcranial direct current stimulation reduces appetite for risk during ambiguous decision making. J. Neurosci. 27, 6212–6218
54 Gianotti, L.R. et al. (2009) Tonic activity level in the right prefrontal cortex predicts individuals’ risk taking. Psychol. Sci. 20, 33–38
55 Slovic, P. (1966) Risk-taking in children: age and sex differences. Child Dev. 37, 169–176
56 Jentsch, J.D. et al. (2010) Behavioral characteristics and neural mechanisms mediating performance in a rodent version of the Balloon Analog Risk Task. Neuropsychopharmacology 35, 1797–1806
57 Wallsten, T.S. et al. (2005) Modeling behavior in a clinically diagnostic sequential risk-taking task. Psychol. Rev. 112, 862–880
58 Pleskac, T.J. (2008) Decision making and learning while taking sequential risks. J. Exp. Psychol. Learn. Mem. Cogn. 34, 167–185
59 Reid, R.L. (1986) The psychology of the near miss. J. Gambling Stud. 2, 32–39
60 Clark, L. et al. (2009) Gambling near-misses enhance motivation to gamble and recruit win-related brain circuitry. Neuron 61, 481–490
61 Kahneman, D. and Miller, D.T. (1986) Norm theory: comparing reality to its alternatives. Psychol. Rev. 93, 136–153
62 Chase, H.W. and Clark, L. (2010) Gambling severity predicts midbrain response to near-miss outcomes. J. Neurosci. 30, 6180–6187
63 Reuter, J. et al. (2005) Pathological gambling is linked to reduced activation of the mesolimbic reward system. Nat. Neurosci. 8, 147–148
64 Cohen, J.R. et al. (2010) A unique adolescent response to reward prediction errors. Nat. Neurosci. 13, 669–671
65 Bennett, C.M. and Miller, M.B. (2010) How reliable are the results from functional magnetic resonance imaging? Ann. N. Y. Acad. Sci. 1191, 133–155
66 von Neumann, J. and Morgenstern, O. (1944) Theory of Games and Economic Behavior, Princeton University Press
67 Rabin, M. (2000) Risk aversion and expected-utility theory: a calibration theorem. Econometrica 68, 1281–1292
68 Kahneman, D. and Tversky, A. (1979) Prospect theory: an analysis of decision under risk. Econometrica 47, 263–291
69 Tversky, A. and Kahneman, D. (1992) Advances in prospect theory: cumulative representation of uncertainty. J. Risk Uncertain. 5, 297–323
70 Tom, S.M. et al. (2007) The neural basis of loss aversion in decision-making under risk. Science 315, 515–518
71 De Martino, B. et al. (2010) Amygdala damage eliminates monetary loss aversion. Proc. Natl. Acad. Sci. U. S. A. 107, 3788–3792
72 De Martino, B. et al. (2006) Frames, biases, and rational decision-making in the human brain. Science 313, 684–687
73 Hsu, M. et al. (2009) Neural response to reward anticipation under risk is nonlinear in probabilities. J. Neurosci. 29, 2231–2237
74 Paulus, M.P. and Frank, L.R. (2006) Anterior cingulate activity modulates nonlinear decision weight function of uncertain prospects. Neuroimage 30, 668–677
75 Berns, G.S. et al. (2008) Nonlinear neurobiological probability weighting functions for aversive outcomes. Neuroimage 39, 2047–2057
76 Pollatsek, A. and Tversky, A. (1970) A theory of risk. J. Math. Psychol. 7, 540–553
77 Sarin, R.K. and Weber, M. (1993) Risk-value models. Eur. J. Oper. Res. 70, 135–149
78 Jia, J. et al. (1999) Measures of perceived risk. Manage. Sci. 45, 519–532
79 Savage, L.J. (1954) The Foundations of Statistics, John Wiley & Sons
80 Tversky, A. and Fox, C.R. (1995) Weighing risk and uncertainty. Psychol. Rev. 102, 269–283
81 Fox, C.R. and Tversky, A. (1998) A belief-based account of decision under uncertainty. Manage. Sci. 44, 879–895
82 Wu, G. and Gonzalez, R. (1999) Nonlinear decision weights in choice under uncertainty. Manage. Sci. 45, 74–85
83 Ellsberg, D. (1961) Risk, ambiguity, and the Savage axioms. Q. J. Econ. 75, 643–669
84 Camerer, C. and Weber, M. (1992) Recent developments in modeling preferences: uncertainty and ambiguity. J. Risk Uncertain. 5, 325–370
85 Heath, C. and Tversky, A. (1991) Preference and belief: Ambiguity and competence in choice under uncertainty. J. Risk Uncertain. 4, 5–28
86 Fox, C.R. and Tversky, A. (1995) Ambiguity aversion and comparative ignorance. Q. J. Econ. 110, 585–603
87 Fox, C.R. and Weber, M. (2002) Ambiguity aversion, comparative ignorance, and decision context. Organ. Behav. Hum. Decis. Process. 88, 476–498
88 Hsu, M. et al. (2005) Neural systems responding to degrees of uncertainty in human decision-making. Science 310, 1680–1683
89 Levy, I. et al. (2010) Neural representation of subjective value under risk and ambiguity. J. Neurophysiol. 103, 1036–1047
90 Huettel, S.A. et al. (2006) Neural signatures of economic preferences for risk and ambiguity. Neuron 49, 765–775
91 Bach, D.R. et al. (2009) Neural activity associated with the passive prediction of ambiguity and risk for aversive events. J. Neurosci. 29, 1648–1656
92 Hertwig, R. and Erev, I. (2009) The description-experience gap in risky choice. Trends Cogn. Sci. 13, 517–523
93 Venkatraman, V. et al. (2009) Separate neural mechanisms underlie choices and strategic preferences in risky decision making. Neuron 62, 593–602
94 Kriegeskorte, N. et al. (2008) Representational similarity analysis – connecting the branches of systems neuroscience. Front. Syst. Neurosci. 2:4
95 Weller, J.A. et al. (2007) Neural correlates of adaptive decision making for risky gains and losses. Psychol. Sci. 18, 958–964
Review
The critical role of retrieval practice in long-term retention
Henry L. Roediger III1 and Andrew C. Butler2
1 Department of Psychology, Box 1125, Washington University, One Brookings Drive, St. Louis, MO 63130-4899, USA
2 Psychology & Neuroscience, Duke University, Box 90086, Durham, NC 27708-0086, USA
Learning is usually thought to occur during episodes of studying, whereas retrieval of information on testing simply serves to assess what was learned. We review research that contradicts this traditional view by demonstrating that retrieval practice is actually a powerful mnemonic enhancer, often producing large gains in long-term retention relative to repeated studying. Retrieval practice is often effective even without feedback (i.e. giving the correct answer), but feedback enhances the benefits of testing. In addition, retrieval practice promotes the acquisition of knowledge that can be flexibly retrieved and transferred to different contexts. The power of retrieval practice in consolidating memories has important implications for both the study of memory and its application to educational practice.
Introduction
A curious peculiarity of our memory is that things are impressed better by active than by passive repetition. I mean that in learning (by heart, for example), when we almost know the piece, it pays better to wait and recollect by an effort within, than to look at the book again. If we recover the words the former way, we shall probably know them the next time; if in the latter way, we shall likely need the book once more.
William James [1]
Psychologists have often studied learning by alternating series of study and test trials. In other words, material is presented for study (S) and a test (T) is subsequently given to determine what was learned. After this procedure is repeated over numerous ST trials, performance (e.g. the number of items recalled) is plotted against trials to depict the rate of learning; the outcome is referred to as a learning curve, which is negatively accelerated and well fit by a power function. Thus, most learning occurs on early ST trials, and the amount of learning decreases with additional trials. The critical assumption is that learning occurs during the study phases of the ST ST ST. . .
sequence, and the test phase is simply there to measure what has been learned during previous occasions of study. The test is usually considered a neutral event. For example, researchers in the 1960s debated whether learning occurs gradually (e.g. through continual strengthening of memory traces) or in an all-or-none fashion, but they focused on study events as the locus of the effects and ignored the possibility that learning occurred during the retrieval tests [2–5].
Exactly the same assumption is built into our educational systems. Students are thought to learn via lectures, reading, highlighting, study groups, and so on; tests are given in the classroom to measure what has been learned from studying. Again, tests are considered assessments, gauging the knowledge that has been acquired without affecting it in any way.
In this article, we review evidence that turns this conventional wisdom on its head: retrieval practice (as occurs during testing) often produces greater learning and long-term retention than studying. We discuss research that elucidates the conditions under which retrieval practice is most effective, as well as evidence demonstrating that the mnemonic benefits of retrieval practice are transferrable to different contexts. We also describe current theories on the mechanisms underlying the beneficial effects of testing. Finally, we discuss educational implications of this research, arguing that more frequent retrieval practice in the classroom would increase long-term retention and transfer.
Corresponding author: Roediger, H.L. III ([email protected]).
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.09.003 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Glossary
Expanding retrieval schedule: testing of retention shortly after learning to make sure encoding is accurate, then waiting longer to retrieve again, then waiting still longer for a third retrieval and so on.
Feedback: providing information after a question. General (right or wrong) feedback is not very helpful if the correct answer is not provided. Correct answer feedback usually produces robust gains on a final criterion measure.
Negative suggestion effect: taking a test that provides subtly wrong answers (e.g. true or false, multiple choice) can lead students to select a wrong answer, believe it is right, and thus learn an error from taking the test.
Retrieval practice: act of calling information to mind rather than rereading it or hearing it. The idea is to produce ‘an effort from within’ to induce better retention.
Test-enhanced learning: general approach that promotes retrieval practice via testing as a means to improve knowledge.
Testing effect: taking a test usually enhances later performance on the material relative to rereading it or to having no re-exposure at all.
Transfer: ability to generalize learning from one context to another or to use learned information in a new way (e.g. to solve a problem).
The testing effect and repeated retrieval
The finding that retrieval of information from memory produces better retention than restudying the same information for an equivalent amount of time has been termed the testing effect [6]. Although the phenomenon was first reported over 100 years ago [7], research on the testing effect has been sporadic at best until recently (but see Box 1 for some classic studies). In the last 10 years, much research has shown powerful mnemonic benefits of retrieval practice [8–10]. The data in Figure 1 come from a study in which two groups of students retrieved information several times during learning and two other groups were treated similarly but only practiced retrieval once [11]. The figure shows performance on a final test given 1 week later. The two groups that practiced retrieval (without feedback) during learning (the two left bars) recalled substantially more of the pairs than the other two groups. In addition, the groups represented by the two dark blue bars were permitted to study the material several more times than the groups represented by the light blue bars. Yet, repeated study led to virtually no improvement a week later. Retrieval practice provides much greater long-term retention than does repeated study [11–16].
The finding that retrieval practice increases retention raises two important questions. First, what are the best conditions for retrieval? The sooner retrieval is attempted after a study trial or a correct retrieval, the more likely it is to be successful. Short delays between retrievals might foster errorless retrieval. However, it might be that retrieval of information after a short delay is too much like rote rehearsal, which often produces little or no mnemonic benefit [17]. Second, how many retrievals are needed to maximize long-term retention? Retrieval practice takes time, so if only one or two retrievals are enough, then practice can be terminated [18,19].
The questions just raised are thorny ones and might depend on the type of materials, the characteristics of the learner and other factors (for a discussion see [20]). However, a recent study gives a tentative answer to both questions [21]. Students learned 70 Swahili–English word pairs via repeated practice at retrieving the English word when presented with the associated Swahili word. Both the time between successive retrievals (1 min or 6 min) and the number of successful retrievals (1, 3, 5, 6, 7, 8 or 10) were manipulated during the initial practice phase. Figure 2 shows performance on a final test given after a delay of either 25 min (top two lines) or 1 week (bottom two lines). Regardless of the timing of the final test, retrieval practice with 6-min intervening intervals (red lines) led to better retention relative to retrieval practice with 1-min intervening intervals (blue lines). With respect to the number of successful retrievals during initial learning, final test performance generally increased from one to five or seven prior retrievals and then leveled off, so five to seven retrievals seem to be optimal in this paradigm. However, this pattern of performance depended on the time between successive retrievals during initial practice. After a week, only retrieval practice with longer intervening intervals had any effect on performance – practice that occurred every minute produced floor-level performance, no matter how many times the item was successfully retrieved.
Retrieval practice can be a potent memory enhancer, but clearly the conditions of retrieval matter. When retrieval occurs under relatively easy (1-min interval) conditions, even ten retrievals might produce little benefit for long-term retention. By contrast, under different conditions, many other studies have shown that even a single test can boost retention [22,23] and that these benefits persist over long delays [14,24]. Still, repeated retrievals usually benefit later retention relative to a single retrieval [14,21,25,26].
[Figure 1 image not reproduced. x-axis: learning condition (repeated retrieval, one retrieval); y-axis: proportion correct on final test; bar labels: .80, .80, .36, .33.]
Figure 1. Recall after a week for Swahili–English word pairs (mashua–boat) learned with retrieval practice (left bars) or with only a single recall (right bars). Retrieval practice doubled recall on the final test when students were given the Swahili word and asked to recall the English word. The dark blue bars indicate groups to which many more study trials were given than to the groups represented by light blue bars. Repetition of studying had virtually no effect on recall a week later, unlike repeated retrieval. Error bars represent standard errors of the mean. Figure adapted from [11].
[Figure 2 image not reproduced. x-axis: criterion level during practice (1, 3, 5, 6, 7, 8, 10); y-axis: proportion correct on final test.]
Figure 2. Recall after 25 min (top two lines) or 1 week (bottom two lines) after varying numbers of correct recalls in an earlier phase of the experiment. When 6 min occurred between retrievals (red lines), performance was better than when only 1 min occurred between tests (blue lines). When only a short interval occurred between retrievals, even recalling the pair ten times failed to improve retention a week later. Figure adapted from [21].
Box 1. Classic studies of the testing effect
The idea that retrieval practice facilitates retention is old. Some 2300 years before the quote from James that begins this article, Aristotle wrote that ‘Exercise in repeatedly recalling a thing strengthens the memory.’ The first empirical evidence that he was right was provided 100 years ago [7], but other studies were more influential. Six classic studies are described in brief:
(i) Gates showed large effects of recitation (retrieval) relative to studying in children in grades 3, 4, 5, 6 and 8 for both nonsense words and brief biographies [92]. He argued that building recitation into the curriculum would benefit learning and retention in the schools.
(ii) Jones investigated the effect of testing on retention of lecture material by college students [93]. His impressive series of experiments demonstrated the benefits of retrieval practice in both the classroom and the laboratory.
(iii) Spitzer tested 3605 6th graders by having them read 600-word passages and taking tests with various schedules before taking a final test approximately 2 months later [94]. Spitzer showed that testing (retrieval) without feedback enhanced final performance when the initial test occurred within a week or so after learning.
(iv) Tulving examined learning of word lists and showed that test events could lead to as much learning as study events [95].
(v) Glover provided evidence to support the idea that successful retrieval is the critical mechanism that produces the mnemonic benefits of testing, ruling out an alternative ‘amount of processing’ explanation [64]. His article, entitled The ‘testing’ phenomenon: not gone but nearly forgotten, helped to revive interest in the testing effect.
(vi) Carrier and Pashler conducted a careful series of experiments to correct various defects in prior work and confirmed that retrieval helps later retention [65]. Their paper prompted modern interest in retrieval as a powerful mnemonic aid.
Expanding retrieval schedules
The data in Figure 2 might be considered surprising in some quarters. For example, researchers who perform behavior analysis [27] or memory remediation among neuropsychological patient populations [28] believe that retrieval attempts should be arranged so that they do not produce errors (errorless retrieval is the watchword in these efforts). The fear is that if an error is produced then it will be learned, making learning of the correct responses more difficult. However, the data in Figure 2 point to a paradox: if retrieval occurs under ‘easy’ conditions in which errors are less likely to be made, the impact of such retrievals on long-term retention might be undermined.
Thus, a practical question is whether a strategy exists for retrieval practice that precludes making errors and at the same time permits the type of difficult retrievals that produce better long-term retention. One possible strategy is the expanding schedule of retrieval, which was first proposed by Landauer and Bjork [29]. In this method, a first retrieval attempt occurs shortly after initial learning and subsequent retrieval attempts are staggered so that each successive retrieval occurs after an increasingly long interval. For example, when learning someone’s name, retrieval of the name would occur shortly after meeting the person (say, 1 min) to be sure it is encoded, then after a slightly longer interval (perhaps 4 min), and then after a still longer interval (8 min) before retrieving it a third time, and so on. The idea is to gradually shape long-term retention of the information just as learning can be shaped by reinforcement of successive approximations of the desired behavior [30]. In their influential paper, Landauer and Bjork predicted that expanding retrieval schedules would produce better performance than equal-interval schedules (in which the intervals between retrieval attempts remain constant) or massed schedules (repeated retrieval with no intervening interval) [29]. Indeed, findings from their experiments showed a benefit of an expanding schedule relative to an equal-interval schedule on a final test given after a relatively short retention interval of 30 min. Furthermore, both the expanding and equal-interval schedules produced better final retention than did a massed schedule of practice, even though the massed tests provided nearly errorless retrieval. Thus, research comparing different schedules of practice provides additional evidence that repeated retrieval of information immediately after study, even though errorless, produces poor retention [31–33].
Returning to the issue of whether expanding or equal-interval schedules of practice lead to better retention, the answer seems to depend in part on the retention interval. When the final test is given shortly after the learning phase, expanding retrieval seems to be best. However, when long-term retention is measured (i.e. a delay of a day or longer), then prior practice on an equal-interval schedule seems to promote better performance [34,35]. This flip in performance from immediate to delayed tests might be due to the timing of the initial test: the first test is given almost immediately in an expanding schedule, whereas it is given after a longer delay in an equal-interval schedule. Thus, the equal-interval schedule requires greater retrieval effort on the first test, which should produce better long-term retention. The general conclusion is that the best retrieval schedules are those that involve wide spacing of retrieval attempts, as shown in Figure 2 [21], even if some errors are made [36,37]. To date, evidence shows that expanding retrieval provides better retention after short delays, but equal-interval retrieval produces better retention after long delays. However, expanding schedules may show a benefit in future research with expansion that unfolds over days and weeks rather than over seconds (as used in past research).
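The contrast between the three schedules can be made concrete with a small sketch. The Python below is illustrative only: the function names are hypothetical, the 1-, 4- and 8-min gaps are taken from the name-learning example above, and matching the equal-interval schedule on total elapsed time is an assumption made here for comparison, not a description of the procedure used in [29].

```python
from typing import List


def retrieval_times(gaps: List[float]) -> List[float]:
    """Time of each retrieval attempt (min after study), given the gap before each."""
    times: List[float] = []
    elapsed = 0.0
    for gap in gaps:
        elapsed += gap
        times.append(elapsed)
    return times


def is_expanding(gaps: List[float]) -> bool:
    """True when every gap is longer than the one before it."""
    return all(later > earlier for earlier, later in zip(gaps, gaps[1:]))


def equal_interval_matched(gaps: List[float]) -> List[float]:
    """Equal-interval schedule with the same number of retrievals and total time.

    Matching on total time is an assumption of this sketch.
    """
    mean_gap = sum(gaps) / len(gaps)
    return [mean_gap] * len(gaps)


def massed(n_retrievals: int) -> List[float]:
    """Massed schedule: repeated retrieval with no intervening interval."""
    return [0.0] * n_retrievals


if __name__ == "__main__":
    expanding = [1.0, 4.0, 8.0]  # gaps from the name-learning example
    equal = equal_interval_matched(expanding)
    schedules = [("expanding", expanding), ("equal-interval", equal), ("massed", massed(3))]
    for label, gaps in schedules:
        print(label, [round(t, 2) for t in retrieval_times(gaps)])
```

With these gaps the first retrieval falls at 1 min under the expanding schedule but at about 4.3 min under the matched equal-interval schedule, which is the difference in first-test timing discussed above.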
Feedback enhances the testing effect
Although retrieval practice promotes superior long-term retention in the absence of feedback (Figure 1), providing the correct answer after a retrieval attempt increases the mnemonic benefits of testing [38,39]. Feedback that includes the correct answer increases learning because it enables test-takers to correct errors [40] and to maintain correct responses [41]. The critical mechanism in learning from tests is successful retrieval; however, if test-takers do not retrieve the correct response and have no recourse to learn it, then the benefits of testing can sometimes be limited or absent altogether [42]. Thus, providing feedback after a retrieval attempt, regardless of whether the attempt is successful or unsuccessful, helps to ensure that retrieval will be successful in the future [41]. The need for feedback is critical after any type of test, but it is particularly important for recognition tests (e.g. multiple choice, true/false, etc.) because test-takers are exposed to incorrect information. For example, on multiple-choice tests, students must identify the correct answer from a number of possible alternative answers (i.e. lures), most of which are plausible but incorrect. The danger is that because students learn from tests, taking a multiple-choice test might cause them to learn incorrect information and believe that it is true. Indeed, recent research has shown that when students select a lure in a multiple-choice test, they often reproduce that incorrect information in a later test [8,43,44]. This outcome even occurs on the SAT test that hundreds of thousands of high school students
take every year [45]. Although the potential for negative effects from multiple-choice tests is a real problem, the good news is that there is a simple solution: provide students with feedback. If feedback is provided after a multiple-choice test, the negative effects are completely nullified [46]. Thus, whereas feedback is helpful for all types of tests, it is especially important for multiple-choice and other recognition tests that can lead students to learn incorrect information. Another critical question is the timing of feedback. Conventional wisdom and studies in behavioral psychology indicate that providing feedback immediately after a test is best [27,47]. However, experimental results show that delayed feedback might be even more powerful. In one study, students read passages and then either took or did not take a multiple-choice test [16]. For students who took the test, one group received correct answer feedback immediately after making a response (immediate feedback) and the other group received the correct answers for all questions after the entire test (delayed feedback). One week after the initial learning session, students took a final test in which they had to produce a response to the question that had formed the stem of the multiple-choice item (i.e. they had to produce the answer rather than selecting one from among several alternatives). The final test consisted of the same questions from the initial multiple-choice test and comparable questions that had not been tested. Figure 3 shows the results for the final test. Taking an initial test (even without feedback) tripled final recall relative to only studying the material. When correct answer feedback was given immediately after each question in the initial test, performance increased another 10%. However, feedback given after the entire test boosted final performance even more. 
The finding that delayed feedback led to better retention than immediate feedback undermines the conventional idea that feedback must be given immediately to be effective. Although giving the answers to questions soon after a test is still relatively immediate feedback, the superiority of delayed feedback has been replicated numerous times with longer delays [48–51]. The benefits of delayed feedback might represent a type of spacing effect: the phenomenon whereby two presentations of material given with spacing between them generally lead to better retention than massed (back-to-back) presentations [52–55].
Retrieval practice enhances transfer of learning
Are the mnemonic benefits of testing limited to the learning of a specific response? One criticism that could be leveled at research on the testing effect is that retrieval practice merely teaches people to produce a fixed response when given a particular retrieval cue, so the procedure simply amounts to drill and practice of a particular response. Thus, a key question is whether testing also promotes transfer of knowledge; that is, can the knowledge gained through testing be flexibly used to construct new responses and answer different questions? Transfer of learning is of critical interest for both theories of memory and educational policy [56]. Researchers have recently begun to explore whether retrieval practice can promote transfer of learning in different contexts [57–59]. For example, Butler [60] investigated whether repeated testing produces better transfer than repeated studying in a series of experiments. In one of the experiments, students studied six prose passages, each of which contained several critical concepts (among other information). A concept was operationally defined as information that had to be extracted from multiple sentences. Next, the students repeatedly restudied two of the passages, repeatedly restudied isolated sentences that contained the critical concepts from another two passages, and repeatedly took a test on the critical concepts for another two passages. After each test
[Figure 3 (bar chart). Y-axis: proportion correct on final test. X-axis: learning condition. Bar values: no test, .11; test with no feedback, .33; test with immediate feedback, .43; test with delayed feedback, .54.]
Figure 3. Proportion of correct responses on the final cued recall test as a function of initial learning condition. All conditions involving an initial test led to greater final recall than in the No Test condition, but feedback after the initial test led to greater final recall. In addition, delayed feedback (given on each item after the test) led to better recall than did immediate feedback (given after each question was answered). Error bars represent 95% confidence intervals. The figure represents data in Table 2 from [46].
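The comparisons drawn from Figure 3 are simple arithmetic on four proportions, which the short Python sketch below works through. The proportions are the bar values shown in the figure; the dictionary keys and helper names are hypothetical labels chosen here. The sketch also makes explicit that the "another 10%" gain from immediate feedback is a difference of 10 percentage points, not a 10% relative increase.

```python
# Proportion correct on the final test in each condition (bar values of Figure 3).
final_recall = {
    "no_test": 0.11,
    "test_no_feedback": 0.33,
    "test_immediate_feedback": 0.43,
    "test_delayed_feedback": 0.54,
}


def ratio(condition: str, baseline: str) -> float:
    """How many times larger recall is in `condition` than in `baseline`."""
    return final_recall[condition] / final_recall[baseline]


def difference(condition: str, baseline: str) -> float:
    """Absolute gain in proportion correct of `condition` over `baseline`."""
    return final_recall[condition] - final_recall[baseline]


if __name__ == "__main__":
    # Testing alone versus studying alone (a ratio).
    print(round(ratio("test_no_feedback", "no_test"), 2))
    # Added benefit of immediate feedback (a difference in proportion correct).
    print(round(difference("test_immediate_feedback", "test_no_feedback"), 2))
    # Added benefit of delaying the feedback until after the test.
    print(round(difference("test_delayed_feedback", "test_immediate_feedback"), 2))
```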
Box 2. Sample materials from Butler [60]
The passages used in the study covered a range of topics. The questions below are samples from a passage about bats.
Initial test
Question: Some bats use echolocation to navigate the environment and locate prey. How does echolocation help bats to determine the distance and size of objects?
Answer: Bats emit high-pitched sound waves and listen to the echoes. The distance of an object is determined by the time it takes for the echo to return. The size of the object is calculated by the intensity of the echo: a smaller object will reflect less of the sound wave, and thus produce a less intense echo.
Final transfer test
Question: An insect is moving towards a bat. Using the process of echolocation, how does the bat determine that the insect is moving towards it (i.e. rather than away from it)?
Answer: The bat can tell the direction that an object is moving by calculating whether the time it takes for an echo to return changes from echo to echo. If the insect is moving towards the bat, the time it takes the echo to return will get steadily shorter.
question, students received feedback that was essentially the same information as that presented in the condition with the restudied isolated sentences. Thus, the key difference between the restudy isolated sentences condition and the repeated testing condition was that students attempted to retrieve the information in the latter condition before getting it to restudy. One week later, students took a final test that required application of each critical concept from the passages to a new inferential question from the same knowledge domain. Examples of materials are shown in Box 2. Figure 4 shows results for the final test. Interestingly, there was virtually no difference between the two repeated
[Figure 4 (bar chart). Y-axis: proportion correct on final test. X-axis: learning condition (re-study passages, re-study sentences, repeated test). Bar values: repeated test, .58; the two re-study conditions, .41 and .44.]
Figure 4. Proportion of correct responses on the final cued recall test as a function of initial learning condition. The retrieval practice (testing) conditions led to greater transfer relative to repeated restudying of whole passages or restudying of just the sentences containing the critical concepts. Error bars represent 95% confidence intervals. Figure adapted from [60].
study conditions even though studying the isolated sentences ostensibly allowed for more time to learn the critical concepts than studying the entire passage. This result fits well with the findings of other studies demonstrating that restudying provides limited benefits for retention (Figure 1) [61]. More importantly, repeated testing led to significantly better transfer than either repeated studying of the passages or repeated studying of the isolated sentences. This finding indicates that the mnemonic benefits of testing extend well beyond the retention of a specific response. In fact, a subsequent experiment in the same series showed that repeated testing produced better transfer relative to repeated studying on new inferential questions about different knowledge domains (e.g. applying knowledge about echolocation in bats to sonar in submarines), a situation that constitutes far transfer according to one definition [56].
Theories of the retrieval practice effects
Researchers have intensively studied the effects of retrieval practice and today we know much about conditions that produce the effect. However, theoretical understanding – or even proper theories of the effect – has lagged behind. One idea sometimes invoked to explain retrieval practice (testing) effects is that such practice simply permits reexposure to material and causes overlearning of the set of material that can be retrieved [62,63]. Many experiments have discredited this hypothesis by showing that equating the number of study events to test (retrieval) events does not eliminate the effect [6,64,65]. The data in Figure 2 also show that this idea must be wrong, because with number of retrievals equated at various levels, some conditions produced huge retrieval practice effects and others none at all. In general, theoretical explanations for retrieval practice (testing) effects have focused on how the act of retrieval affects memory.
One idea is that retrieval of information from memory leads to elaboration of the memory trace and/or the creation of additional retrieval routes, which makes it more likely that the information will be successfully retrieved again in the future [22,66,67]. A related idea invokes the notion of retrieval effort to explain the positive effects of retrieval practice [21,68]. Retrieval effort can be thought of as an index of the amount of reprocessing of the memory trace that occurs during retrieval: the more effort involved in retrieving the memory, the more extensive is the reprocessing (which presumably involves elaboration). As discussed above, retrieval practice that occurs under conditions in which information can be easily accessed (e.g. from short-term or working memory) leads to little or no benefit for long-term retention (Figure 2). Yet another explanation relies on the concept of transfer-appropriate processing [69,70], which holds that memory performance is enhanced to the extent that the cognitive processes during learning match those required during retrieval. The processes engaged by taking an initial test provide a better match with the final test than the processes involved in studying the material.
The new theory of disuse of Bjork and Bjork incorporates these ideas to provide a more formal explanation of retrieval practice effects [71]. The theory distinguishes between storage strength (relative permanence of the memory trace) and retrieval strength (momentary accessibility of a trace). For example, if a weak trace (in terms of storage strength) has recently been retrieved, its retrieval strength will be great for some time afterward. The theory proposes that positive effects of retrieval on storage strength are inversely related to retrieval strength; the greater the retrieval strength, the less is the effect of retrieval on storage strength. This idea would account for the fact that repeated retrieval just after study has little effect and other data such as those in Figure 2.
The theories above and others [72] are psychological ones at an abstract level of description. Mechanistic accounts of testing in neuroscientific terms await development. However, we can point to some promising leads. The concept of reconsolidation – the idea that retrieval of a memory places it into a labile state in which the trace can be enhanced or disrupted – has become a topic of considerable interest in neuroscience in the past 10 years [73,74]. The molecular cascade involved in reconsolidation [75,76] will doubtless be involved in explaining the mnemonic benefits of retrieval practice. Interaction between the hippocampus and dopaminergic neurons in the ventral tegmental area (VTA) might provide another piece of the puzzle [77]. When the hippocampus detects information that is relatively unfamiliar, the novelty signal causes firing of dopaminergic cells, which enhances long-term potentiation and thus learning. Retrieval practice might activate the hippocampal–VTA feedback loop, thereby strengthening connections between the neurons that form the memory trace for the retrieved information. However, this process would only occur when the information is relatively unfamiliar (perhaps having low retrieval strength, in terms used above [71]). These ideas are clearly speculative, but might point the way to a more mechanistic account of retrieval practice effects.
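The inverse relation between retrieval strength and gains in storage strength can be illustrated with a toy simulation. This is a hedged sketch, not the formal model in [71]: the class `Trace`, the exponential decay rule, the parameters `tau` and `gain`, and the assumption that every retrieval attempt succeeds are all choices made here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Trace:
    storage: float = 0.0    # relative permanence of the trace
    retrieval: float = 1.0  # momentary accessibility; 1.0 right after study

    def wait(self, minutes: float, tau: float = 5.0) -> None:
        """Let retrieval strength decay; assumed slower for well-stored traces."""
        self.retrieval *= math.exp(-minutes / (tau * (1.0 + self.storage)))

    def retrieve(self, gain: float = 1.0) -> None:
        """Successful retrieval: the storage gain shrinks as retrieval strength grows."""
        self.storage += gain * (1.0 - self.retrieval)
        self.retrieval = 1.0  # the trace is highly accessible again


def practise(gap_minutes: float, n_retrievals: int) -> float:
    """Storage strength after n retrievals separated by a fixed gap."""
    trace = Trace()
    for _ in range(n_retrievals):
        trace.wait(gap_minutes)
        trace.retrieve()
    return trace.storage


if __name__ == "__main__":
    for gap in (0.0, 1.0, 6.0):
        print(f"gap={gap} min, storage after 10 retrievals: {practise(gap, 10):.2f}")
```

In this sketch, ten retrievals spaced 6 min apart accumulate more storage strength than ten retrievals spaced 1 min apart, and massed retrievals (0-min gaps) add none. That mirrors the qualitative pattern in Figure 2 without reproducing its values.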
Educational implications
Retrieval practice produces greater long-term retention than studying alone. This finding suggests that testing, which is commonly conceptualized as an assessment tool, can be used as a learning tool as well [78]. In particular, practicing retrieval is beneficial when it requires effortful processing (e.g. production rather than recognition tests), it occurs multiple times with relatively long intervals between retrieval attempts, and it is followed by feedback after each attempt. Under these conditions, tests provide a highly effective means of learning. Educators sometimes decry this approach, which we have called test-enhanced learning [6,9], as involving nothing but drill and practice in which students engage in rote rehearsal. However, when used correctly, retrieval practice techniques help to foster deeper learning and understanding so that knowledge can be flexibly retrieved and transferred to new situations [57–60]. Studies on retrieval practice conducted in educational settings have shown that frequent testing produces substantial benefits to long-term retention [79]. For example, research has demonstrated that retrieval practice improves scores in college courses in biological psychology and statistics [80,81], as well as advanced medical education [82]. In addition, experiments in middle-school history, social studies and science classrooms have shown great
improvement in children’s knowledge derived from repeated quizzing on delayed tests [83–85]. Importantly, the tests used to measure long-term retention in some of these studies were the actual tests being given to the class for assessment purposes, not ones made up for the sake of an experiment. Testing at the university level provides an indirect benefit that complements the direct benefit that is discussed here. Many university courses require only one or two semester tests and a final exam, a practice that leads to the near universal phenomenon of students concentrating their study attempts just before the exams and not keeping up with the course [86,87]. Frequent quizzing (say, on a weekly or even a daily basis) forces students to stay current with the course by studying more regularly. Classroom studies have shown that students who received daily quizzes performed better than those who did not [81,88]. Importantly, survey questions given at the end of the semester revealed that the students who were frequently quizzed felt they had learned more and reported greater satisfaction with the course, despite (or perhaps because of) the greater effort they exerted [81,88]. In addition, the mnemonic benefits of testing extend beyond the specific information that is tested: retrieval practice can increase retention of related, but non-tested material as well [89–91]. Of course, retrieval practice need not occur only through quizzing or testing in the classroom. Retrieval practice can be implemented in many different ways, including self-testing (e.g. using flash cards, chapter-ending questions, or other methods).
Concluding remarks
The finding that retrieval practice yields substantial mnemonic benefits validates the quote from William James [1] at the outset: Students’ ‘active repetition’ via attempts to ‘recollect by an effort from within’ provides a much greater boost to retention than does ‘passive repetition’ from an outside source.
The research we reviewed makes five points. First, retrieval practice often produces superior long-term retention relative to studying for an equivalent amount of time. Second, repeated testing is better than taking a single test. Third, testing with feedback leads to greater benefits than does testing without feedback, but even the latter procedure can be surprisingly effective. Fourth, to place a caveat on the first three claims, testing under conditions that make retrieval easy (e.g. learning a face–name pair and being tested on it several times immediately) often has surprisingly little effect; some lag between study and test is required for retrieval practice to provide a benefit. Fifth, the mnemonic benefits of retrieval practice are not limited to the learning of a specific response, but rather produce knowledge that can be transferred to different contexts. Integration of retrieval practice into educational practices has the potential to boost performance in schools. Further research is required, however, to understand the mechanisms that give rise to the beneficial effects of retrieval practice.
Acknowledgements
The authors are supported by a Collaborative Activity Grant from the James S. McDonnell Foundation and a grant from the Cognition and Student Learning Program of the Institute of Education Sciences in the U.S. Department of Education.
References
1 James, W. (1890) The Principles of Psychology, Holt
2 Estes, W.K. (1960) Learning theory and the new ‘mental chemistry’. Psychol. Rev. 67, 207–223
3 Postman, L. (1963) One-trial learning. In Verbal Behavior and Learning: Problems and Processes (Cofer, C.N. and Musgrave, B.S., eds), pp. 295–335, McGraw-Hill
4 Rock, I. (1957) The role of repetition in associative learning. Am. J. Psychol. 70, 186–193
5 Underwood, B.J. and Keppel, G. (1962) One-trial learning? J. Verb. Learn. Verb. Behav. 1, 1–13
6 Roediger, H.L., III and Karpicke, J.D. (2006) The power of testing memory: basic research and implications for educational practice. Persp. Psychol. Sci. 1, 181–210
7 Abbott, E.E. (1909) On the analysis of the factors of recall in the learning process. Psychol. Monogr. 11, 159–177
8 Marsh, E.J. et al. (2007) The memorial consequences of multiple-choice testing. Psychonom. Bull. Rev. 14, 194–199
9 McDaniel, M.A. et al. (2007) Generalizing test-enhanced learning from the laboratory to the classroom. Psychonom. Bull. Rev. 14, 200–206
10 Pashler, H. et al. (2007) Enhancing learning and retarding forgetting: choices and consequences. Psychonom. Bull. Rev. 14, 187–193
11 Karpicke, J.D. and Roediger, H.L., III (2008) The critical importance of retrieval for learning. Science 319, 966–968
12 Carpenter, S.K. et al. (2008) The effects of tests on learning and forgetting. Mem. Cogn. 36, 438–448
13 Kuo, T. and Hirshman, E. (1996) Investigations of the testing effect. Am. J. Psychol. 109, 451–464
14 Roediger, H.L., III and Karpicke, J.D. (2006) Test-enhanced learning: taking memory tests improves long-term retention. Psychol. Sci. 17, 249–255
15 Toppino, T.C. and Cohen, M.S. (2009) The testing effect and the retention interval: questions and answers. Exp. Psychol. 56, 252–257
16 Wheeler, M.A. et al. (2003) Different rates of forgetting following study versus test trials. Memory 11, 571–580
17 Craik, F.I.M. and Watkins, M.J. (1973) The role of rehearsal in short-term memory. J. Verb. Learn. Verb. Behav. 12, 599–607
18 Pyc, M.A. and Rawson, K.A. (2007) Examining the efficiency of schedules of distributed retrieval practice. Mem. Cogn. 35, 1917–1927
19 Kornell, N. and Bjork, R.A. (2008) Optimising self-regulated study: the benefits – and costs – of dropping flashcards. Memory 16, 125–136
20 McDaniel, M.A. and Butler, A.C. (2010) A contextual framework for understanding when difficulties are desirable. In Successful Remembering and Successful Forgetting: Essays in Honor of Robert A. Bjork (Benjamin, A.S., ed.), pp. 175–199, Psychology Press
21 Pyc, M.A. and Rawson, K.A. (2009) Testing the retrieval effort hypothesis: does greater difficulty correctly recalling information lead to higher levels of memory? J. Mem. Lang. 60, 437–447
22 Carpenter, S.K. (2009) Cue strength as a moderator of the testing effect: the benefits of elaborative retrieval. J. Exp. Psychol. Learn. Mem. Cogn. 35, 1563–1569
23 Carpenter, S.K. and DeLosh, E.L. (2006) Impoverished cue support enhances subsequent retention: support for the elaborative retrieval explanation of the testing effect. Mem. Cogn. 34, 268–276
24 Butler, A.C. and Roediger, H.L., III (2007) Testing improves long-term retention in a simulated classroom setting. Eur. J. Cogn. Psychol. 19, 514–527
25 Hogan, R.M. and Kintsch, W. (1971) Differential effects of study and test trials on long-term recognition and recall. J. Verb. Learn. Verb. Behav. 10, 562–567
26 Wheeler, M.A. and Roediger, H.L., III (1992) Disparate effects of repeated testing: reconciling Ballard’s (1913) and Bartlett’s (1932) results. Psychol. Sci. 3, 240–245
27 Skinner, B.F. (1954) The science of learning and the art of teaching. Harv. Educ. Rev. 24, 86–97
28 Baddeley, A.D. and Wilson, B.A. (1994) When implicit learning fails: amnesia and the problem of error elimination. Neuropsychologia 32, 53–68
29 Landauer, T.K. and Bjork, R.A. (1978) Optimum rehearsal patterns and name learning. In Practical Aspects of Memory (Gruneberg, M.M. et al., eds), pp. 625–632, Academic Press
30 Skinner, B.F. (1953) Science and Human Behavior, Macmillan
31 Cull, W.L. (2000) Untangling the benefits of multiple study opportunities and repeated testing for cued recall. Appl. Cogn. Psychol. 14, 215–235
32 Cull, W.L. et al. (1996) Expanding understanding of the expanding-pattern-of-retrieval mnemonic: toward confidence in applicability. J. Exp. Psychol. Appl. 2, 365–378
33 Balota, D.A. et al. (2007) Is expanded retrieval practice a superior form of spaced retrieval? A critical review of the extant literature. In The Foundations of Remembering: Essays in Honor of Henry L. Roediger, III (Nairne, J.S., ed.), pp. 83–106, Psychology Press
34 Karpicke, J.D. and Roediger, H.L., III (2007) Expanding retrieval practice promotes short-term retention, but equally spaced retrieval enhances long-term retention. J. Exp. Psychol. Learn. Mem. Cogn. 33, 704–719
35 Logan, J.M. and Balota, D.A. (2008) Expanded vs. equal interval spaced retrieval practice: exploration of schedule of spacing and retention interval in younger and older adults. Aging Neuropsychol. Cogn. 15, 257–280
36 Roediger, H.L., III and Karpicke, J.D. (2010) Intricacies of spaced retrieval: a resolution. In Successful Remembering and Successful Forgetting: Essays in Honor of Robert A. Bjork (Benjamin, A.S., ed.), pp. 23–47, Psychology Press
37 Pashler, H. et al. (2003) Is temporal spacing of tests helpful even when it inflates error rates? J. Exp. Psychol. Learn. Mem. Cogn. 29, 1051–1057
38 Bangert-Drowns, R.L. et al. (1991) The instructional effect of feedback in test-like events. Rev. Educ. Res. 61, 213–238
39 Kulhavy, R.W. and Stock, W.A. (1989) Feedback in written instruction: the place of response certitude. Educ. Psychol. Rev. 1, 279–308
40 Pashler, H. et al. (2005) When does feedback facilitate learning of words? J. Exp. Psychol. Learn. Mem. Cogn. 31, 3–8
41 Butler, A.C. et al. (2008) Correcting a meta-cognitive error: feedback enhances retention of low confidence correct responses. J. Exp. Psychol. Learn. Mem. Cogn. 34, 918–928
42 Kang, S.H.K. et al. (2007) Test format and corrective feedback modulate the effect of testing on memory retention. Eur. J. Cogn. Psychol. 19, 528–558
43 Butler, A.C. et al. (2006) When additional multiple-choice lures aid versus hinder later memory. Appl. Cogn. Psychol. 20, 941–956
44 Roediger, H.L., III and Marsh, E.J. (2005) The positive and negative consequences of multiple-choice testing. J. Exp. Psychol. Learn. Mem. Cogn. 31, 1155–1159
45 Marsh, E.J. et al. (2009) Memorial consequences of answering SAT II questions. J. Exp. Psychol. Appl. 15, 1–11
46 Butler, A.C. and Roediger, H.L., III (2008) Feedback enhances the positive effects and reduces the negative effects of multiple-choice testing. Mem. Cogn. 36, 604–616
47 Kulik, J.A. and Kulik, C.C. (1988) Timing of feedback and verbal learning. Rev. Educ. Res. 58, 79–97
48 Butler, A.C. et al. (2007) The effect of type and timing of feedback on learning from multiple-choice tests. J. Exp. Psychol. Appl. 13, 273–281
49 Kulhavy, R.W. and Anderson, R.C. (1972) Delay-retention effect with multiple-choice tests. J. Educ. Psychol. 63, 505–512
50 Metcalfe, J. et al. (2009) Delayed versus immediate feedback in children’s and adults’ vocabulary learning. Mem. Cogn. 37, 1077–1087
51 Smith, T.A. and Kimball, D.R. (2010) Learning from feedback: spacing and the delay-retention effect. J. Exp. Psychol. Learn. Mem. Cogn. 36, 80–95
52 Cepeda, N.J. et al. (2006) Distributed practice in verbal recall tasks: a review and quantitative synthesis. Psychol. Bull. 132, 354–380
53 Cepeda, N.J. et al. (2008) Spacing effect in learning: a temporal ridgeline of optimal retention. Psychol. Sci. 19, 1095–1102
54 Melton, A.W. (1970) The situation with respect to the spacing of repetitions and memory. J. Verb. Learn. Verb. Behav. 9, 596–606
55 Madigan, S.A. (1969) Intraserial repetition and coding processes in free recall. J. Verb. Learn. Verb. Behav. 8, 828–835
56 Barnett, S.M. and Ceci, S.J. (2002) When and where do we apply what we learn? A taxonomy for far transfer. Psychol. Bull. 128, 612–637
57 Johnson, C.I. and Mayer, R.E. (2009) A testing effect with multimedia learning. J. Educ. Psychol. 101, 621–629
58 McDaniel, M.A. et al. (2009) The read–recite–review study strategy: effective and portable. Psychol. Sci. 20, 516–522
59 Rohrer, D. et al. (2010) Tests enhance the transfer of learning. J. Exp. Psychol. Learn. Mem. Cogn. 36, 233–239
60 Butler, A.C. (2010) Repeated testing produces superior transfer of learning relative to repeated studying. J. Exp. Psychol. Learn. Mem. Cogn. 36, 1118–1133
61 Callender, A.A. and McDaniel, M.A. (2009) The limited benefits of rereading educational texts. Contemp. Educ. Psychol. 34, 30–41
62 Slamecka, N.J. and Katsaiti, L.T. (1988) Normal forgetting of verbal lists as a function of prior testing. J. Exp. Psychol. Learn. Mem. Cogn. 14, 716–727
63 Thompson, C.P. et al. (1978) How recall facilitates subsequent recall: a reappraisal. J. Exp. Psychol. Hum. Learn. Mem. 4, 210–221
64 Glover, J.A. (1989) The “testing” phenomenon: not gone but nearly forgotten. J. Educ. Psychol. 81, 392–399
65 Carrier, M. and Pashler, H. (1992) The influence of retrieval on retention. Mem. Cogn. 20, 633–642
66 Bjork, R.A. (1975) Retrieval as a memory modifier: an interpretation of negative recency and related phenomena. In Information Processing and Cognition (Solso, R.L., ed.), pp. 123–144, Wiley
67 McDaniel, M.A. and Masson, M.E.J. (1985) Altering memory representations through retrieval. J. Exp. Psychol. Learn. Mem. Cogn. 11, 371–385
68 Gardiner, J.M. et al. (1973) Retrieval difficulty and subsequent recall. Mem. Cogn. 1, 213–216
69 Morris, C.D. et al. (1977) Levels of processing versus transfer-appropriate processing. J. Verb. Learn. Verb. Behav. 16, 519–533
70 Roediger, H.L., III et al. (2002) Processing approaches to cognition: the impetus from the levels of processing framework. Memory 10, 319–332
71 Bjork, R.A. and Bjork, E.L. (1992) A new theory of disuse and an old theory of stimulus fluctuation. In From Learning Processes to Cognitive Processes: Essays in Honor of William K. Estes (Vol. 2) (Healy, A. et al., eds), pp. 35–67, Erlbaum
72 Pavlik, P.I., Jr (2007) Understanding and applying the dynamics of test practice and study practice. Instruct. Sci. 35, 407–441
73 Dudai, Y. (2004) The neurobiology of consolidations, or, how stable is the engram? Annu. Rev. Psychol. 55, 51–86
74 Sara, S.J. (2000) Retrieval and reconsolidation: toward a neurobiology of remembering. Learn. Mem. 7, 73–84
75 Lee, J.L. et al. (2004) Independent cellular processes for hippocampal memory consolidation and reconsolidation. Science 304, 839–843
76 Nader, K. et al. (2000) Fear memories require protein synthesis in the amygdala for reconsolidation after retrieval. Nature 406, 722–726
77 Lisman, J.E. and Grace, A.A. (2005) The hippocampal–VTA loop: controlling the entry of information into long-term memory. Neuron 46, 703–713
78 Dempster, F.N. (1992) Using tests to promote learning: a neglected classroom resource. J. Res. Dev. Educ. 25, 213–217
79 Bangert-Drowns, R.L. et al. (1991) The instructional effect of feedback in test-like events. Rev. Educ. Res. 61, 213–238
80 McDaniel, M.A. et al. (2007) Testing the testing effect in the classroom. Eur. J. Cogn. Psychol. 19, 494–513
81 Lyle, K.B. and Crawford, N.A. Retrieving essential material at the end of lectures improves performance on statistics exams. Teach. Psychol. in press.
82 Larsen, D.P. et al. (2009) Repeated testing improves long-term retention relative to repeated study: a randomized, controlled trial. Med. Educ. 43, 1174–1181
83 Carpenter, S.K. et al. (2009) Using tests to enhance 8th grade students’ retention of U.S. history facts. Appl. Cogn. Psychol. 23, 760–771
84 Roediger, H.L., III et al. (submitted). Test-enhanced learning in the classroom: Long-term improvements from quizzing.
85 McDaniel, M.A. et al. Test-enhanced learning in a middle school science classroom: the effects of quiz frequency and placement. J. Educ. Psychol. in press.
86 Mawhinney, V.T. et al. (1971) A comparison of students studying-behavior produced by daily, weekly, and three-week testing schedules. J. Appl. Behav. Anal. 4, 257–264
87 Michael, J. (1991) A behavioral perspective on college teaching. Behav. Anal. 14, 229–239
88 Leeming, F.C. (2002) The exam-a-day procedure improves performance in psychology classes. Teach. Psychol. 29, 210–212
89 Chan, J.C.K. (2009) When does retrieval induce forgetting and when does it induce facilitation? Implications for retrieval inhibition, testing effect, and text processing. J. Mem. Lang. 61, 153–170
90 Chan, J.C.K. (2010) Long-term effects of testing on the recall of nontested materials. Memory 18, 49–57
91 Chan, J.C.K. et al.
(2006) Retrieval induced facilitation: initially nontested material can benefit from prior testing. J. Exp. Psychol. Gen. 135, 533–571 92 Gates, A.I. (1917) Recitation as a factor in memorizing. Arch. Psychol. 6, 1–104 93 Jones, H.E. (1923-1924) The effects of examination on the performance of learning. Arch. Psychol. 10, 1–70 94 Spitzer, H.F. (1939) Studies in retention. J. Educ. Psychol. 30, 641–656 95 Tulving, E. (1967) The effects of presentation and recall of material in free-recall learning. J. Verb. Learn. Verb. Behav. 6, 175–184
27
Review
Cognitive enhancement by drugs in health and disease
Masud Husain1 and Mitul A. Mehta2
1 UCL Institute of Cognitive Neuroscience and UCL Institute of Neurology, 17 Queen Square, London WC1N 3AR, UK
2 Department of Neuroimaging, Centre for Neuroimaging Sciences (PO89), Institute of Psychiatry, King’s College London, London SE5 8AF, UK
Attempts to improve cognitive function in patients with brain disorders have become the focus of intensive research efforts. A recent emerging trend is the use of so-called cognitive enhancers by healthy individuals. Here, we consider some of the effects – positive and negative – that current drugs have in neurological conditions and healthy people. We conclude that, to date, experimental and clinical studies have demonstrated relatively modest overall effects, most probably because of substantial variability in response both across and within individuals. We discuss biological factors that might account for such variability and highlight the need to improve testing methods and to extend our understanding of how drugs modulate specific cognitive processes at the systems or network level.

Uses of cognitive enhancement
In the last decade, pharmacological treatments aimed at improving cognitive function across a range of brain disorders have been explored and have even become established in clinical practice [1]. In developmental conditions such as attention deficit hyperactivity disorder (ADHD), drugs acting on the noradrenergic and dopaminergic systems, such as methylphenidate and atomoxetine, are now in widespread use [2–4]. For neurodegenerative disorders such as Alzheimer’s disease and Parkinson’s disease, acetylcholinesterase inhibitors (AChEIs) and memantine [an N-methyl-D-aspartate (NMDA) receptor antagonist] are now standard treatments [5–9]. In chronic mental disorders such as schizophrenia, cognitive deficits are a separable feature from positive (e.g. hallucinations and delusions) and negative (e.g. blunted affect, poverty of speech) symptoms, with current antipsychotic treatments having little, if any, impact on cognitive impairments. A wide range of compounds is therefore being assessed for cognitive enhancement in this disorder [10].
Similarly, attempts to ameliorate cognitive deficits following stroke are being actively explored [1,11–13], although none have been established.
Many such cognitive enhancers target neuromodulatory systems – cholinergic, dopaminergic, noradrenergic and serotonergic – ascending from brainstem nuclei to innervate both cortical and subcortical systems (Table 1). Although most of the reported positive effects of such drugs have been modest in magnitude overall and are highly variable across individuals, they have had an enormous impact, stimulating interest in cognitive enhancement not only for patients with brain disorders, but also for healthy individuals. Compounds such as methylphenidate and modafinil are used by students in pursuit of better grades, military personnel who need to remain awake for long missions, elderly individuals afraid of cognitive decline and even university academics keen to maintain their performance [14–17].
Here we focus on what aspects of cognition are enhanced, the magnitude of these effects and possible mechanisms underlying variations in response across individuals. Our aim is to highlight key common themes across studies of clinical populations and healthy individuals, using examples that highlight these principles. Other recent reviews provide excellent discussions of ethical issues in cognitive enhancement [18] and illustrate the complexity of physiological, cellular and computational mechanisms underlying such effects [19–22].

Corresponding author: Husain, M. ([email protected]).

Glossary
Acetylcholinesterase: enzyme that breaks down acetylcholine at synapses.
Cholinergic system: nervous system pathways that use acetylcholine as a neurotransmitter. This includes cholinergic neurons in the basal forebrain that project to the cerebral cortex.
COMT (catechol-O-methyltransferase): enzyme that degrades catecholamines, including dopamine, at synapses.
DAT (dopamine active transporter): membrane-spanning protein that pumps dopamine from the synapse back into the cell, thereby reducing its synaptic concentration.
Dementia with Lewy bodies (DLB): form of dementia characterized by the presence of Lewy bodies (consisting of α-synuclein and ubiquitin proteins), closely related to Parkinson’s disease with dementia (PDD).
Dopaminergic system: neurons that use dopamine as a neurotransmitter have cell bodies located in the midbrain. The mesolimbic pathway and mesocortical pathway originate in the ventral tegmental area to innervate the limbic system and cerebral cortex, respectively, whereas the nigrostriatal pathway projects from the substantia nigra to innervate the caudate and putamen.
Glutamate: ionized form of the amino acid glutamic acid; acts as an excitatory amino acid transmitter.
Heteroreceptors: receptors on axons that are specific for neurotransmitters released by other cells at axon–axon synapses.
Histaminergic system: neurons that use histamine as a neurotransmitter have cell bodies in the hypothalamus and project to brain regions including the cerebral cortex.
NMDA receptor: class of glutamate receptors activated by N-methyl-D-aspartate.
Noradrenergic system: neurons that use noradrenaline as a neurotransmitter project from cell bodies in the locus coeruleus in the pons to innervate the cerebral cortex.
Nucleus accumbens: part of the basal ganglia. Its inputs include dopaminergic neurons from the ventral tegmental area via the mesolimbic pathway.
Serotonergic system: neurons that use serotonin as a neurotransmitter project from cell bodies in the brainstem (notably in the raphe nucleus) to the cerebral cortex.
Working memory: process whereby information is held in mind for brief periods.
1364-6613/$ – see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.tics.2010.11.002 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
Table 1. Summary of the effects of some drugs frequently used as cognitive enhancers
Cognitive enhancer | Neuromodulatory mechanism | Cognitive functions improved | Known brain systems most affected | Currently recommended clinical use
Methylphenidate, amphetamine | Dopamine and noradrenaline reuptake inhibitors | Response inhibition, working memory, attention, vigilance | Frontoparietal attentional systems, striatum, default mode networks | ADHD, wake-promoting agent
Caffeine | Non-selective adenosine receptor antagonist | Vigilance, working memory, incidental learning | Frontal lobe attentional systems | –
Nicotine | Nicotinic cholinergic receptor agonist | Working memory, episodic memory, attention | Frontoparietal attentional systems, medial temporal lobe, default mode networks | –
Modafinil | Unknown, but effects on dopamine, noradrenaline and orexin systems proposed | Working memory, episodic memory, attention | Frontal lobe attentional systems | Wake-promoting agent
Atomoxetine, reboxetine | Noradrenaline reuptake inhibitors | Response inhibition, working memory, attention | Frontoparietal attentional systems | ADHD, depression
Donepezil, galantamine, rivastigmine (AChEI) | Blocks enzymatic breakdown of acetylcholine | Episodic memory, attention | Frontal lobe attentional systems | Alzheimer’s disease, PDD, DLB
Memantine | Noncompetitive, low-affinity, open channel blocker of the NMDA receptor | Episodic memory, attention | Frontal and parietal lobe | Alzheimer’s disease

What is enhanced?
What exactly do cognitive neuromodulators do? It might be tempting to assume a selective one-to-one mapping between a specific neurotransmitter system and a particular cognitive function. For example, dopamine has been strongly linked with working memory (WM) and attention [19], whereas serotonergic drugs have been prominently associated with affective processes [23,24]. However, serotonergic modulation can also influence WM [25], as can noradrenaline and acetylcholine. Conversely, dopamine influences affective processing [26,27]. A simple mapping between a specific neurotransmitter and a particular cognitive function described at a very general level – such as WM – therefore seems untenable. However, subtle but important differences in the precise processes modulated might provide some discriminating value: for instance, dopamine has an established role in reinforcement learning in response to rewards [28,29], whereas serotonin seems to modulate reinforcement learning for aversive stimuli [20,23].
To add to the complexity, neurotransmitters act via a suite of different receptor systems. Thus, dopamine acting at D1 receptors can have very different – even opposing – effects to those of its actions at D2 receptors [19,30]; for serotonin there are 17 different receptor systems. In addition, dopamine can have very different effects at different brain regions, even within different regions of the human basal ganglia [31]. Its release can also be modulated in a highly specific regional manner by other neurotransmitters, such as glutamate within the nucleus accumbens [32]. Thus, interactions between neuromodulatory systems are also a probable mechanism by which some of their effects are modulated. For instance, dopamine, noradrenaline and acetylcholine release is under histaminergic H3 heteroreceptor control [33], whereas noradrenaline and dopamine can interact to modulate spatial WM neuronal responses in prefrontal cortex in a synergistic fashion [19,21]. Again, these considerations suggest that simple conceptualizations linking a specific neurotransmitter to a single cognitive function are unlikely to be helpful.
Finally, there is increasing evidence that several neurotransmitters might have different modes of action when released in a tonic, sustained manner compared to phasic release [29,34,35]. For instance, baseline firing of noradrenergic cells in the locus coeruleus varies with different states of alertness or arousal. Optimal responses to environmentally important events seem to be linked to phasic firing of these cells, but this occurs only when tonic levels of activity are moderate [35]. Thus, alteration of global concentrations of a neurotransmitter might modulate the ability to respond to external events mediated by phasic firing. How do drugs currently used as enhancers produce their beneficial effects? Is it through multiple effects on several different cognitive processes or do they enhance one cognitive mechanism – such as arousal or improved sustained attention – through which they lead to better performance across a battery of tests? For studies in clinical populations, the difficulty is that many standard cognitive test batteries used in clinical trials are very unlikely to be sensitive enough to answer questions on the specificity of cognitive modulation (Box 1). For example, AChEIs such as rivastigmine and donepezil are now widely used to treat Parkinson’s disease dementia (PDD) and the related condition of dementia with Lewy bodies (DLB). Many clinical trials have reported modest global beneficial effects of such drugs on bedside cognitive screening tests [5–7]. More detailed assessment using sensitive computerized cognitive tests has revealed widespread improvements in the domains of attention, WM and episodic memory [36–38]. However, these positive effects of AChEIs might all be mediated via a common process such as elevated arousal [39,40]. In fact, the very same issue pertains to the modulatory effects of AChEIs in healthy subjects [41]. 
For example, in young volunteers, donepezil improves episodic memory, whereas healthy elderly subjects show improvements in verbal memory [42]. Is it possible that these effects could be due simply to a generalized improvement in arousal? Studies demonstrating that donepezil attenuates decline in short-term memory and visual attention induced by sleep deprivation [43,44] raise the possibility that this might indeed be the case.
Box 1. Measurement of cognitive enhancement in clinical trials
In clinical studies of neurodegenerative conditions – such as Alzheimer’s disease, Parkinson’s disease with dementia (PDD), dementia with Lewy bodies (DLB) and vascular dementia – the gold standard outcome measure has become the ADAS-Cog (Alzheimer’s Disease Assessment Scale) [91]. This is a relatively short battery of cognitive tests covering memory, orientation, language, visual construction and limb praxis skills measured on a 70-point scale. Drugs approved for use in these clinical conditions have demonstrated efficacy in changing this measure in the context of a randomized controlled trial (RCT), in which patients are randomly assigned either to drug or to placebo. Many trials have also revealed changes in CIBIC-plus (Clinician’s Interview-Based Impression of Change) [92], ADAS-CGIC (Alzheimer’s Disease Assessment Scale-Clinical Global Impression of Change) [92] or Neuropsychiatric Inventory (NPI) scores [93]. These scoring systems attempt to capture more global function or psychiatric effects of drug interventions. For example, the CIBIC-plus is a semi-structured instrument that attempts to evaluate four areas: general, cognitive and behavioural functions and activities of daily living, based on the clinician’s observations of the patient at interview, together with information supplied by a caregiver. By contrast, the NPI evaluates delusions, hallucinations, dysphoria, anxiety, agitation or aggression, euphoria, disinhibition, irritability or lability, apathy, aberrant motor activity, and night-time behaviour disturbances. It also relies on a structured interview with a caregiver who is familiar with the patient.
The problem with such scoring systems is that they are relatively crude and subjective. Many of them were developed for Alzheimer’s disease and might not be as appropriate for other neurodegenerative conditions or for individuals performing in the normal range, but at risk of developing Alzheimer’s disease. For example, fluctuations in attention or vigilance are a prominent feature of PDD and DLB whereas impairments in speed of information processing are common in vascular dementia. These aspects of cognition are not measured well by batteries such as ADAS-Cog. Such scoring systems also often lack dynamic range and can be affected by ceiling or floor effects.
Alternative measures comprising computerized batteries have therefore been used [36,94]. These can give more sensitive cognitive indices and reaction time measures can avoid saturation effects. However, they might be time-consuming to perform and require some degree of expertise to administer and interpret.
Similar issues also pertain to treatment studies of developmental disorders such as ADHD. Here, rating scales are also used as outcome measures, with trials showing relatively modest effects compared to placebo [90,95]. In ADHD too, experimental measures using reaction time indices, for example to assay response inhibition using the STOP signal reaction time task, might be more sensitive measures of the efficacy of drug interventions [96,97].

Similar considerations to those for AChEIs also apply to modafinil, which has become a popular drug for cognitive enhancement in healthy individuals. Although its precise mechanism of action remains to be established, modafinil is used as a wake-promoting agent for the treatment of narcolepsy, a disorder associated with excessive daytime somnolence. Analysis of the effects of modafinil in healthy subjects has revealed improvements in attention, memory and executive function in sleep-deprived individuals [17]. However, this might simply be due to improved wakefulness or arousal induced by the drug [17], just as caffeine can improve performance on a variety of measures, including vigilance, and on incidental learning and WM tests [45]. However, it is also important to appreciate that ‘arousal’ need not be a unitary process: there is evidence of different arousal systems that might be selectively modulated by different types of pharmacological intervention [46].
It is possible that neuroimaging studies might contribute to identification of the mechanisms underpinning improvement on cognitive tests, including arousal. Although early studies assessed changes in brain activity on drug administration [47–49], more recent investigations have begun to examine the modulatory effect of compounds on brain networks. For example, the beneficial effects of reboxetine on visuomotor control are associated with strengthening of coupling between selected posterior and anterior regions of the right hemisphere (Figure 1) [50]. Approaches to characterize the effects of drugs at a network level in brain disorders are also being applied in patient groups (Box 2).
Finally, it is also crucial to appreciate that non-cognitive factors such as alterations in mood, anxiety, motivation or apathy induced by a drug can have indirect effects on cognition. Hence, it is useful to control for these factors if at all possible.

[Figure 1 graphic: panel (b) is a coupling diagram linking PMC, SMA, FEF, M1, IPS and V1 in the left (L) and right (R) hemispheres, with connections marked as significant increase or significant decrease.]
Figure 1. Network effects of reboxetine in visuomotor control. (a) The noradrenaline reuptake inhibitor reboxetine improved visuomotor control in healthy volunteers and increased cortical activity in the right intraparietal sulcus (IPS), frontal eye field (FEF) and primary visual cortex (V1). (b) Dynamic causal modelling demonstrated enhanced coupling between these regions when participants were on reboxetine (adapted, with permission, from [50]).

Box 2. Neuroimaging of drug effects
Although a great deal of work has been performed on furthering our understanding of the actions of several cognitive enhancers at a cellular level, it is likely that this level of explanation will be insufficient, on its own, to account for the effects of drugs on cognitive performance in both healthy humans and those with brain disorders. Instead, more insight might be obtained from an understanding of the modulatory effects of drugs on large-scale brain networks underlying cognitive skills at the systems level. Early studies demonstrated how drugs such as AChEIs and methylphenidate might modulate visual attention and WM via effects on parietal, frontal and extrastriate occipital regions [47–49]. More recent investigations have focused on the effects of drugs on functional connectivity across a brain network. For example, reboxetine, a noradrenergic reuptake inhibitor, improved performance on a visuomotor task, an effect that was associated with enhanced effective connectivity between right hemisphere parietal and frontal regions, as well as their influences on left hemisphere regions [50]. Such approaches have also been used to examine more challenging effects, such as that of modafinil on the noradrenergic locus coeruleus, a very small nucleus located in the pons [98].
A different approach, applied to clinical populations, has been to examine brain metabolic network deficiencies associated with neurodegenerative conditions, such as motor and cognitive deficits in PD, using fluorodeoxyglucose PET [99]. Researchers have also started to use this methodology to investigate the effects of treatment at the network level, raising the possibility of producing a network-level account of how a drug might modulate function in a particular brain disorder. Importantly, different neurodegenerative diseases seem to have characteristically different effects on the resting-state functional connectivity across brain network nodes, as indexed by fMRI [100].

How effective are the benefits?
A major issue in assessing cognitive enhancement studies is the problem of effect size. First, in studies of healthy subjects, there is no universal, standard battery of tests that has been agreed on, so comparisons across studies are not easy. It is not possible to compare effect sizes for different drugs if the tests used differ in the level of difficulty or method of measurement (e.g. reaction time vs error rate). Overall, however, the effects of cognitive enhancers such as methylphenidate, modafinil and AChEIs in healthy individuals seem to be quite modest according to recent systematic reviews [17,41]. Second, many experimental investigations in healthy subjects have used single-dose assessments aimed primarily at assessing mechanisms rather than establishing optimal cognitive enhancement. Very few studies have examined the effects of repeated doses or long-term effects, which might be far more revealing and representative of the overall costs and benefits of taking cognitive enhancers on a regular basis. Third, as we have seen, although clinical trials in patients often use standardized bedside batteries, they might be hampered by their insensitivity and limited range of measurement (Box 1).
Nevertheless, even for these relatively crude measures, studies in clinical populations have revealed significant effects of long-term drug use that have led to changes in practice. For example, one of the remarkable changes in the management of neurological conditions in the last decade has been the advent of treatment for cognitive deficits in neurodegenerative conditions, initially in Alzheimer’s disease with AChEIs [7]. These studies stimulated clinical trials in other conditions such as PDD and DLB, with two major placebo-controlled studies involving over 650 patients demonstrating significant positive effects of the AChEI rivastigmine on cognition and neuropsychiatric measures such as apathy, anxiety and visual hallucinations [5,6].
Although these trials have now led to widespread clinical use of rivastigmine, it is important to keep the effect size in perspective. In the larger study, rivastigmine produced only a mean 2-point improvement on the ADAS-Cog battery (Box 1), which has a 70-point range [6]. Similar degrees of change have been observed in Alzheimer’s disease and vascular dementia trials with AChEIs (Figure 2a). Of course, effect sizes vary across individual patients. Indeed, 40–80% of PDD or DLB patients might not show a response to treatment on such clinical measures, but other individuals show a very strong improvement [5,6]. Overall, therefore, this means that positive effects have been moderate, at best, when results are examined at the group level – at least using this currently accepted method for measuring cognition in neurodegenerative clinical trials. Similar conclusions have been reached in schizophrenia, for which there is currently no established treatment for cognitive enhancement [10]. Thus, interindividual variability might be one potential reason for small overall effect sizes (see below).
By contrast, a first glance might indicate far more substantial effect sizes in treatment trials of ADHD, for which several drugs that target the catecholaminergic system are used in clinics. For example, a recent study using high levels of the α2 noradrenergic agonist guanfacine demonstrated a 12-point mean improvement compared to placebo on a rating scale with a range of 54 points (Figure 2b). However, these effects were based on ratings by parents or caregivers, and not on cognitive tests. These might be very valid measures to rate the behavioural effects of a drug, but the point is that when considering effect size it is crucial to bear in mind the nature of the assessments. It is also important to question whether there might be negative effects of taking a compound.

The downside of cognitive enhancers
Like all drugs, those used with the aim of enhancing cognition can have side effects via body systems other than the brain.
Thus, both AChEIs and methylphenidate frequently cause gastrointestinal upset or nausea, sometimes leading patients to discontinue medication altogether. These effects have the potential to offset any positive effects of the drug on overall performance, and also need to be borne in mind by anyone contemplating use of such drugs for non-medicinal purposes. More important from a cognitive neuroscience perspective is the ability of some drugs to impair certain aspects of cognition while simultaneously enhancing others in the same individual.
[Figure 2 graphic: (a) ADAS-cog change (improvement versus deterioration) at week 6, 3 months and 6 months; (b) ADHD-RS-IV total score mean change from baseline (SD) for placebo and GXR 0.01–0.04, 0.05–0.08, 0.09–0.12 and 0.13–0.17 mg/kg, with bar labels −12.2, −17.6, −20.0, −23.5 and −24.8.]
Figure 2. Effect sizes of cognitive enhancers in clinical studies. (a) Overall change in ADAS-Cog scores over 6 months on an AChE inhibitor in probable vascular dementia and Alzheimer’s disease patients (black circles) compared to patients on placebo (white circles) (adapted, with permission, from [89]). (b) Improvements in ADHD Rating Scale IV with guanfacine at different doses versus placebo over 9 weeks in children and adolescents with ADHD (adapted, with permission, from [90]).
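To keep the two headline numbers in perspective, the mean changes quoted in the text can be expressed as a fraction of each scale's range. The following is only an illustrative sketch: the helper name `fraction_of_range` is hypothetical, and the ratio is not a standardized effect size (which would need the standard deviations, not given here). The two scales also measure different things (a cognitive battery versus caregiver ratings), so the comparison is descriptive only.

```python
def fraction_of_range(mean_change: float, scale_range: float) -> float:
    """Express a mean change as a fraction of the full range of the scale.

    Hypothetical helper for illustration; not a standardized effect size.
    """
    if scale_range <= 0:
        raise ValueError("scale_range must be positive")
    return abs(mean_change) / scale_range


# Values quoted in the text: a 2-point change on the 70-point ADAS-Cog
# (rivastigmine in PDD) and a 12-point change versus placebo on a
# 54-point ADHD rating scale (guanfacine).
adas_cog = fraction_of_range(2, 70)
adhd_rating = fraction_of_range(12, 54)

print(f"ADAS-Cog change as share of range: {adas_cog:.1%}")
print(f"ADHD rating change as share of range: {adhd_rating:.1%}")
```

The larger share for the ADHD rating scale reflects parent or caregiver ratings rather than cognitive test performance, which is why the nature of the assessment matters when reading such figures.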
Thus, rivastigmine in healthy elderly subjects can improve learning on a motor task and making associations between symbols and digits, but can at the same time impair verbal and visual episodic memory [51]. Similarly, the dopamine agonist bromocriptine can enhance spatial WM while simultaneously impairing probabilistic reversal learning in young participants [52]. This finding echoes results in patients with PD: dopaminergic medication improves their performance on WM and task-set switching tasks, but degrades reversal learning [53,54]. It has been hypothesized that such opposing effects are due to ‘overdosing’ of ventral striatal areas involved in the latter, but replenishment of dopamine in dorsal striatal areas required for the former [53,55]. Thus, doses of dopaminergic medication sufficient to ameliorate motor function and some aspects of cognition in PD have the potential to worsen others.
Indeed, this conclusion might well be applicable to recent reports that some PD patients on dopaminergic agonists developed impulsive behaviours such as gambling, compulsive shopping and hypersexuality [56,57]. It has been reported that such behaviour in PD is often associated with the presence of dyskinesias, involuntary movements due to excessive dopaminergic stimulation [58], consistent with the notion that such impulse control disorders might indeed be associated with ‘overdosing’ of some basal ganglia regions. Importantly, reducing the dose of dopaminergic drugs often leads to reductions in impulsivity. These findings show that dopamine agonists in PD can have a spectrum of effects, both beneficial and harmful, on cognition and behaviour.

Who benefits from cognitive enhancers?
A major theme that has emerged from studies of neurological patient groups is that there is a great variability of response, with many individuals not responding to treatment on (relatively crude) clinical measures, whereas others show a very strong improvement, for example in response to AChEIs [5,6]. Thus, although this group of patients demonstrates a modest average cognitive change overall, the effect is likely to be diluted by the fact that many individuals show very little benefit.
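The dilution argument can be made concrete with simple arithmetic. The sketch below assumes, purely for illustration, that non-responders show exactly zero change, and it combines the 2-point group mean and the 40–80% non-response range quoted earlier even though the source does not report them as a single analysis. The function name `responder_mean` is hypothetical.

```python
def responder_mean(group_mean: float,
                   nonresponder_fraction: float,
                   nonresponder_mean: float = 0.0) -> float:
    """Mean change among responders implied by a diluted group mean.

    Uses group_mean = p * nonresponder_mean + (1 - p) * responder_mean,
    where p is the non-responder fraction. The zero default for
    non-responders is an illustrative assumption, not a reported value.
    """
    if not 0.0 <= nonresponder_fraction < 1.0:
        raise ValueError("nonresponder_fraction must be in [0, 1)")
    responder_fraction = 1.0 - nonresponder_fraction
    diluted_part = nonresponder_fraction * nonresponder_mean
    return (group_mean - diluted_part) / responder_fraction


# A 2-point group mean with 40% versus 80% non-responders.
for fraction in (0.4, 0.8):
    implied = responder_mean(2.0, fraction)
    print(f"{fraction:.0%} non-responders -> responders average {implied:.1f} points")
```

Under these assumptions the same modest group average is compatible with a several-fold larger change in the subgroup that does respond, which is the sense in which the average effect is diluted.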
The same issue has arisen in investigations in healthy individuals: some subjects respond, whereas others might show little or no benefit. As we discuss below, recent investigations have begun to question whether such differences in outcome might depend on genotype and/or the baseline level of cognitive function. These considerations also raise concerns about what has become the standard method of performing clinical drug trials. Large-scale randomized controlled trials offer protection from false positive findings, but they also have the potential to discard the fact that some subgroups might benefit from a compound, whereas others might not. What might be the cause of such variations in response? Several studies on the effects of dopaminergic drugs on WM in healthy volunteers support the conclusion that those who benefit most are low performers, such as those with low WM capacity or span. Thus, methylphenidate or dopamine receptor agonists such as bromocriptine improve WM updating or retrieval in people who were low performers on study entry, but can actually impair performance in participants with high baseline WM spans [47,59–62]. One possible explanation for such contradictory effects might reside in the classic inverted U-shaped relationship between cognitive performance and dopamine receptor (particularly D1 receptor) stimulation (Figure 3). Such effects have been known for a long time, with investigations in experimental animals revealing that both low and excessively high levels of D1 receptor stimulation in the prefrontal cortex can impair WM [63–65]. For optimal performance, a baseline level between these two extremes is required. However, until recently, direct evidence in favour of this concept has been lacking in humans. New findings reveal that dopamine synthesis capacity in the caudate nucleus of the basal ganglia is lower in individuals with low WM spans compared to those with high spans [66]. 
Participants in this study were also investigated after taking bromocriptine or placebo. Ability to update reward predictions on a reversal learning task was improved by bromocriptine far more in individuals with low baseline dopamine synthesis capacity in the basal ganglia. Indeed, high-synthesis subjects were actually impaired in their performance [67].
Trends in Cognitive Sciences January 2011, Vol. 15, No. 1
[Figure 3 graphic not reproduced. Panels (a) and (b) each plot effect against drug concentration; panel (b) shows two functions, F1 and F2, with the baseline and the new drug level marked.]
Figure 3. Drug concentrations modulate cognition across and within individuals. (a) Evidence from animal studies suggests that modulation of a cognitive process, such as spatial working memory, by a neurotransmitter such as dopamine might be described by an inverted U-shaped function. Too low or too high a concentration of dopamine in prefrontal cortex might not produce optimal functional effects. If an individual has low baseline concentrations of dopamine, small increases in concentration might help to improve performance (red circle). However, individuals with a higher baseline concentration of dopamine (green circle) might actually suffer an impairment of function on introduction of a drug. (b) Two cognitive processes within the same individual might have differential drug sensitivity (compare functions F1 and F2). In this case, this individual performed nearly optimally on cognitive function F1 but relatively poorly on function F2 (green circles) before drug administration. Administration of the drug led to an increase in neurotransmitter concentration from baseline levels. At the new drug level (dashed vertical line) performance on function 1 might theoretically decrease, whereas cognitive function 2 might now be optimized (yellow circles).
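The inverted-U account in Figure 3 can be made concrete with a toy calculation. The sketch below is purely illustrative: the Gaussian-shaped curve, the optimum, the width, the baseline values and the function names are all assumptions chosen for the example, not quantities taken from the studies cited. It shows only that the same drug-induced increase can improve performance from a low baseline and impair it from a high one (panel a), and can have opposite effects on two functions with different optima in the same individual (panel b).

```python
import math


def performance(level, optimum=1.0, width=0.5):
    """Toy inverted-U curve: performance peaks when level equals optimum.

    The Gaussian shape and its parameters are assumptions for illustration.
    """
    return math.exp(-((level - optimum) ** 2) / (2.0 * width ** 2))


def drug_effect(baseline, increase, optimum=1.0, width=0.5):
    """Change in performance when a drug raises the level by `increase`."""
    return (performance(baseline + increase, optimum, width)
            - performance(baseline, optimum, width))


# Panel (a): same drug-induced increase, different baseline levels.
low_baseline_change = drug_effect(baseline=0.5, increase=0.4)
high_baseline_change = drug_effect(baseline=1.0, increase=0.4)

# Panel (b): same individual and increase, two functions with different optima.
f1_change = drug_effect(baseline=1.0, increase=0.4, optimum=1.0)  # near optimal before drug
f2_change = drug_effect(baseline=1.0, increase=0.4, optimum=1.4)  # suboptimal before drug

print(f"low baseline:  {low_baseline_change:+.3f}")
print(f"high baseline: {high_baseline_change:+.3f}")
print(f"function F1:   {f1_change:+.3f}")
print(f"function F2:   {f2_change:+.3f}")
```

With these assumed parameters the low-baseline case and function F2 show a positive change, whereas the high-baseline case and function F1 show a negative one. Any other single-peaked curve would give the same qualitative pattern.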
More recently, it was demonstrated using radioligand positron emission tomography (PET) imaging that individuals with low levels of dopamine release induced by methylphenidate improved on a reversal learning task [31]. By contrast, participants with larger dopamine release in the caudate nucleus were impaired by the drug. Importantly, the authors also found that the most impulsive subjects (as indexed by their score on an impulsivity scale) were more likely to improve with methylphenidate. Thus, both baseline trait impulsivity and methylphenidate-induced dopamine release affected response to drugs. The effects of methylphenidate on spatial WM in healthy subjects are also most prominent in individuals with the lowest performance [47]. In ADHD it has similarly been reported that children with the poorest sustained attention or highest baseline motor activity are most likely to respond to methylphenidate treatment [68].

The effects of baseline performance might also be evident for cholinergic modulation: whereas beneficial effects of donepezil on cognitive function were evident in healthy participants whose performance declined after sleep deprivation, those who were not much affected by sleep loss tended to deteriorate after donepezil intake [43,44]. Modafinil also seems to have the most prominent cognitive effects on attention and WM in subjects who have low baseline performance [69,70]. Interestingly, recent studies using magnetic resonance spectroscopy suggest that levels of GABA in specific brain regions predict differences in individual performance on cognitive tasks [71,72]. Thus, one reason for baseline performance modulation of response to drugs might be the baseline level of a neurotransmitter in a critical brain region or network.

Effects of genotype on response to drugs
Genetic predictors of individual variability in response to treatments aimed at improving cognitive function would clearly be beneficial in effective targeting of therapeutic strategies.
These effects might result directly from variations in efficiency of drug targets or indirectly via metabolic pathways or other risk genes. Several studies have suggested a role for polymorphisms in the catechol-O-methyltransferase (COMT) enzyme-coding region on chromosome 22 in WM [73]. COMT degrades catecholamines, including dopamine, at the synapse. Polymorphisms of the COMT gene seem to be associated with variability in human WM performance and associated brain activity, presumably via their putative influence on cortical dopamine levels [73]. Amphetamine responses might interact with COMT activity. When performing a test of cognitive flexibility – the Wisconsin Card Sorting Test – those with the higher-activity COMT Val-Val genotype improved, whereas those with the lower-activity Met-Met genotype deteriorated after a single dose of amphetamine. An inverted-U relationship between predicted cortical dopamine levels and performance is consistent with these findings (Figure 3).

Variations in COMT and the dopamine transporter gene (DAT) are both obvious candidates for modulation of response to psychomotor stimulant treatment in a condition such as ADHD. DAT is a major target of methylphenidate and amphetamine, and many treatments for ADHD, including the noradrenaline transporter inhibitor atomoxetine, are thought to increase cortical dopamine levels [74], consistent with a role for COMT. An association between good clinical response to methylphenidate and carriers of the high-activity Val polymorphism also suggests a role for cortical dopamine in mediating treatment response [75,76]. However, the influence of variable number of tandem repeats in the DAT gene on methylphenidate response seems to be mixed [77–79].

Apolipoprotein E4 (apoE4), an allele of apolipoprotein E, which is involved in lipoprotein processing in cells, increases the risk of developing dementia later in life.
Perhaps paradoxically, young healthy carriers of this genotype, who have a higher risk of cognitive decline later in life, actually show better performance on decision-making and prospective memory tasks compared to their apoE3 counterparts [80]. Moreover, nicotine – but not dopaminergic drugs – potentiates the advantage in apoE4 carriers, producing greater cognitive benefits in these individuals than in apoE3 carriers on these tasks [80]. The reasons for this are unclear, but the findings suggest that some genetic variations influence the integrity of specific neurotransmitter systems, limiting the potential to improve function in response to drugs acting on the same systems.

For the AChEIs, extensive metabolizers of drugs as defined by gene variations in cytochrome P450 (a family of degradative enzymes) might show greater response to donepezil and rivastigmine [81,82]. This has been demonstrated using the Mini Mental State Examination (MMSE), which is a relatively crude bedside test of cognition; selective cognitive tasks have not been used to elucidate process-specific advantages.

Drug effects and behavioural training
One area that is likely to develop in cognitive enhancement research is investigation of the interaction between drugs and behavioural approaches to improve cognition. There has been a great deal of recent interest in the potential for cognitive training, for example on WM tasks, to improve performance not only on these paradigms but also to generalize to other tasks in healthy people, as well as those with brain conditions such as ADHD [83,84]. fMRI studies in healthy participants have revealed alterations in activity across parietal and frontal regions during such training [85]. Intriguingly, radioligand PET imaging demonstrated associated changes in dopamine D1 receptor binding in parietal and frontal areas [86]. Thus it might be possible to visualize alterations in neurotransmitter systems as a function of cognitive training using brain imaging. An important question for future studies will be whether there can be synergistic effects of behavioural training and cognitive-enhancing drugs.
Such synergism has been demonstrated for learning of new material and levodopa in healthy subjects [87]. Whether such combined intervention might also be useful for cognitive deficits in brain disorders has yet to be explored in detail. However, there is emerging evidence of such effects. For example, both memantine and speech therapy improved dysphasia in stroke patients, but the combination of the two led to enhanced outcomes [88]. Demonstrations of network-level interactions for drug and cognitive training in this type of context would be an important way to investigate the mechanisms underlying such synergistic effects. Taking the effects of genotype, baseline cognitive performance and the nature of brain disorder in patients into account is likely to be an important factor in understanding such synergies.

Concluding remarks
It would probably be fair to say that we are still in the first generation of studies to examine the potential for cognitive enhancement in humans. In both healthy individuals and many patient groups, the overall effects of drugs generally seem to be modest. However, there is evidence that there might be more significant effects in subgroups, such as those whose baseline performance is poorest or individuals with a particular genotype. Moreover, new drugs aimed at enhancing the phasic response of neurotransmitter systems, such as direct nicotinic agonists for the cholinergic system [34], might prove to have greater effects than existing modulators that globally increase levels of a neurotransmitter in a tonic fashion. The neurobiology underpinning the effects of cognitive enhancers and the mechanisms that determine responsiveness across individuals promise to be the focus of research in health and brain disorders in the future (Box 3).

Box 3. Questions for future research
Can we improve outcome measures used to study the effects of drugs in clinical populations?
Is it possible to improve prediction of which healthy individuals or patients might respond to a particular cognitive-enhancing drug?
How do different neurotransmitter systems interact to modulate a particular cognitive function? Animal studies provide evidence of such interactions but there are few investigations in humans.
Is there a role for combined treatments for neurological disorders, targeting different neurotransmitter systems?

Acknowledgments
We thank Dr Alex Leff for helpful comments on the manuscript. This work was funded by The Wellcome Trust and the NIHR CBRC at UCL/UCLH.
References
1 Parton, A. et al. (2005) Neuropharmacological modulation of cognitive deficits after brain damage. Curr. Opin. Neurol. 18, 675–680
2 López, F.A. (2006) ADHD: new pharmacological treatments on the horizon. J. Dev. Behav. Pediatr. 27, 410–416
3 Findling, R.L. (2006) Evolution of the treatment of attention-deficit/hyperactivity disorder in children: a review. Clin. Ther. 30, 942–957
4 Tucha, O. et al. (2006) Methylphenidate-induced improvements of various measures of attention in adults with attention deficit hyperactivity disorder. J. Neural. Transm. 113, 1575–1592
5 McKeith, I. et al. (2000) Efficacy of rivastigmine in dementia with Lewy bodies: a randomised, double-blind, placebo-controlled international study. Lancet 356, 2031–2036
6 Emre, M. et al. (2004) Rivastigmine for dementia associated with Parkinson’s disease. N. Engl. J. Med. 351, 2509–2518
7 Farlow, M.R. and Cummings, J.L. (2007) Effective pharmacologic management of Alzheimer’s disease. Am. J. Med. 120, 388–397
8 Reisberg, B. et al. (2003) Memantine in moderate-to-severe Alzheimer’s disease. N. Engl. J. Med. 348, 1333–1341
9 Aarsland, D. et al. (2009) Memantine in patients with Parkinson’s disease dementia or dementia with Lewy bodies: a double-blind, placebo-controlled, multicentre trial. Lancet Neurol. 8, 613–618
10 Harvey, P.D. (2009) Pharmacological cognitive enhancement in schizophrenia. Neuropsychol. Rev. 19, 324–335
11 Berthier, M.L. et al. (2006) A randomized, placebo-controlled study of donepezil in poststroke aphasia. Neurology 67, 1687–1689
12 Malhotra, P.A. et al. (2006) Noradrenergic modulation of space exploration in visual neglect. Ann. Neurol. 59, 186–190
13 Jorge, R.E. et al. (2010) Escitalopram and enhancement of cognitive recovery following stroke. Arch. Gen. Psychiatry 67, 187–196
14 Greely, H. et al. (2008) Towards responsible use of cognitive-enhancing drugs by the healthy. Nature 456, 702–705
15 Farah, M.J. et al. (2004) Neurocognitive enhancement: what can we do and what should we do? Nat. Rev. Neurosci. 5, 421–425
16 Sahakian, B. and Morein-Zamir, S. (2007) Professor’s little helper. Nature 450, 1157–1159
17 Repantis, D. et al. (2010) Modafinil and methylphenidate for neuroenhancement in healthy individuals: a systematic review. Pharmacol. Res. 62, 187–206
18 Sahakian, B.J. and Morein-Zamir, S. (2010) Neuroethical issues in cognitive enhancement. J. Psychopharmacol. DOI: 10.1177/0269881109106926
19 Robbins, T.W. and Arnsten, A.F.T. (2009) The neuropsychopharmacology of fronto-executive function: monoaminergic modulation. Annu. Rev. Neurosci. 32, 267–287
20 Dayan, P. and Huys, Q.J.M. (2009) Serotonin in affective control. Annu. Rev. Neurosci. 32, 95–126
21 Arnsten, A.F.T. (2009) Stress signalling pathways that impair prefrontal cortex structure and function. Nat. Rev. Neurosci. 10, 410–422
22 Minzenberg, M.J. and Carter, C.S. (2008) Modafinil: a review of neurochemical actions and effects on cognition. Neuropsychopharmacology 33, 1477–1502
23 Cools, R. et al. (2008) Serotoninergic regulation of emotional and behavioural control processes. Trends Cogn. Sci. 12, 31–40
24 Harmer, C.J. (2008) Serotonin and emotional processing: does it help explain antidepressant drug action? Neuropharmacology 55, 1023–1028
25 Luciana, M. et al. (2001) Effects of tryptophan loading on verbal, spatial and affective working memory functions in healthy adults. J. Psychopharmacol. (Oxford) 15, 219–230
26 Mehta, M.A. et al. (2005) Sulpiride and mnemonic function: effects of a dopamine D2 receptor antagonist on working memory, emotional memory and long-term memory in healthy volunteers. J. Psychopharmacol. (Oxford) 19, 29–38
27 Gibbs, A.A. et al. (2007) The role of dopamine in attentional and memory biases for emotional information. Am. J. Psychiatry 164, 1603–1609
28 Schultz, W. et al. (1997) A neural substrate of prediction and reward. Science 275, 1593–1599
29 Schultz, W. (2002) Getting formal with dopamine and reward. Neuron 36, 241–263
30 Floresco, S.B. and Magyar, O. (2006) Mesocortical dopamine modulation of executive functions: beyond working memory. Psychopharmacology (Berl.) 188, 567–585
31 Clatworthy, P.L. et al. (2009) Dopamine release in dissociable striatal subregions predicts the different effects of oral methylphenidate on reversal learning and spatial working memory. J. Neurosci. 29, 4690–4696
32 Wise, R.A. (2004) Dopamine, learning and motivation. Nat. Rev. Neurosci. 5, 483–494
33 Schlicker, E. et al. (1994) Modulation of neurotransmitter release via histamine H3 heteroreceptors. Fundam. Clin. Pharmacol. 8, 128–137
34 Sarter, M. et al. (2009) Phasic acetylcholine release and the volume transmission hypothesis: time to move on. Nat. Rev. Neurosci. 10, 383–390
35 Aston-Jones, G. and Cohen, J.D. (2005) An integrative theory of locus coeruleus–norepinephrine function: adaptive gain and optimal performance. Annu. Rev. Neurosci. 28, 403–450
36 Wesnes, K.A. et al. (2002) Effects of rivastigmine on cognitive function in dementia with Lewy bodies: a randomised placebo-controlled international study using the cognitive drug research computerised assessment system. Dement. Geriatr. Cogn. Disord. 13, 183–192
37 Wesnes, K.A. et al. (2005) Benefits of rivastigmine on attention in dementia associated with Parkinson disease. Neurology 65, 1654–1656
38 Rowan, E. et al. (2007) Effects of donepezil on central processing speed and attentional measures in Parkinson’s disease with dementia and dementia with Lewy bodies. Dement. Geriatr. Cogn. Disord. 23, 161–167
39 Everitt, B.J. and Robbins, T.W. (1997) Central cholinergic systems and cognition. Annu. Rev. Psychol. 48, 649–684
40 Edgar, C.J. et al. (2009) Approaches to measuring the effects of wake-promoting drugs: a focus on cognitive function. Hum. Psychopharmacol. 24, 371–389
41 Repantis, D. et al. (2010) Acetylcholinesterase inhibitors and memantine for neuroenhancement in healthy individuals: a systematic review. Pharmacol. Res. 61, 473–481
42 FitzGerald, D.B. et al. (2008) Effects of donepezil on verbal memory after semantic processing in healthy older adults. Cogn. Behav. Neurol. 21, 57–64
43 Chuah, L.Y.M. and Chee, M.W.L. (2008) Cholinergic augmentation modulates visual task performance in sleep-deprived young adults. J. Neurosci. 28, 11369–11377
44 Chuah, L.Y.M. et al. (2009) Donepezil improves episodic memory in young individuals vulnerable to the effects of sleep deprivation. Sleep 32, 999–1010
45 Koelega, H.S. (1993) Stimulant drugs and vigilance performance: a review. Psychopharmacology (Berl.) 111, 1–16
46 Robbins, T.W. (1997) Arousal systems and attentional processes. Biol. Psychol. 45, 57–71
47 Mehta, M.A. et al. (2000) Methylphenidate enhances working memory by modulating discrete frontal and parietal lobe regions in the human brain. J. Neurosci. 20, RC65
48 Furey, M.L. et al. (2000) Cholinergic enhancement and increased selectivity of perceptual processing during working memory. Science 290, 2315–2319
49 Bentley, P. et al. (2004) Effects of cholinergic enhancement on visual stimulation, spatial attention, and spatial working memory. Neuron 41, 969–982
50 Grefkes, C. et al. (2010) Noradrenergic modulation of cortical networks engaged in visuomotor processing. Cereb. Cortex 20, 783–797
51 Wezenberg, E. et al. (2005) Modulation of memory and visuospatial processes by biperiden and rivastigmine in elderly healthy subjects. Psychopharmacology (Berl.) 181, 582–594
52 Mehta, M.A. et al. (2001) Improved short-term spatial memory but impaired reversal learning following the dopamine D2 agonist bromocriptine in human volunteers. Psychopharmacology (Berl.) 159, 10–20
53 Swainson, R. et al. (2000) Probabilistic learning and reversal deficits in patients with Parkinson’s disease or frontal or temporal lobe lesions: possible adverse effects of dopaminergic medication. Neuropsychologia 38, 596–612
54 Cools, R. et al. (2001) Enhanced or impaired cognitive function in Parkinson’s disease as a function of dopaminergic medication and task demands. Cereb. Cortex 11, 1136–1143
55 Dagher, A. and Robbins, T.W. (2009) Personality, addiction, dopamine: insights from Parkinson’s disease. Neuron 61, 502–510
56 Weintraub, D. et al. (2010) Impulse control disorders in Parkinson disease: a cross-sectional study of 3090 patients. Arch. Neurol. 67, 589–595
57 Weintraub, D. et al. (2006) Association of dopamine agonist use with impulse control disorders in Parkinson disease. Arch. Neurol. 63, 969–973
58 Voon, V. et al. (2009) Chronic dopaminergic stimulation in Parkinson’s disease: from dyskinesias to impulse control disorders. Lancet Neurol. 8, 1140–1149
59 Kimberg, D.Y. et al. (1997) Effects of bromocriptine on human subjects depend on working memory capacity. Neuroreport 8, 3581–3585
60 Cools, R. et al. (2007) Impulsive personality predicts dopamine-dependent changes in frontostriatal activity during component processes of working memory. J. Neurosci. 27, 5506–5514
61 Frank, M.J. and O’Reilly, R.C. (2006) A mechanistic account of striatal dopamine function in human cognition: psychopharmacological studies with cabergoline and haloperidol. Behav. Neurosci. 120, 497–517
62 Gibbs, S.E.B. and D’Esposito, M. (2005) Individual capacity differences predict working memory performance and prefrontal activity following dopamine receptor stimulation. Cogn. Affect Behav. Neurosci. 5, 212–221
63 Williams, G.V. and Goldman-Rakic, P.S. (1995) Modulation of memory fields by dopamine D1 receptors in prefrontal cortex. Nature 376, 572–575
64 Vijayraghavan, S. et al. (2007) Inverted-U dopamine D1 receptor actions on prefrontal neurons engaged in working memory. Nat. Neurosci. 10, 376–384
65 Zahrt, J. et al. (1997) Supranormal stimulation of D1 dopamine receptors in the rodent prefrontal cortex impairs spatial working memory performance. J. Neurosci. 17, 8528–8535
66 Cools, R. et al. (2008) Working memory capacity predicts dopamine synthesis capacity in the human striatum. J. Neurosci. 28, 1208–1212
67 Cools, R. et al. (2009) Striatal dopamine predicts outcome-specific reversal learning and its sensitivity to dopaminergic drug administration. J. Neurosci. 29, 1538–1543
68 Teicher, M.H. et al. (2003) Rate dependency revisited: understanding the effects of methylphenidate in children with attention deficit hyperactivity disorder. J. Child. Adolesc. Psychopharmacol. 13, 41–51
69 Finke, K. et al. (2010) Effects of modafinil and methylphenidate on visual attention capacity: a TVA-based study. Psychopharmacology (Berl.) 210, 317–329
70 Randall, D.C. et al. (2005) Cognitive effects of modafinil in student volunteers may depend on IQ. Pharmacol. Biochem. Behav. 82, 133–139
71 Boy, F. et al. (2010) Individual differences in subconscious motor control predicted by GABA concentration in SMA. Curr. Biol. 20, 1779–1785
72 Sumner, P. et al. (2010) More GABA, less distraction: a neurochemical predictor of motor decision speed. Nat. Neurosci. 13, 825–827
73 Bilder, R.M. et al. (2004) The catechol-O-methyltransferase polymorphism: relations to the tonic-phasic dopamine hypothesis and neuropsychiatric phenotypes. Neuropsychopharmacology 29, 1943–1961
74 Bymaster, F.P. et al. (2002) Atomoxetine increases extracellular levels of norepinephrine and dopamine in prefrontal cortex of rat: a potential mechanism for efficacy in attention deficit/hyperactivity disorder. Neuropsychopharmacology 27, 699–711
75 Kereszturi, E. et al. (2008) Catechol-O-methyltransferase Val158Met polymorphism is associated with methylphenidate response in ADHD children. Am. J. Med. Genet. B Neuropsychiatr. Genet. 147B, 1431–1435
76 Cheon, K. et al. (2008) Association of the catechol-O-methyltransferase polymorphism with methylphenidate response in a classroom setting in children with attention-deficit hyperactivity disorder. Int. Clin. Psychopharmacol. 23, 291–298
77 McGough, J.J. (2005) Attention-deficit/hyperactivity disorder pharmacogenomics. Biol. Psychiatry 57, 1367–1373
78 Mick, E. et al. (2006) Absence of association with DAT1 polymorphism and response to methylphenidate in a sample of adults with ADHD. Am. J. Med. Genet. B Neuropsychiatr. Genet. 141B, 890–894
79 Contini, V. et al. (2010) Response to methylphenidate is not influenced by DAT1 polymorphisms in a sample of Brazilian adult patients with ADHD. J. Neural. Transm. 117, 269–276
80 Marchant, N.L. et al. (2010) Positive effects of cholinergic stimulation favor young APOE epsilon4 carriers. Neuropsychopharmacology 35, 1090–1096
81 Varsaldi, F. et al. (2006) Impact of the CYP2D6 polymorphism on steady-state plasma concentrations and clinical outcome of donepezil in Alzheimer’s disease patients. Eur. J. Clin. Pharmacol. 62, 721–726
82 Cacabelos, R. et al. (2007) Pharmacogenetic aspects of therapy with cholinesterase inhibitors: the role of CYP2D6 in Alzheimer’s disease pharmacogenetics. Curr. Alzheimer Res. 4, 479–500
83 Klingberg, T. et al. (2005) Computerized training of working memory in children with ADHD – a randomized, controlled trial. J. Am. Acad. Child Adolesc. Psychiatry 44, 177–186
84 Jaeggi, S.M. et al. (2008) Improving fluid intelligence with training on working memory. Proc. Natl. Acad. Sci. U. S. A. 105, 6829–6833
85 Olesen, P.J. et al. (2004) Increased prefrontal and parietal activity after training of working memory. Nat. Neurosci. 7, 75–79
86 McNab, F. et al. (2009) Changes in cortical dopamine D1 receptor binding associated with cognitive training. Science 323, 800–802
87 Knecht, S. et al. (2004) Levodopa: faster and better word learning in normal humans. Ann. Neurol. 56, 20–26
88 Berthier, M.L. et al. (2009) Memantine and constraint-induced aphasia therapy in chronic poststroke aphasia. Ann. Neurol. 65, 577–585
89 Erkinjuntti, T. et al. (2002) Efficacy of galantamine in probable vascular dementia and Alzheimer’s disease combined with cerebrovascular disease: a randomised trial. Lancet 359, 1283–1290
90 Sallee, F.R. et al. (2009) Guanfacine extended release in children and adolescents with attention-deficit/hyperactivity disorder: a placebo-controlled trial. J. Am. Acad. Child Adolesc. Psychiatry 48, 155–165
91 Rosen, W.G. et al. (1984) A new rating scale for Alzheimer’s disease. Am. J. Psychiatry 141, 1356–1364
92 Schneider, L.S. et al. (1997) Validity and reliability of the Alzheimer’s Disease Cooperative Study-Clinical Global Impression of Change. The Alzheimer’s Disease Cooperative Study. Alzheimer Dis. Assoc. Disord. 11 (Suppl 2), S22–32
93 Cummings, J.L. (1997) The Neuropsychiatric Inventory: assessing psychopathology in dementia patients. Neurology 48, S10–16
94 Blackwell, A.D. et al. (2004) Detecting dementia: novel neuropsychological markers of preclinical Alzheimer’s disease. Dement. Geriatr. Cogn. Disord. 17, 42–48
95 Jensen, P.S. et al. (2007) 3-year follow-up of the NIMH MTA study. J. Am. Acad. Child Adolesc. Psychiatry 46, 989–1002
96 DeVito, E.E. et al. (2009) Methylphenidate improves response inhibition but not reflection-impulsivity in children with attention deficit hyperactivity disorder (ADHD). Psychopharmacology (Berl.) 202, 531–539
97 Chamberlain, S.R. et al. (2007) Atomoxetine improved response inhibition in adults with attention deficit/hyperactivity disorder. Biol. Psychiatry 62, 977–984
98 Minzenberg, M.J. et al. (2008) Modafinil shifts human locus coeruleus to low-tonic, high-phasic activity during functional MRI. Science 322, 1700–1702
99 Eidelberg, D. (2009) Metabolic brain networks in neurodegenerative disorders: a functional imaging approach. Trends Neurosci. 32, 548–557
100 Seeley, W.W. et al. (2009) Neurodegenerative diseases target large-scale human brain networks. Neuron 62, 42–52
Review
Reward, dopamine and the control of food intake: implications for obesity
Nora D. Volkow1, Gene-Jack Wang2 and Ruben D. Baler1
1 National Institute on Drug Abuse, National Institutes of Health, Bethesda, MD 20892, USA
2 Medical Department, Brookhaven National Laboratories, Upton, NY 11973, USA
The ability to resist the urge to eat requires the proper functioning of neuronal circuits involved in top-down control to oppose the conditioned responses that predict reward from eating the food and the desire to eat the food. Imaging studies show that obese subjects might have impairments in dopaminergic pathways that regulate neuronal systems associated with reward sensitivity, conditioning and control. It is known that the neuropeptides that regulate energy balance (homeostatic processes) through the hypothalamus also modulate the activity of dopamine cells and their projections into regions involved in the rewarding processes underlying food intake. It is postulated that this could also be a mechanism by which overeating and the resultant resistance to homeostatic signals impair the function of circuits involved in reward sensitivity, conditioning and cognitive control.

Introduction
One-third of the US adult population is obese [body mass index (BMI) ≥30 kg m⁻²] [1]. This fact has far-reaching and costly implications, because obesity is strongly associated with serious medical complications (e.g. diabetes, heart disease, fatty liver and some cancers) [2]. Not surprisingly, the health care costs alone owing to obesity in the US have been estimated at close to US$150 billion [3]. Social and cultural factors undoubtedly contribute to this epidemic. Specifically, environments that promote unhealthy eating habits (ubiquitous access to highly processed and junk foods) and physical inactivity are believed to have a fundamental role in the widespread problem of obesity (Overweight and Obesity Website of the Centers for Disease Control and Prevention; http://www.cdc.gov/obesity/index.html). However, individual factors also help determine who will (or will not) become obese in these environments. Based on heredity studies, genetic factors are estimated to contribute between 45% and 85% of the variability in BMI [4,5].
Although genetic studies have revealed point mutations that are over-represented among obese individuals [4], for the most part, obesity is thought to be under polygenic control [6,7]. Indeed, the most recent genome-wide association study (GWAS), conducted in 249,796 individuals of European descent, identified 32 loci associated with BMI. However, these loci explained only 1.5% of the variance in BMI [8]. Moreover, it was estimated that GWAS with larger samples should be able to identify 250 extra loci with effects on BMI. However, even with the undiscovered variants, it was estimated that signals from common variant loci would account for only 6–11% of the genetic variation in BMI (based on an estimated heritability of 40–70%). The limited explanation of the variance from these genetic studies is likely to reflect the complex interactions between individual factors (as determined by genetics) and the way in which individuals relate to environments where food is widely available, not only as a source of nutrition, but also as a strong reward that by itself promotes eating [9].

The hypothalamus [via regulatory neuropeptides such as leptin, cholecystokinin (CCK), ghrelin, orexin, insulin, neuropeptide Y (NPY), and through the sensing of nutrients, such as glucose, amino acids and fatty acids] is recognized as the main brain region regulating food intake as it relates to caloric and nutrition requirements [10–13]. In particular, the arcuate nucleus through its connections with other hypothalamic nuclei and extra-hypothalamic brain regions, including the nucleus tractus solitarius, regulates homeostatic food intake [12] and is implicated in obesity [14–16] (Figure 1a, left panel). However, evidence is accumulating that brain circuits other than those regulating hunger and satiety are involved in food consumption and obesity [17]. Specifically, several limbic [nucleus accumbens (NAc), amygdala and hippocampus] and cortical brain regions [orbitofrontal cortex (OFC), anterior cingulate gyrus (ACC) and insula] and neurotransmitter systems (dopamine, serotonin, opioids and cannabinoids) as well as the hypothalamus are implicated in the rewarding effects of food [18] (Figure 1a, right panel). By contrast, the regulation of food intake by the hypothalamus appears to rely on the reward and motivational neurocircuitry to modify eating behaviors [19–21].

Corresponding author: Volkow, N.D. ([email protected]).
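The GWAS figures quoted earlier in this section mix two denominators: a share of the total variance in BMI and a share of the genetic variation. A short sketch shows how the two are related. It assumes the standard definition of heritability as the fraction of total variance attributable to genetic factors; the function name is hypothetical, and applying the conversion to the 1.5% figure for the 32 identified loci is an illustration rather than a value reported in [8].

```python
def share_of_genetic_variance(share_of_total_variance, heritability):
    """Convert a share of total (phenotypic) variance into a share of genetic variance.

    Assumes heritability = genetic variance / total variance.
    """
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return share_of_total_variance / heritability


# The 32 loci explained 1.5% of the total variance in BMI; the text quotes
# an estimated heritability of 40-70%.
for h2 in (0.40, 0.70):
    share = share_of_genetic_variance(0.015, h2)
    print(f"heritability {h2:.2f}: {share * 100:.2f}% of the genetic variance")
```

The lower the assumed heritability, the larger the share of genetic variance that a given set of loci accounts for, which is why such estimates are reported as a range.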
Based on findings from imaging studies, a model of obesity was recently proposed in which overeating reflects an imbalance between circuits that motivate behavior (because of their involvement in reward and conditioning) and circuits that control and inhibit pre-potent responses [22]. This model identifies four main circuits: (i) reward–saliency; (ii) motivation–drive; (iii) learning–conditioning; and (iv) inhibitory control–emotional regulation–executive function. Notably, this model is also applicable to drug addiction. In vulnerable individuals, the consumption of high quantities of palatable food (or drugs in addiction) can upset the balanced interaction among these circuits, resulting in an enhanced reinforcing value of food (or drugs in addiction) and in a weakening of the control circuits. This perturbation is a consequence of conditioned learning
1364-6613/$ – see front matter . Published by Elsevier Ltd. doi:10.1016/j.tics.2010.11.001 Trends in Cognitive Sciences, January 2011, Vol. 15, No. 1
(a) Energy homeostasis (HYP) PO
Cognitive/Reward homeostasis
PVN AH
LH
DMH SON
VTA/Nac Reward calculation ns po re s tiv e
ne
itio
C
DA, 5HT, CB, Opioids/GABA
Orexigenic network
Anorexogenic network
NAcb
NAcb NPYR2
NPY
NPYR1
Pdyn
PMCH
ORX
DMH
UBL5 MCHR1 GALR1
PMCH
POMC
CRH
NPYR5
Pdyn
GAL GALR1
UBL5
Oprd
GALR1
CART CCKAR
CRH
NPYR1
Adipor1
CART CRH
POMC
ORX
NPY
ORXR2
Pdyn
Oprd
UBL5 GHSR
GAL
NMB UCN
CRHR1
CCKBR
Oprd Adipoq
ARC ORXR2
Pdyn
GHSR
UBL5
Orexigenic network
AgRPCARTPdynGALGHRHPMCHNPYORXPOMCUBL5-
Anorexigenic network
Genes encoding for peptides Agouti-related peptide Cocaine and Amphetamine regulated transcript Prodynorphin Galanin Growth hormone releasing hormone Pro-melanin concentrating hormone Neuropeptide Y Orexins/Hyopcretins Proopiomelanocortin Ubiquitin-like 5
Genes encoding for peptides Adipoq- Adiponectin CART- Cocaine and Amphetamine regulated transcript CCKCholecystokinin CRHCorticotropin releasing hormone GRP- Gastrin releasing peptide NtsNeurotensin NMB- Neuromedin B POMC- Proopiomelanocortin PYYPeptide YY Tac2Tachykinin 2 (Neuropeptide K) UCNUrocortin
GAL GALR1
CRHR1
MC3R
NTS
CCKBR CCKAR
CART CRH UCN
NPYR5
Adipoq
Tac2
NMB UCN
CART
Lepr
Adipor2
Nmur1 Nts Adipoq
GRP Adipor1
Receptors GALR1- Galanin receptor 1 GHSR- Growth hormone secretagogue receptor NPY1R- NPY receptor Y1 NPY2R- NPY receptor Y2 NPY5R- NPY receptor Y5 MCHR1- Melanin concentrating hormone receptor 2 Oprd- Opioid receptor, delta 1 ORXR2- Orexin / Hypocretin receptor 2
Receptors Adipor1- Adiponectin receptor 1 Adipor2- Adiponectin receptor 2 CCKAR- Cholecystokinin A receptor CCKBR- Cholecystokinin B receptor CRHR1- Corticotropin releasing hormone receptor 1 CRHR2- Corticotropin releasing hormone receptor 2 Glp1r- Glucagon-like peptide 1 receptor LeprLeptin receptor MC3R- Melanocortin receptor 3 Nmur1- Neuromedin U receptor 1 Ntsr1- Neurotensin receptor 1 0/+
CCK CCKAR
CRHR1
CRHR2 Nts
Oprd
Adipor1
Adipor1
MC3R POMC
Glp1r
NTS ORX
Tac2
Adipor2
Nts
Adipoq GHSR
UBL5
AgRP
NPYR1
VTA
MC3R CCKAR
NPYR2
GAL Oprd
ARC
VMH
ORXR2
Pdyn UBL5
Adipor1 PYY
GRP Adipor1
Adipoq
POMC
Nts
Adipoq
VTA ORX
CCKAR
UCN
Adipoq Tac2 Tac2
Oprd
Nmur1
POMC UCN
CRH
Nts
VMH ORX
DMH
Glp1r
CART UBL5
POMC
CRH
PVN
Pdyn
NMB
CRHR1
Adipoq
Oprd
ORX NPYR1
LHA
CCKBR
Adipor1
UBL5 GHSR GALR1 GAL Pdyn
MC3R
CCKAR
UCN
Nmur1 Tac2 Nts
ORX
GAL
NMB MC3R
CRHR1
ORXR2
NPYR1
Oprd
PVN
CART
LHA
NPYR5
es dr
d on
Food intake
ORX
es
ns
po
He d
CRH, TRH, OT, AVP, CART MCH, OREXIN/HYPOCRETIN, DYN, CART NPY, CART, GAL NPY GAL, NT, LEPTIN, NPY/AGRP, POMC/CART LHRH
Hipp/Amygdala Learning/ Memory
en
Key:
ic/ Inc
ARC
(+)
es
Control-Decision making
VMH
(b)
OFC/ACC Salience attribution
(-)
PH
on
SCN
vmPFC/mOFC/ACC Insula Top/Down Interoception inhibition Gustatory integration
+
Ntsr1 GRP
Adipor1
Central sites Arc- Arcuate nucleus VMH- Ventromedial hypothalamic nucleus DMH- Dorsomedial hypothalamic nucleus LHA- Lateral hypothalamic area PVN- Paraventricular hypothalamic nucleus NAcb- Nucleus accumbens VTA- Ventral tegmental area NTS- Nucleus of the solitary tract
Central sites Arc- Arcuate nucleus VMH- Ventromedial hypothalamic nucleus DMH- Dorsomedial hypothalamic nucleus LHA- Lateral hypothalamic area PVN- Paraventricular hypothalamic nucleus NAcb- Nucleus accumbens VTA- Ventral tegmental area NTS- Nucleus of the solitary tract Expression level and density +/++
++
++/+++
+++
TRENDS in Cognitive Sciences
Figure 1. Regulation of food intake relies on multichannel communication between overlapping reward and homeostatic neurocircuits. (a) Schematic diagram of the crosstalk between the homeostatic (hypothalamus, HYP) and reward circuits that control food intake. The HYP is central to energy balance and several of its nuclei are involved in energy regulation [arcuate (ARC), dorsomedial (DMH) ventromedial (VMH) and lateral HYP (LH)] integrating orexigenic and anorexigenic signals from the periphery and the CNS and communicating these to regions from the reward circuitry. For example, orexin neurons in LH are influenced by leptin and ghrelin and, in turn, project to reward regions via OX1 and OX2 receptors. Several key neuropeptides produced in various hypothalamic nuclei are indicated: corticotrophin-releasing hormone
38
Review and the resetting of reward thresholds following the consumption of large quantities of high-calorie foods (or drugs in addiction) by at-risk individuals. The undermining of the cortical top-down networks that regulate pre-potent responses results in impulsivity and in compulsive food intake (or compulsive drug intake in addiction). This paper discusses the evidence that links the neural circuits involved in top-down control with those involved with reward and motivation and their interaction with peripheral signals that regulate homeostatic food intake. Food is a potent natural reward and conditioning stimulus Certain foods, particularly those rich in sugars and fat, are potent rewards [23] that promote eating (even in the absence of an energetic requirement) and trigger learned associations between the stimulus and the reward (conditioning). In evolutionary terms, this property of palatable foods used to be advantageous because it ensured that food was eaten when available, enabling energy to be stored in the body (as fat) for future need in environments where food sources were scarce and/or unreliable. However, in modern societies, where food is widely available, this adaptation has become a liability. Several neurotransmitters, including dopamine (DA), cannabinoids, opioids and serotonin, as well as neuropetides involved in homeostatic regulation of food intake, such as orexin, leptin and ghrelin, are implicated in the rewarding effects of food [24–26]. DA has been the most thoroughly investigated and is the best characterized. It is a key neurotransmitter modulating reward (natural and drug rewards), which it does mainly through its projections from the ventral tegmental area (VTA) into the NAc [27]. Other DA projections are also implicated, including the dorsal striatum (caudate and putamen), cortical (OFC and ACC) and limbic regions (hippocampus and amygdala) and the lateral hypothalamus. 
Indeed, in humans, ingestion of palatable food has been shown to release DA in the dorsal striatum in proportion to the self-reported level of pleasure derived from eating the food [28]. However, the involvement of DA in reward is more complex than the mere encoding of hedonic value. Upon first exposure to a food reward (or an unexpected reward), the firing of DA neurons in the VTA increases with a resulting increase in DA release in NAc [29]. However, with repeated exposure to the food reward, the DA response habituates and is gradually transferred onto the stimuli associated with the food reward (e.g. the smell of food), which is then processed as a predictor of reward (becoming a cue that is conditioned to the reward) [30,31]; the DA signal in response to the cue then serves to convey a ‘reward prediction error’ [31]. The
Trends in Cognitive Sciences
January 2011, Vol. 15, No. 1
Box 1. The role of the hippocampus in feeding behaviors The hippocampus is not only central to memory, but is also involved in the regulation of eating behaviors through its processing of mnemonic processes (including remembering whether one ate, remembering conditioning associations, remembering where food is located, identifying interoceptive states of hunger and remembering how to relieve these states). For example, in rodents, selective lesions in the hippocampus impaired their ability to discriminate between the state of hunger and that of satiety [99] and, in female rats, it resulted in hyperphagia [100]. In humans, brain-imaging studies have reported activation of the hippocampus with food craving, a state of hunger, the response to food-conditioned cues and to food tasting [101]. The hippocampus expresses high levels of insulin, ghrelin, glucocorticoids and cannabinoid CB1 receptors, which suggests that this region also regulates food intake by nonmnemonic processes [102,103]. In addition, the hippocampus is implicated in obesity, as evinced by imaging studies showing that in obese but not in lean individuals, the hippocampus shows hyperactivation in response to food stimuli [104].
extensive glutamatergic afferents to DA neurons from regions involved with sensory (insula or primary gustatory cortex), homeostatic (hypothalamus), reward (NAc), emotional (amygdala and hippocampus) and multimodal (OFC for salience attribution) modulate their activity in response to rewards and to conditioned cues [32]. Specifically, projections from the amygdala and the OFC to DA neurons and NAc are involved in conditioned responses to food [33]. Indeed, imaging studies showed that when non-obese male subjects were asked to inhibit their craving for food while being exposed to food cues, they decreased metabolic activity in amygdala and OFC [as well as hippocampus (see also Box 1), insula and striatum]; the decreases in OFC were associated with reductions in food craving [34]. Conditioned cues can elicit feeding even in sated rats [30] and, in humans, imaging studies have shown that exposure to food cues elicits DA increases in the striatum that are associated with the desire to eat the food [35]. In addition to its involvement with conditioning, DA is also involved with the motivation to perform the behaviors necessary to procure and consume the food. Indeed, the involvement of DA in food reward has been associated with the motivational salience or ‘wanting’ of food as opposed to the ‘liking’ of food [36] (Box 2), an effect that is likely to involve the dorsal striatum and perhaps also the NAc [37]. DA has such a crucial role in this context that transgenic mice that do not synthesize DA die of starvation owing to a lack of motivation to eat [37]. Restoring DA neurotransmission in the dorsal striatum rescues these animals, whereas restoring it in the NAc does not. The hedonic (‘liking’) properties of food appear to depend on, among others, opioid, cannabinoid and GABA neurotransmission [36]. These ‘liking’ properties of food are
(CRH), tyrotrophin-releasing hormone (TRH), oxytocin (OT), vasopressin (AVP), cocaine- and amphetamine-regulated transcript (CART), NPY, agouti-related protein (AgRP), proopiomelanocortin (POMC), galanin (GAL), neurotensin (NT), leptin, orexin, luteinizing hormone-releasing hormone (LHRH) and melanin-concentrating hormone (MCH). By contrast, top-down inhibition of feeding depends heavily on the PFC, including OFC and ACC. The amygdala ascribes emotional attributes and, together with memory and learning circuitry, generates conditioned responses. This circuit is subject to strong influence coming from cortical and mesolimbic input. Many of the orexigenic and anorexigenic peripheral signals directly influence neural computations not only in hypothalamus, but also in mesocorticolimbic structures (amygdala, OFC and hippocampus). Conversely, many classic neurotransmitters (DA, CB, opioids, GABA and serotonin) are produced as a result of mesocorticolimbic activity and influence the HYP. For comprehensive reviews, see [26,105]. (b) Expression of orexigenic and anorexigenic genes in the central circuitry (data derived from the Allen Brain Atlas; http:// www.brain-map.org). Each box represents a brain region and the circles indicate expression levels of genes in the region. Circle sizes represent expression density (‘+’ expression is sparse to ‘+++’ expression throughout the entire area). Colors represent expression levels (dark blue < light blue < turquoise < light green < orange < red). The location of each gene symbol in the boxes does not correlate with the distribution of that gene within the brain region it represents. Gene symbols without circles are mentioned when only expression density or level is >0. POMC is a precursor for an orexigen, b-endorphin, and for an anorexigen, a-melanocyte-stimulating hormone. Reproduced, with permission, from [106].
39
Review Box 2. Wanting versus liking: an important distinction Brain reward systems involved with food intake distinguish a mechanism involved with motivating the desire for the food, referred to as ‘wanting’, versus a mechanism involved with the hedonic properties of the food, referred to as ‘liking’ [36]. Whereas the dopamine striatal system is predominantly (although not exclusively) implicated in ‘wanting’, the opioid and cannabinoid systems are predominantly (although not exclusively) implicated in food ‘liking’. Indeed, brain-imaging studies in humans have shown that the dopamine release triggered when humans encounter a food cue correlates with their subjective ratings of wanting the food [35]. Conversely, the activation of endogenous opioid or cannabinoid receptors appears to stimulate appetite in part by enhancing the ‘liking’ of the food (i.e. its palatability). Although these two mechanisms are separate, they act in concert to modulate eating behaviors.
processed in reward regions including lateral hypothalamus, NAc, ventral pallidum, OFC [9,27,38] and insula (primary taste area in the brain) [39]. Opioid signaling in NAc (in the shell) and ventral pallidum appears to mediate food ‘liking’ [40]. By contrast, opioid signaling in the basolateral amygdala is implicated in conveying the affective properties of food, which in turn modulate the incentive value of food and reward-seeking behavior, thus also contributing to food ‘wanting’ [41]. Interestingly, in rodents that have been exposed to diets rich in sugar, a pharmacological challenge with naloxone (opiate antagonist drug devoid of effects in control rats) elicits an opiate withdrawal syndrome similar to that observed in animals that have been chronically exposed to opioid drugs [42]. In addition, exposure of humans or laboratory animals to sugar produces an analgesic response [43], which suggests that sugar (and perhaps other palatable foods) has a direct ability to boost endogenous opioid levels. A research question that emerges from these data is whether, in humans, dieting triggers a mild withdrawal syndrome that could contribute to relapse. Endocannabinoids, predominantly through cannabinoid CB1 receptor signaling (in contrast to CB2 receptors), are involved with both homeostatic and rewarding mechanisms of food intake and energy expenditure [44–46]. Homeostatic regulation is mediated in part through the arcuate and paraventricular nuclei in the hypothalamus and through the nucleus of the solitary tract in the brainstem, and the regulation of reward processes is mediated in part through effects in NAc, hypothalamus and brainstem. Therefore, the cannabinoid system is an important target in medication development for treatment of obesity and metabolic syndrome. Similarly, the modulation by serotonin of feeding behaviors involves both reward and homeostatic regulation and it has also been a target for the development of anti-obesity medications [47–50]. 
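The habituation of the DA response to a repeated food reward, and its transfer onto the predictive cue, described earlier in this section as a ‘reward prediction error’, can be illustrated with a simple learning rule. The sketch below uses the Rescorla–Wagner update, a textbook model that is an assumption here and is not taken from the studies cited; the learning rate `alpha`, the reward magnitude and the number of trials are hypothetical values chosen only for illustration, and the quantities are abstract rather than measured firing rates.

```python
def rescorla_wagner(n_trials=30, alpha=0.2, reward=1.0):
    """Track cue prediction and prediction error over repeated cue-reward pairings.

    Illustrative assumption: the cue's predicted value is nudged toward the
    delivered reward by a fraction `alpha` of the prediction error on each trial.
    """
    value = 0.0                      # reward predicted by the cue before any learning
    cue_values, errors = [], []
    for _ in range(n_trials):
        error = reward - value       # prediction error when the reward is delivered
        cue_values.append(value)     # prediction evoked by the cue on this trial
        errors.append(error)
        value += alpha * error       # move the prediction toward the reward
    return cue_values, errors


cue_values, errors = rescorla_wagner()
for trial in (0, 5, 29):
    print(f"trial {trial:2d}: cue prediction={cue_values[trial]:.2f}, "
          f"error at reward={errors[trial]:.2f}")
```

In this toy model the error at reward delivery shrinks across trials while the value attached to the cue grows, which mirrors, in a highly simplified way, the shift of the response from the food itself to the stimuli that predict it.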
In parallel, there is increasing evidence that peripheral homeostatic regulators of energy balance, such as leptin, insulin, orexin, ghrelin and PYY, also regulate behaviors that are non-homeostatic and modulate the rewarding properties of food [50]. These neuropeptides might also be involved with cognitive control over food intake and with conditioning to food stimuli [51]. Specifically, they can interact with cognate receptors in midbrain VTA DA neurons, which project not only to the NAc but also to prefrontal and limbic regions; in fact, many of them also express receptors in frontal regions and in hippocampus and amygdala [50].
Insulin, which is one of the key hormones involved in the regulation of glucose metabolism, has been shown to attenuate the response of limbic (including brain reward regions) and cortical regions in the human brain to food stimuli. For example, in healthy controls, insulin attenuated the activation of the hippocampus, frontal and visual cortices in response to food pictures [52]. Conversely, insulin-resistant subjects (patients with type 2 diabetes) showed greater activation in limbic regions (amygdala, striatum, OFC and insula) when exposed to food stimuli than did non-diabetic patients [53]. In the human brain, the adipocyte-derived hormone leptin, which acts in part through leptin receptors in hypothalamus (arcuate nucleus) to decrease food intake, has also been shown to attenuate the response of brain reward regions to food stimuli. Specifically, patients with congenital leptin deficiency showed activation of DA mesolimbic targets (NAc and caudate) to visual food stimuli, which was associated with food wanting, even when the subjects had just been fed. By contrast, mesolimbic activation did not occur after 1 week of leptin treatment (Figure 2a,b). This was interpreted to suggest that leptin diminished the rewarding responses to food [19]. Another fMRI study, also done with patients with congenital leptin deficiency, showed that leptin treatment reduced the activation of regions involved with hunger (insula, parietal and temporal cortices) whereas it enhanced activation of regions involved in cognitive inhibition [prefrontal cortex (PFC)] upon exposure to food stimuli [20]. Thus, these two studies provide evidence that, in the human brain, leptin modulates the activity of brain regions involved not only with homeostatic processes, but also with rewarding responses and with inhibitory control. Gut hormones also appear to modulate the response of brain reward regions to food stimuli in the human brain.
For example, the peptide YY 3–36 (PYY), which is released from gut cells post-prandially and reduces food intake, was shown to modulate the switch from regulation of food intake by homeostatic circuits (i.e. hypothalamus) to regulation by reward circuits during the transition from hunger to satiety. Specifically, when plasma PYY concentrations were high (as when satiated), activation of the OFC by food stimuli negatively predicted food intake; whereas when plasma PYY levels were low (as when food deprived) hypothalamic activation positively predicted food intake [54]. This was interpreted to reflect that PYY decreases the rewarding aspects of food through its modulation of the OFC. By contrast, ghrelin (a stomach-derived hormone that increases in the fasted state and stimulates food intake) was shown to increase the activation in response to food stimuli in brain reward regions (amygdala, OFC, anterior insula and striatum) and their activation was associated with self-reports of hunger (Figure 2c,d). This was interpreted to reflect an enhancement of the hedonic and incentive responses to food-related cues by ghrelin [55]. Overall, these findings are also consistent with the differential regional brain activation in response to food stimuli in satiated versus fasted individuals; activation of reward regions in response to food stimuli is decreased during the sated state compared with the fasted state [15]. These observations point to an overlap between the neurocircuitry that regulates reward and/or reinforcement and that which regulates energy metabolism (Figure 1b). Peripheral signals that regulate homeostatic responses to food appear to increase the sensitivity of limbic brain regions to food stimuli when they are orexigenic (ghrelin) and to decrease the sensitivity to activation when they are anorexigenic (leptin and insulin). Similarly, the sensitivity of brain reward regions to food stimuli during food deprivation is increased, whereas it is decreased during satiety. Thus, homeostatic and reward circuitry act in concert to promote eating behaviors under conditions of deprivation and to inhibit food intake under conditions of satiety. Disruption of the interaction between homeostatic and reward circuitry might promote overeating and contribute to obesity (Figure 1). Although other peptides [glucagon-like peptide-1 (GLP-1), CCK, bombesin and amylin] also regulate food intake via their hypothalamic actions, their extrahypothalamic effects have received less attention [12]. Thus, much remains to be learned, including the interactions between the homeostatic and the non-homeostatic mechanisms that regulate food intake and their involvement in obesity.
Figure 2. Leptin decreases whereas ghrelin increases reactivity to food stimuli in brain reward areas. (a) Brain images showing areas where leptin reduced the activation (NAc-caudate) in two subjects with leptin deficiency. (b) Histogram for the activation response to food stimuli in subjects with leptin deficiency before and after leptin treatment. (c) Ghrelin increases reactivity to food stimuli in brain reward areas, as indicated by SPM images showing brain areas where activation by food stimuli was greater with ghrelin than with saline; and (d) a histogram of limbic areas for the response to food stimuli after saline (controls; red bars) and after ghrelin (blue bars). Modified, with permission, from [19] (a, b) and [55] (c, d).
Disruption in reward and conditioning to food in overweight and obese individuals
Preclinical and clinical studies have provided evidence of decreases in DA signaling in striatal regions [decreases in DA D2 receptors (D2R) and in DA release] that are linked not only with reward (NAc) but also with habits and routines (dorsal striatum) in obesity [56–58]. Importantly, decreases in striatal D2R have been linked to compulsive food intake in obese rodents [59] and with decreased metabolic activity in OFC and ACC in obese humans [60] (Figure 3a–c). Given that dysfunction in OFC and ACC results in compulsivity [reviewed in 61], this might be the mechanism by which low striatal D2R signaling facilitates hyperphagia [62]. Decreased D2R-related signaling is also likely to reduce the sensitivity to natural rewards, a deficit that obese individuals might strive to compensate for temporarily by overeating [63]. This hypothesis is consistent with preclinical evidence showing that decreased DA activity in the VTA results in a dramatic increase in the consumption of high-fat foods [64]. Indeed, compared with normal-weight individuals, obese individuals who were presented with pictures of high-calorie food (stimuli to which they are conditioned) showed increased neural activation of regions that are part of reward and motivation circuits (NAc, dorsal striatum, OFC, ACC, amygdala, hippocampus and insula) [65]. By contrast, in normal-weight controls, the activation of the ACC and OFC (regions involved in salience attribution that project into the NAc) during presentation of high-calorie food was found to be negatively correlated with their BMI [66]. This suggests a dynamic interaction between the amount of food eaten (reflected in part by the BMI) and the reactivity of reward regions to high-calorie food (reflected in the activation of OFC and ACC) in normal-weight individuals, which is lost in obesity.
Figure 3. Hyperphagia could result from a drive to compensate for a weakened reward circuit (processed through dopamine-regulated corticostriatal circuits) combined with a heightened sensitivity to palatability (hedonic properties of food processed in part through the somatosensory cortex). (a) Averaged images for DA D2 receptor (D2R) availability in controls (n=10) and in morbidly obese subjects (n=10). (b) Results from SPM identifying the areas in the brain where D2R was associated with glucose metabolism; these included the medial OFC, ACC and the dorsolateral PFC (region not shown). (c) Regression slope between striatal D2R and metabolic activity in ACC in obese subjects. (d) Three-dimensionally rendered SPM images showing the areas with higher metabolism in obese than in lean subjects (P < 0.003, uncorrected). (e) Color-coded SPM results displayed in a coronal plane with a superimposed diagram of the somatosensory homunculus. The results (z value) are presented using the rainbow scale where red > yellow > green. When compared with lean subjects, obese subjects had higher baseline metabolism in the somatosensory areas where the mouth, lips and tongue are represented and which are involved with processing food palatability. Modified, with permission, from [22] (a–c) and [68] (d,e).
Surprisingly, obese individuals, when compared with lean individuals, experienced less activation of reward circuits from the actual food consumption (consummatory food reward), whereas they showed greater activation of somatosensory cortical regions that process palatability when they anticipated consumption [67] (Figure 4). The latter finding is consistent with a study that reported increased baseline glucose metabolic activity (a marker of brain function) in somatosensory regions that process palatability, including insula, in obese as compared with lean subjects [68] (Figure 3d,e). An enhanced activity of regions that process palatability could make obese subjects favor food over other natural reinforcers, whereas decreased activation of dopaminergic targets by the actual food consumption might lead to overconsumption as a means to compensate for the weak DA signals [69]. These imaging findings are consistent with an enhanced sensitivity of the reward circuitry to conditioned stimuli (viewing high-calorie food) that predict reward, but a decreased sensitivity to the rewarding effects of actual food consumption in dopaminergic pathways in obesity. We hypothesize that, to the extent that there is a mismatch between the expected reward and a delivery that does not fulfill this expectation, this will promote compulsive eating as an attempt to achieve the expected level of reward. Although the failure of an expected reward to arrive is accompanied by a decrease in DA cell firing in laboratory animals [70], the behavioral significance of such a decrease (when a food reward is smaller than expected) has, to our knowledge, not been investigated. In parallel to these activation changes in the reward circuitry in obese subjects, imaging studies have also documented consistent decreases in the reactivity of the hypothalamus to satiety signals in obese subjects [71,72].
[Figure 4 graphic. Panel (b): scatter plot of PE (caudate) against BMI, R² = 0.2496.]
Figure 4. Obese subjects have a decreased response in DA target regions when given food compared with that recorded in lean subjects. (a) Coronal section of weaker activation in the left caudate nucleus in response to receiving a milkshake versus a tasteless solution; (b) Correlation between the difference in activation and BMI of the subjects. Modified, with permission, from [67].
Evidence of cognitive disruption in overweight and obese individuals
There is increasing evidence that obesity is associated with impairment of certain cognitive functions, such as executive function, attention and memory [73–75]. Indeed, the ability to inhibit the urges to eat desirable food varies among individuals and might be one of the factors that contribute to their vulnerability for overeating [34]. The adverse influence of obesity on cognition is also reflected in the higher prevalence of attention deficit hyperactivity disorder (ADHD) [76], Alzheimer disease and other dementias [77], cortical atrophy [78] and white matter disease [79] in obese subjects. Although co-morbid medical conditions (e.g. cerebrovascular pathology, hypertension and diabetes) are known to affect cognition adversely, there is also evidence that high BMI, by itself, might impair various cognitive domains, particularly executive function [75]. In spite of some inconsistencies among studies, brain-imaging data have also provided evidence of structural and functional changes associated with high BMI in otherwise healthy controls. For example, an MRI study done in elderly females using voxel-wise morphometry showed a negative correlation between BMI and gray matter volumes (including frontal regions), which, in the OFC, was associated with impaired executive function [80]. Using positron emission tomography (PET) to measure brain glucose metabolism in healthy controls, a negative correlation was also shown between BMI and metabolic activity in PFC (dorsolateral and OFC) and in ACC. In this study, the metabolic activity in PFC predicted the subjects’ performance in tests of executive function [81]. Similarly, an NMR spectroscopic study of healthy middle-aged and elderly controls showed that BMI was negatively associated with the levels of N-acetyl-aspartate (a marker of neuronal integrity) in frontal cortex and ACC [79,82]. Brain-imaging studies comparing obese and lean individuals have also reported lower gray matter density in frontal regions (frontal operculum and middle frontal gyrus) and in post-central gyrus and putamen [83].
Another study, which found no differences in gray matter volumes between obese and lean subjects, did report a positive correlation between white matter volume in basal brain structures and waist:hip ratio; a trend that was partially reversed by dieting [84]. Finally, the role of DA in inhibitory control is well recognized and its disruption might contribute to behavioral disorders of discontrol, such as obesity. A negative correlation between BMI and striatal D2R has been reported in obese [58] as well as in overweight subjects [85]. As discussed above, the lower-than-normal availability of D2R in the striatum of obese individuals was associated with reduced metabolic activity in PFC and ACC [60]. These findings implicate neuroadaptations in DA signaling as contributors to the disruption of frontal cortical regions associated with overweight and obesity. A better understanding of these disruptions might help guide strategies to ameliorate, or perhaps even reverse, specific impairments in crucial cognitive domains. For example, delay discounting, which is the tendency to devalue a reward as a function of the temporal delay of its delivery, is one of the most extensively investigated cognitive operations in relation to disorders associated with impulsivity and compulsivity. Delay discounting has been most comprehensively investigated in drug abusers who prefer small-but-immediate over large-but-delayed rewards [86]. The few studies done in obese individuals have also shown that these individuals display preference for high, immediate rewards, despite an increased chance of suffering higher future losses [87,88]. Moreover, a positive correlation between BMI and hyperbolic discounting, whereby future negative payoffs are discounted less than are future positive payoffs, was recently reported [89]. Delay discounting seems to depend on the function of ventral striatum (where NAc is located) [90,91] and of the PFC, including OFC [92], and is sensitive to DA manipulations [93]. Interestingly, lesions of the OFC in animals can either increase or decrease the preference for immediate small rewards over delayed larger rewards [94,95]. This apparently paradoxical behavioral effect is likely to reflect the fact that at least two operations are processed through the OFC; one is salience attribution, through which a reinforcer acquires incentive motivational value, and the other is control over pre-potent urges [96]. Dysfunction of the OFC is associated with an impaired ability to modify the incentive motivational value of a reinforcer as a function of the context in which it occurs (i.e. decrease the incentive value of food with satiety), which can result in compulsive food consumption [97]. If the stimulus is highly reinforcing (such as food and food cues for an obese subject) the enhanced saliency value of the reinforcer will result in an enhanced motivation to procure it, which could appear as a willingness to delay gratification (such as spending time in long lines to buy ice cream). However, in contexts where food is readily available, the same enhanced saliency can trigger impulsive behaviors (such as buying and eating the chocolate located next to the cashier even without previous awareness of the desire for such an item). Dysfunction of the OFC (and of the ACC) impairs the ability to rein in pre-potent urges, resulting in impulsivity and an exaggerated delay discounting rate.
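The notion of a delay discounting rate can be made concrete with a small numerical sketch. It assumes the commonly used hyperbolic form V = A / (1 + kD), in which A is the reward amount, D the delay and k the individual discount rate; this formula, the function names and all numerical values below are illustrative assumptions and are not taken from the studies cited above.

```python
def discounted_value(amount, delay, k):
    """Subjective value of `amount` received after `delay`, hyperbolic form A / (1 + k*D)."""
    return amount / (1.0 + k * delay)


def choose(small_now, large_later, delay, k):
    """Return which option has the higher subjective value for discount rate k."""
    later_value = discounted_value(large_later, delay, k)
    return "immediate" if small_now > later_value else "delayed"


# Hypothetical choice: 50 units now versus 100 units after a delay of 30 time units.
for k in (0.01, 0.5):
    print(f"k={k}: prefers the {choose(50, 100, 30, k)} reward")
```

Under these assumptions a low discount rate favors the larger delayed reward, whereas a high rate shrinks its subjective value enough that the smaller immediate reward wins, which is the pattern described above as an exaggerated discounting rate.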
Food for thought
It would appear, from the collected evidence presented here, that a substantial fraction of obese individuals exhibit an imbalance between an enhanced sensitivity of the reward circuitry to conditioned stimuli linked to energy-dense food and impaired function of the executive control circuitry that weakens inhibitory control over appetitive behaviors. Regardless of whether this imbalance causes, or is caused by, pathological overeating, the phenomenon is reminiscent of the conflict between the reward, conditioning and motivation circuits and the inhibitory control circuit that has been reported in addiction [98]. Knowledge accumulated during the past two decades of the genetic, neural and environmental bases of obesity leaves no doubt that the current crisis has sprouted from the disconnect between the neurobiology that drives food consumption in our species and the richness and diversity of food stimuli driven by our social and economic systems. The good news is that understanding the deep-seated behavioral constructs that sustain the obesity epidemic holds the key to its eventual resolution (see also Boxes 3 and 4).

Box 3. Future basic research directions
• A better understanding of the interaction at the molecular, cellular, and circuit levels between the homeostatic and reward processes that regulate food intake.
• Understanding the role of genes in modulating the homeostatic and the reward responses to food.
• A better understanding of the involvement of other neurotransmitters, such as cannabinoids, opioids, glutamate, serotonin and GABA, in the long-lasting changes that occur in obesity.
• Investigating the developmental aspects of the neurobiology underlying food intake (homeostatic and rewarding) and its sensitivity to environmental food exposure.
• Understanding the epigenetic modifications in neuronal circuits involved with the homeostatic and rewarding control of food intake in the fetal brain in response to exposure to food excess and food deprivation during pregnancy.
• Investigating neuroplastic adaptations in homeostatic and reward circuits associated with chronic exposure to highly palatable foods and/or to high quantities of calorie-dense food.
• Investigating the relationship between homeostatic and hedonic processes regulating food intake and physical activity.

Box 4. Future clinical research directions
• Studies to ascertain whether the greater activation of reward-associated areas in response to food-related cues in obese individuals underlies their vulnerability for overeating or reflects a secondary neuroadaptation to overeating.
• It is suggested that enhanced dopaminergic neurotransmission contributes to improved eating behavior through optimization and/or strengthening of cognitive control mechanisms mediated in part through the PFC; however, further research is needed into the currently ill-defined mechanisms involved.
• Diet alone is seldom a path to successful (i.e. sustainable) weight loss. It would be instructive to address whether: (i) dieting can trigger a withdrawal syndrome that increases the risk of relapse; and (ii) the decreased leptin levels associated with diet-induced weight loss lead to hyperactivation of reward circuitry and compensatory food seeking behaviors.
• Research to determine the neurobiology that underlies decreases in food craving and hunger following bariatric surgery.

References
1 Ogden, C.L. et al. (2006) Prevalence of overweight and obesity in the United States, 1999 to 2004. JAMA 295, 1549–1555
2 Flegal, K.M. et al. (2010) Prevalence and trends in obesity among US adults, 1999–2008. JAMA 303, 235–241
3 Finkelstein, E.A. et al. (2009) Annual medical spending attributable to obesity: payer- and service-specific estimates. Health Aff. 28, w822–w831
4 Baessler, A. et al. (2005) Genetic linkage and association of the growth hormone secretagogue receptor (ghrelin receptor) gene in human obesity. Diabetes 54, 259–267
5 Silventoinen, K. and Kaprio, J. (2009) Genetics of tracking of body mass index from birth to late middle age: evidence from twin and family studies. Obes. Facts 2, 196–202
6 Speliotes, E. et al. (2010) Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 42, 937–948
7 Thorleifsson, G. et al. (2009) Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat. Genet. 41, 18–24
8 Naukkarinen, J. et al. (2010) Use of genome-wide expression data to mine the ‘Gray Zone’ of GWA studies leads to novel candidate obesity genes. PLoS Genet. 6, e1000976
9 Gosnell, B. and Levine, A. (2009) Reward systems and food intake: role of opioids. Int. J. Obes. 33 (Suppl. 2), S54–S58
10 van Vliet-Ostaptchouk, J.V. et al. (2009) Genetic variation in the hypothalamic pathways and its role on obesity. Obes. Rev. 10, 593–609
11 Blouet, C. and Schwartz, G.J. (2010) Hypothalamic nutrient sensing in the control of energy homeostasis. Behav. Brain Res. 209, 1–12
12 Coll, A.P. et al. (2007) The hormonal control of food intake. Cell 129, 251–262
13 Dietrich, M. and Horvath, T. (2009) Feeding signals and brain circuitry. Eur. J. Neurosci. 30, 1688–1696
14 Belgardt, B. et al. (2009) Hormone and glucose signalling in POMC and AgRP neurons. J. Physiol. 587 (Pt 22), 5305–5314
15 Goldstone, A.P. (2006) The hypothalamus, hormones, and hunger: alterations in human obesity and illness. Prog. Brain Res. 153, 57–73
16 Rolls, E. (2005) Taste, olfactory, and food texture reward processing in the brain and obesity. Int. J. Obes. 85, 45–56
17 Rolls, E.T. (2008) Functions of the orbitofrontal and pregenual cingulate cortex in taste, olfaction, appetite and emotion. Acta Physiol. Hung. 95, 131–164
18 Petrovich, G.D. et al. (2005) Amygdalar and prefrontal pathways to the lateral hypothalamus are activated by a learned cue that stimulates eating. J. Neurosci. 25, 8295–8302
19 Farooqi, I.S. et al. (2007) Leptin regulates striatal regions and human eating behavior. Science 317, 1355
20 Baicy, K. et al. (2007) Leptin replacement alters brain response to food cues in genetically leptin-deficient adults. Proc. Natl. Acad. Sci. U. S. A. 104, 18276–18279
21 Passamonti, L. et al. (2009) Personality predicts the brain’s response to viewing appetizing foods: the neural basis of a risk factor for overeating. J. Neurosci. 29, 43–51
22 Volkow, N.D. et al. (2008) Overlapping neuronal circuits in addiction and obesity: evidence of systems pathology. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 363, 3191–3200
23 Lenoir, M. et al. (2007) Intense sweetness surpasses cocaine reward. PLoS One 2, e698
24 Cason, A.M. et al. (2010) Role of orexin/hypocretin in reward-seeking and addiction: implications for obesity. Physiol. Behav. 100, 419–428
25 Cota, D. et al. (2006) Cannabinoids, opioids and eating behavior: the molecular face of hedonism? Brain Res. Rev. 51, 85–107
26 Atkinson, T. (2008) Central and peripheral neuroendocrine peptides and signalling in appetite regulation: considerations for obesity pharmacotherapy. Obes. Rev. 9, 108–120
27 Wise, R. (2006) Role of brain dopamine in food reward and reinforcement. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 361, 1149–1158
28 Small, D.M. et al. (2003) Feeding-induced dopamine release in dorsal striatum correlates with meal pleasantness ratings in healthy human volunteers. Neuroimage 19, 1709–1715
29 Norgren, R. et al. (2006) Gustatory reward and the nucleus accumbens. Physiol. Behav. 89, 531–535
30 Epstein, L. et al. (2009) Habituation as a determinant of human food intake. Psychol. Rev. 116, 384–407
31 Schultz, W. (2010) Dopamine signals for reward value and risk: basic and recent data. Behav. Brain Funct. 6, 24
32 Geisler, S. and Wise, R. (2008) Functional implications of glutamatergic projections to the ventral tegmental area. Rev. Neurosci. 19, 227–244
33 Petrovich, G. (2010) Forebrain circuits and control of feeding by learned cues. Neurobiol. Learn. Mem. [Epub ahead of print, 19 October 2010]
34 Wang, G.J. et al. (2009) Evidence of gender differences in the ability to inhibit brain activation elicited by food stimulation. Proc. Natl. Acad. Sci. U. S. A. 106, 1249–1254
35 Volkow, N.D. et al. (2002) ‘Nonhedonic’ food motivation in humans involves dopamine in the dorsal striatum and methylphenidate amplifies this effect. Synapse 44, 175–180
36 Berridge, K. (2009) ‘Liking’ and ‘wanting’ food rewards: brain substrates and roles in eating disorders. Physiol. Behav. 97, 537–550
37 Szczypka, M.S. et al. (2001) Dopamine production in the caudate putamen restores feeding in dopamine-deficient mice. Neuron 30, 819–828
38 Faure, A. et al. (2008) Mesolimbic dopamine in desire and dread: enabling motivation to be generated by localized glutamate disruptions in nucleus accumbens. J. Neurosci. 28, 7148–7192
39 Saddoris, M. et al. (2009) Associatively learned representations of taste outcomes activate taste-encoding neural ensembles in gustatory cortex. J. Neurosci. 29, 15386–15396
40 Smith, K.S. and Berridge, K.C. (2007) Opioid limbic circuit for reward: interaction between hedonic hotspots of nucleus accumbens and ventral pallidum. J. Neurosci. 27, 1594–1605
41 Wassum, K.M. et al. (2009) Distinct opioid circuits determine the palatability and the desirability of rewarding events. Proc. Natl. Acad. Sci. U. S. A. 106, 12512–12517
42 Avena, N.M. et al. (2008) Evidence for sugar addiction: behavioral and neurochemical effects of intermittent, excessive sugar intake. Neurosci. Biobehav. Rev. 32, 20–39
43 Graillon, A. et al. (1997) Differential response to intraoral sucrose, quinine and corn oil in crying human newborns. Physiol. Behav. 62, 317–325
44 Richard, D. et al. (2009) The brain endocannabinoid system in the regulation of energy balance. Best Pract. Res. Clin. Endocrinol. Metab. 23, 17–32
45 Di Marzo, V. et al. (2009) The endocannabinoid system as a link between homoeostatic and hedonic pathways involved in energy balance regulation. Int. J. Obes. 33 (Suppl. 2), S18–S24
46 Matias, I. and Di Marzo, V. (2007) Endocannabinoids and the control of energy balance. Trends Endocrinol. Metab. 18, 27–37
47 Garfield, A. and Heisler, L. (2009) Pharmacological targeting of the serotonergic system for the treatment of obesity. J. Physiol. 587, 48–60
48 Halford, J. et al. (2010) Pharmacological management of appetite expression in obesity. Nat. Rev. Endocrinol. 6, 255–269
49 Lam, D. et al. (2010) Brain serotonin system in the coordination of food intake and body weight. Pharmacol. Biochem. Behav. 97, 84–91
50 Lattemann, D. (2008) Endocrine links between food reward and caloric homeostasis. Appetite 51, 452–455
51 Rosenbaum, M. et al. (2008) Leptin reverses weight loss-induced changes in regional neural activity responses to visual food stimuli. J. Clin. Invest. 118, 2583–2591
52 Guthoff, M. et al. (2010) Insulin modulates food-related activity in the central nervous system. J. Clin. Endocrinol. Metab. 95, 748–755
53 Chechlacz, M. et al. (2009) Diabetes dietary management alters responses to food pictures in brain regions associated with motivation and emotion: a functional magnetic resonance imaging study. Diabetologia 52, 524–533
54 Batterham, R.L. et al. (2007) PYY modulation of cortical and hypothalamic brain areas predicts feeding behaviour in humans. Nature 450, 106–109
55 Malik, S. et al. (2008) Ghrelin modulates brain activity in areas that control appetitive behavior. Cell Metab. 7, 400–409
56 Fulton, S. et al. (2006) Leptin regulation of the mesoaccumbens dopamine pathway. Neuron 51, 811–822
57 Geiger, B.M. et al. (2009) Deficits of mesolimbic dopamine neurotransmission in rat dietary obesity. Neuroscience 159, 1193–1199
58 Wang, G.J. et al. (2001) Brain dopamine and obesity. Lancet 357, 354–357
59 Johnson, P.M. and Kenny, P.J. (2010) Dopamine D2 receptors in addiction-like reward dysfunction and compulsive eating in obese rats. Nat. Neurosci. 13, 635–641
60 Volkow, N.D. et al. (2008) Low dopamine striatal D2 receptors are associated with prefrontal metabolism in obese subjects: possible contributing factors. Neuroimage 42, 1537–1543
61 Fineberg, N.A. et al. (2010) Probing compulsive and impulsive behaviors, from animal models to endophenotypes: a narrative review. Neuropsychopharmacology 35, 591–604
62 Davis, L.M. et al. (2009) Bromocriptine administration reduces hyperphagia and adiposity and differentially affects dopamine D2 receptor and transporter binding in leptin-receptor-deficient Zucker rats and rats with diet-induced obesity. Neuroendocrinology 89, 152–162
63 Geiger, B.M. et al. (2008) Evidence for defective mesolimbic dopamine exocytosis in obesity-prone rats. FASEB J. 22, 2740–2746
64 Cordeira, J.W. et al. (2010) Brain-derived neurotrophic factor regulates hedonic feeding by acting on the mesolimbic dopamine system. J. Neurosci. 30, 2533–2541
65 Stoeckel, L. et al. (2008) Widespread reward-system activation in obese women in response to pictures of high-calorie foods. Neuroimage 41, 636–647
66 Killgore, W. and Yurgelun-Todd, D. (2005) Body mass predicts orbitofrontal activity during visual presentations of high-calorie foods. Neuroreport 31, 859–863
67 Stice, E. et al. (2008) Relation of reward from food intake and anticipated food intake to obesity: a functional magnetic resonance imaging study. J. Abnorm. Psychol. 117, 924–935
68 Wang, G. et al. (2002) Enhanced resting activity of the oral somatosensory cortex in obese subjects. Neuroreport 13, 1151–1155
69 Stice, E. et al. (2008) Relation between obesity and blunted striatal response to food is moderated by TaqIA A1 allele. Science 322, 449–452
70 Schultz, W. (2002) Getting formal with dopamine and reward. Neuron 36, 241–263
71 Cornier, M.A. et al. (2009) The effects of overfeeding on the neuronal response to visual food cues in thin and reduced-obese individuals. PLoS One 4, e6310
72 Matsuda, M. et al. (1999) Altered hypothalamic function in response to glucose ingestion in obese humans. Diabetes 48, 1801–1806
73 Bruce-Keller, A.J. et al. (2009) Obesity and vulnerability of the CNS. Biochim. Biophys. Acta. 1792, 395–400
74 Bruehl, H. et al. (2009) Modifiers of cognitive function and brain structure in middle-aged and elderly individuals with type 2 diabetes mellitus. Brain Res. 1280, 186–194
75 Gunstad, J. et al. (2007) Elevated body mass index is associated with executive dysfunction in otherwise healthy adults. Compr. Psychiatry 48, 57–61
76 Cortese, S. et al. (2008) Attention-deficit/hyperactivity disorder (ADHD) and obesity: a systematic review of the literature. Crit. Rev. Food Sci. Nutr. 48, 524–537
77 Fotuhi, M. et al. (2009) Changing perspectives regarding late-life dementia. Nat. Rev. Neurol. 5, 649–658
78 Raji, C.A. et al. (2010) Brain structure and obesity. Hum. Brain Mapp. 31, 353–364
79 Gazdzinski, S. et al. (2008) Body mass index and magnetic resonance markers of brain integrity in adults. Ann. Neurol. 63, 652–657
80 Walther, K. et al. (2010) Structural brain differences and cognitive functioning related to body mass index in older females. Hum. Brain Mapp. 31, 1052–1064
81 Volkow, N.D. et al. (2008) Inverse association between BMI and prefrontal metabolic activity in healthy adults. Obesity 17, 60–65
82 Gazdzinski, S. et al. (2009) BMI and neuronal integrity in healthy, cognitively normal elderly: a proton magnetic resonance spectroscopy study. Obesity 18, 743–748
83 Pannacciulli, N. et al. (2006) Brain abnormalities in human obesity: a voxel-based morphometric study. Neuroimage 31, 1419–1425
84 Haltia, L.T. et al. (2007) Brain white matter expansion in human obesity and the recovering effect of dieting. J. Clin. Endocrinol. Metab. 92, 3278–3284
85 Haltia, L.T. et al. (2007) Effects of intravenous glucose on dopaminergic function in the human brain in vivo. Synapse 61, 748–756
86 Bickel, W.K. et al. (2007) Behavioral and neuroeconomics of drug addiction: competing neural systems and temporal discounting processes. Drug Alcohol. Depend. 90 (Suppl. 1), S85–S91
87 Brogan, A. et al. (2010) Anorexia, bulimia, and obesity: shared decision making deficits on the Iowa Gambling Task (IGT). J. Int. Neuropsychol. Soc. 1–5
88 Weller, R.E. et al. (2008) Obese women show greater delay discounting than healthy-weight women. Appetite 51, 563–569
89 Ikeda, S. et al. (2010) Hyperbolic discounting, the sign effect, and the body mass index. J. Health Econ. 29, 268–284
90 Cardinal, R.N. (2006) Neural systems implicated in delayed and probabilistic reinforcement. Neural Netw. 19, 1277–1301
91 Gregorios-Pippas, L. et al. (2009) Short-term temporal discounting of reward value in human ventral striatum. J. Neurophysiol. 101, 1507–1523
92 Bjork, J.M. et al. (2009) Delay discounting correlates with proportional lateral frontal cortex volumes. Biol. Psychiatry 65, 710–713
93 Pine, A. et al. (2010) Dopamine, time, and impulsivity in humans. J. Neurosci. 30, 8888–8896
94 Mobini, S. et al. (2002) Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology 160, 290–298
95 Roesch, M.R. et al. (2007) Should I stay or should I go? Transformation of time-discounted rewards in orbitofrontal cortex and associated brain circuits. Ann. N. Y. Acad. Sci. 1104, 21–34
96 Schoenbaum, G. et al. (2009) A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat. Rev. Neurosci. 10, 885–892
97 Schilman, E.A. et al. (2010) The role of the striatum in compulsive behavior in intact and orbitofrontal-cortex-lesioned rats: possible involvement of the serotonergic system. Neuropsychopharmacology 35, 1026–1039
98 Volkow, N.D. et al. (2009) Imaging dopamine’s role in drug abuse and addiction. Neuropharmacology 56 (Suppl. 1), 3–8
99 Davidson, T. et al. (2009) Contributions of the hippocampus and medial prefrontal cortex to energy and body weight regulation. Hippocampus 19, 235–252
100 Forloni, G. et al. (1986) Role of the hippocampus in the sex-dependent regulation of eating behavior: studies with kainic acid. Physiol. Behav. 38, 321–326
101 Haase, L. et al. (2009) Cortical activation in response to pure taste stimuli during the physiological states of hunger and satiety. Neuroimage 44, 1008–1021
102 Massa, F. et al. (2010) Alterations in the hippocampal endocannabinoid system in diet-induced obese mice. J. Neurosci. 30, 6273–6281
103 McNay, E.C. (2007) Insulin and ghrelin: peripheral hormones modulating memory and hippocampal function. Curr. Opin. Pharmacol. 7, 628–632
104 Bragulat, V. et al. (2010) Food-related odor probes of brain reward circuits during hunger: a pilot FMRI study. Obesity 18, 1566–1571
105 Benarroch, E. (2010) Neural control of feeding behavior: overview and clinical correlations. Neurology 74, 1643–1650
106 Olszewski, P. et al. (2008) Analysis of the network of feeding neuroregulators using the Allen Brain Atlas. Neurosci. Biobehav. Rev. 32, 945–956