Psychophysiology, 48 (2011), 1. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01161.x
ANNOUNCEMENT
Starting with this January 2011 issue, Psychophysiology will increase its publication frequency from 6 to 12 issues per year. Over the past several years, submissions to the journal have more than doubled, resulting in an abundance of high-quality manuscripts. The Society for Psychophysiological Research, which publishes the journal through Wiley-Blackwell, has decided to increase the number of issues and total pages to accommodate the high number of quality submissions to the journal, providing authors with more opportunities to publish and shorter publication lag times, and scholars with access to more high-quality papers in a growing research area. At the same time, the Journal has made the following changes to the Author Guidelines regarding the length of submitted manuscripts and the preparation style:
Full-length manuscripts should not exceed 30 pages of text (including references). It is suggested that the introductory section be limited to approximately 1000 words and the discussion to 2000 words. To qualify for consideration as a Brief Report, the entire manuscript should not exceed 3000 words, including references, tables, figures, figure legends, and abstract. Each table and each figure corresponds to 250 words. Unless otherwise specified, the guideline for preparation of manuscripts is the Publication Manual of the American Psychological Association (6th edition).
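The Brief Report length rule stated in this announcement (a 3000-word ceiling, with each table and each figure counted as 250 words) amounts to simple arithmetic. The following is a minimal sketch of that calculation, not an official journal tool; the function names are hypothetical, and `text_words` is assumed to already include the abstract, references, and figure legends.

```python
# Values taken from the announcement.
BRIEF_REPORT_LIMIT = 3000          # maximum word-equivalent length
WORDS_PER_TABLE_OR_FIGURE = 250    # each table or figure counts as this many words


def brief_report_word_equivalent(text_words: int, n_tables: int, n_figures: int) -> int:
    """Return the manuscript length in word equivalents.

    text_words is assumed to cover abstract, body, references, and legends.
    """
    return text_words + WORDS_PER_TABLE_OR_FIGURE * (n_tables + n_figures)


def qualifies_as_brief_report(text_words: int, n_tables: int, n_figures: int) -> bool:
    """True if the word-equivalent length does not exceed the Brief Report limit."""
    total = brief_report_word_equivalent(text_words, n_tables, n_figures)
    return total <= BRIEF_REPORT_LIMIT


if __name__ == "__main__":
    # Hypothetical manuscript: 2200 words of text, one table, two figures.
    print(brief_report_word_equivalent(2200, 1, 2))
    print(qualifies_as_brief_report(2200, 1, 2))
```

Under this reading, a manuscript with one table and two figures has 2250 words left for all text.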
Psychophysiology, 48 (2011), 2–3. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01142.x
In Memoriam John A. Stern (1925–2010) JOHN W. ROHRBAUGH Washington University School of Medicine
John Alexander Stern died on April 3, 2010, several months after being diagnosed with pancreatic cancer, in his home and with his family gathered. Upon his death, the Society for Psychophysiological Research lost a founding member, one who remained as an active contributor to the Society throughout his long and distinguished career. John was the Society’s seventh president, hosted its annual meeting in Saint Louis in 1971, served on a number of committees, and was a frequent contributor and reviewer for the journal Psychophysiology. In 1993 he was awarded the Society’s highest honor, the award for Distinguished Contributions to Psychophysiology. This service to the Society was part of his larger role in the development of psychophysiology as an integrative discipline.

John was born January 4, 1925, in Montabour, Germany. In advance of the repression of Jews in Germany (combined with his self-described “youthful outspokenness”), he was sent at age 9 to live with relatives in Holland, then on to New York in early 1936. Upon graduation from Peter Stuyvesant High School in 1943, he was drafted into the Army of the United States in which he saw heavy combat in the Mediterranean and European theaters. He was wounded on several occasions and was awarded three Bronze Stars. Although he was reticent to talk about his military service, it clearly marked his temperament and professional career. He related how his wartime experiences fostered his interest in psychology: the impressions left on him by the extremes of behavior seen under combat conditions and the seeming impossibility of predicting them on the basis of comportment during less challenging times. And, somewhat paradoxically, they left him with a certain kind of optimism, that nothing in comparison would be beyond him going forward. After returning from the war, John enrolled under GI benefits at Hunter College in New York, where he received his B.A. degree in 1949.
This was followed by graduate studies in psychology at the University of Illinois in Urbana–Champaign, leading to a Ph.D. in 1953. It was there that he met his wife Carolyn, who was working in the University’s Student Counseling Center and who was to remain his partner through life. He is survived by Carolyn, a twin sister Helen Silber, daughters Julie Stern and Nancy Mozier, son John Stern, and their spouses, and five grandchildren. John spent his entire career at Washington University, first in the School of Medicine, where he headed the Division of Medical Psychology in the Department of Psychiatry from 1961 to 1969, after which he moved to the Department of Psychology, where he served as chair from 1987–1996. His tenure as chair was marked by a restoration of intradepartmental cooperation and rise in productivity. He oversaw the design and construction of a new psychology building, which laid the groundwork for substantial
subsequent expansion of the department. He retained an office and laboratory throughout the remainder of his career, becoming Professor Emeritus in 2000. He helped to form the company BioBehavioral Analysis Systems, as a vehicle for supporting his ongoing research. In the very first issue of this publication, John contributed a brief piece that aimed to define the domain of appropriate content for the new journal. While much discussed over the years, the thoughts expressed there have served in a much broader context to establish an identity for the discipline and to foster a research agenda. A read through his professional bibliography offers a tour of many of the themes that shaped psychophysiology as a scientific discipline and that persist as core issues. In an era of ever increasing specialization and focus, it is remarkable to reflect on a time that accommodated such breadth.
John’s early studies focused on the effects of electroconvulsive shock in rats, but with a clear emphasis at that seminal stage on the relationships among physiological and behavioral effects. It is to the benefit of human psychophysiology that John was forced to move away from these studies because he developed allergies to his rat subjects and so shifted to human research. Along the way, there were studies relating to the psychiatric issues of mental disorders, stress, and acute and chronic effects of drugs and common substances, but ranging widely to include studies of acupuncture and hypnosis, development and aging, hemispheric specialization, sleep, and sexual behavior. This work served to crystallize his interests in the integrative aspects of physiological responses in multiple response systems and the degree to which they supported differentiation among emotions, cognitions, and motivational states. His focus on the basic processes of conditioning, orienting, and habituation evolved to a broader perspective encompassing complex behaviors, including reading and other processes involved in the acquisition and processing of information. The final three decades of his career emphasized oculomotor activity. He was fond of saying that this interest evolved from a series of electroencephalogram (EEG) studies in which he became concerned with the then customary procedure of requiring subjects to suppress eye movements and blinks (in the interest of minimizing the attendant artifact). John felt that this risked artificially truncating the associated mental processes and that the ocular “artifact” was of interest in its own right. He went on to demonstrate in a landmark series of studies and reviews that even the simple eyeblink was not randomly distributed in time, but could be used to infer aspects regarding the timing and quality of cognitive processes.
Late in his career he increasingly turned to using oculomotor measures in applied settings, to study problems of fatigue, situational awareness, air traffic control, driving, piloting, and the detection of deception. Even as a graduate student, he had carried aloft a primitive forensic polygraph to record physiological responses in pilots during simulated emergencies. He was instrumental in 1993 in the founding of the special interest group, Psychophysiology in Ergonomics, to foster this type of research. Throughout, his career was marked by a level of technical sophistication, a “gadgeteering” that he admired in his predecessors. His publications include an early description of a new stabilometer, and he readily adopted then novel methods for sensing skin temperature and blood volume. He was among the pioneers in applying computerized methods to the analysis of the EEG and eye movements, capitalizing on the availability of the LINC computer at Washington University (as a principal development site). He enthusiastically adopted camera-based hardware and software methods for recording oculomotor activity and interacted with major commercial developers to improve their products. In recognition that head movements are an essential aspect of the gaze control system, he developed a simple laboratory apparatus that allowed him to record head movements and enfold them into his measures.

The dry facts of his curriculum vitae, impressive as they are, fail to capture the lively essence of his spirit as a friend and colleague. He was extremely generous, certainly to me but also to a wide net of associates around the world. Over his career, he had intense and lengthy collaborations with Czech, Russian, Japanese, and Chinese investigators, as well as those in the United States, often hosting them in Saint Louis for extended stays. He was a committed teacher and mentor; one of his legacies is the John Stern Fund for Undergraduate Research at Washington University, which he established with a substantial donation. He was always optimistic, always supportive, always available. He was also committed to the arts and a number of social causes in the community beyond Washington University.

I had the wonderful privilege of working closely with John for the past 20 years, often on a daily basis. If there was any disengagement as he aged, any dimming of the force of his intellect, I never saw it. He remained a productive scientist, continued to publish, secured research funding, and served as reviewer and consultant.
His absences from the laboratory grew a bit longer and a bit more frequent because he followed his love of travel with Carolyn, but there was never any real winding down. Upon the passing of Chester Darrow, John eulogized in this journal many years ago (1967) that, “though he may have been grandfatherly in age at the time of his death he was always young in ideas and actively involved in research.” I think that John would see this, with pride, as a fitting tribute to him as well.
Psychophysiology, 48 (2011), 4–22. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01114.x
Auditory processing that leads to conscious perception: A unique window to central auditory processing opened by the mismatch negativity and related responses
RISTO NÄÄTÄNEN,a,b,c TEIJA KUJALA,c and ISTVÁN WINKLER,c,d,e
a Department of Psychology, University of Tartu, Tartu, Estonia
b Centre of Integrative Neuroscience (CFIN), University of Aarhus, Aarhus, Denmark
c Cognitive Brain Research Unit, Institute of Behavioural Sciences, University of Helsinki, Helsinki, Finland
d Department of Experimental Psychology, Institute for Psychology, Hungarian Academy of Sciences, Budapest, Hungary
e Institute of Psychology, University of Szeged, Hungary
Abstract
In this review, we will present a model of brain events leading to conscious perception in audition. This represents an updated version of Näätänen’s previous model of automatic and attentive central auditory processing. This revised model is mainly based on the mismatch negativity (MMN) and N1 indices of automatic processing, the processing negativity (PN) index of selective attention, and their magnetoencephalographic (MEG) and functional magnetic resonance imaging (fMRI) equivalents. Special attention is paid to determining the neural processes that might underlie conscious perception and the borderline between automatic and attention-dependent processes in audition.
Descriptors: Central auditory processing, Event-related potential (ERP), Mismatch negativity (MMN), N1
One of the most exciting issues of modern cognitive neuroscience is the division into, and the borderline between, brain processes that underlie conscious experience and those that do not. For example, how much of auditory processing occurs outside of our attention and conscious experience and, further, what are the brain mechanisms that determine whether conscious perception occurs or not? These questions are the focus of the present review, which will present a model of preconscious and conscious processes in audition that aims at determining the functional borderline between the two processing modes. This model is an extension of Näätänen’s (1990) model on attention and automaticity in auditory processing. The principal tool used in developing this model is the mismatch negativity (MMN) (Näätänen, Gaillard, & Mäntysalo, 1978). The (auditory) MMN is a fronto-centrally negative event-related potential (ERP) component that is elicited by sounds that violate the automatic predictions of the central auditory system. The MMN and its magnetoencephalographic (MEG) equivalent, the MMNm (Hari et al., 1984), provide a unique window to preconscious central auditory processing. Results obtained for the N1 (Näätänen & Picton, 1987) and the processing negativity (PN) described by Näätänen et al. (1978) will also be used in specifying this model.

Before introducing the model, it is necessary to examine the functional significance and separability of these overlapping brain responses, because these are central to the model. Moreover, this discussion will help in interpreting the roles that the processes that generate these responses play in the model.
The mismatch negativity (MMN). The MMN was initially isolated from the “N2” (Ford, Kopell, et al., 1976a, 1976b; Simson, Vaughan, & Ritter, 1976, 1977; Squires, Squires, & Hillyard, 1975; Squires, Wickens, Squires, & Donchin, 1976) and the “N2-P3a” (Snyder & Hillyard, 1976) wave complexes, which are typically elicited in auditory oddball sequences, by Näätänen et al. (1978; see also Näätänen, 1975) through the use of deviant-standard difference waveforms. In contrast, the N1, an obligatory fronto-centrally negative-polarity response that peaks at about 100 ms from sound onset, manifests as a separate ERP peak. The MMN and its magnetic counterpart MMNm usually become clearly visible only through a subtraction procedure, in which the ERP response to some control stimulus, such as the frequent stimulus (“standard”), is subtracted from the response elicited by the infrequent stimulus (“deviant”). For a review of the proper control for deriving the MMN, see Kujala, Tervaniemi, and Schröger (2007). Further, Näätänen, Simpson, and Loveless (1982) showed that, after the MMN (“N2a”) is removed from the N2 wave complex by subtraction, the remaining waveform can be identified as the “N2b” response. The N2b response has a somewhat posterior topography compared to the N1 and also to
This research was supported by The Philips Nordic Prize for 2007 for Achieved and Continued Research Work on Neurodevelopmental Disorders, the Academy of Finland (projects 1122745 and 1128840), and the European Commission (ICT-FP7-231168). Address correspondence to: Risto Näätänen, Cognitive Brain Research Unit, Institute of Behavioural Sciences, P. O. Box 9 (Siltavuorenpenger 1B), 00014 University of Helsinki, Finland. E-mail:
[email protected]
the MMN and, together with the accompanying P3a, forms the N2-P3a or the N2b-P3a complex. The N2b-P3a complex is elicited by deviants when the stimulus sequence is attended or when there is an “attention leak” to the to-be-ignored channel, which has been reviewed by Näätänen and Gaillard (1983). For a schematic illustration of the different ERP components obtained in the oddball paradigm, see Figure 1. The MMN was initially interpreted as being generated by an automatic memory-based change-detection mechanism that operates independently of the listener’s attention or behavioral
[Figure 1 image: schematic ERP components (left panels) and resulting ERP waveforms (right panels) for standards and deviants at midline electrodes Fz, Cz, and Pz. IGNORE condition: N1-P2 and MMN. ATTEND condition: N1-P2, MMN, N2b-P3a, and P3b, with P165 and frontal and parietal slow waves. Time axis from −100 to 400 ms relative to stimulus onset; negativity plotted upward.]
Figure 1. Top left: Different ERP components (contributions to the scalp-recorded ERP of separable generator processes; Näätänen & Picton, 1987) in the oddball paradigm are schematically illustrated, separately for standards and deviants, for midline electrodes Fz, Cz, and Pz in the IGNORE condition. These components are elicited even in the absence of attention and also when attention is strictly controlled, which reflects fully automatic processing in audition. Note that the N1 and P2 components might differ between deviants and standards, depending on the nature and probability of the stimuli used. Top right: ERP waveforms recorded from these midline electrodes composed of the components illustrated on the left are shown. Note also the much more frontal midline scalp topography of the MMN relative to that of the N1. Bottom left: The component structure of the ERPs recorded on the same midline electrodes is illustrated in the ATTEND condition. The most important difference to the IGNORE condition is the addition of the N2b-P3a complex, which is often preceded by a P165 (Goodin, Squires, Henderson, & Starr, 1978), and the slow frontal negative and parietal positive waves. Depending on instructions, attention may also enhance the N1 and MMN amplitudes. Bottom right: ERP waveforms recorded from these midline electrodes composed of the components illustrated on the left are shown. Source: Näätänen (1986).
goals (Näätänen et al., 1978; Näätänen & Michie, 1979; Näätänen, 1975), even though some studies (Woldorff, Hackley, & Hillyard, 1991; Woldorff, Hillyard, Gallen, Hampson & Bloom, 1998) showed that under some conditions, the MMN amplitude can be attenuated by strongly focusing attention to some other stimulus sequence. For two recent reviews of the attention-MMN relationship, see Haroush, Hochstein, and Deouell (2010) and Sussman (2007). Further, according to this prevailing interpretation (Näätänen, Paavilainen, Rinne, & Alho, 2007; Winkler, 2007), the MMN is based on a memory trace that encodes the repetitive aspects (termed regularity) of the most recent auditory stimulation. The MMN is elicited when the auditory input does not match the actual or predicted sensory information encoded in this trace (Grimm & Schröger, 2007; Tervaniemi, Maury, & Näätänen, 1994a). The most recent interpretation of the MMN emphasizes the active role of the memory trace assumed to be used in MMN generation. The MMN is elicited by a mismatch between the auditory input and the predictions formed on the basis of the trends or rules that are automatically detected in the recent auditory stimulation (Näätänen, 1992; Näätänen & Winkler, 1999; Winkler, Karmos, & Näätänen, 1996; Winkler, Denham, & Nelken, 2009a; Winkler, 2007). The biological significance of the MMN-generation process might be the automatic switching of the organism’s attention to auditory change. This interpretation is supported by transient deteriorations in primary-task performance that accompany MMN elicitation by changes in irrelevant auditory background stimulation (Escera, Corral, & Yago, 2002; Escera, Yago, Corral, Corbera, & Nuñez, 2003; Schröger, 1996, 1997; Yago, Escera, Alho, & Giard, 2001).
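The subtraction procedure described earlier for making the MMN visible (the averaged response to the standard subtracted from the averaged response to the deviant) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' analysis pipeline: epochs are assumed to be equal-length lists of voltage samples from one electrode, the function names are hypothetical, and the 100–250 ms search window is an arbitrary illustrative choice.

```python
from statistics import fmean


def average_erp(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]


def difference_wave(deviant_epochs, standard_epochs):
    """Deviant-minus-standard difference waveform."""
    deviant_erp = average_erp(deviant_epochs)
    standard_erp = average_erp(standard_epochs)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def peak_negativity(wave, times_ms, window=(100, 250)):
    """Return (latency_ms, amplitude) of the most negative sample in the window.

    The window bounds are an assumption chosen for illustration only.
    """
    start, end = window
    candidates = [(amp, t) for amp, t in zip(wave, times_ms) if start <= t <= end]
    amplitude, latency = min(candidates)
    return latency, amplitude


if __name__ == "__main__":
    # Toy data (made up): both stimulus types share an N1-like deflection
    # at 100 ms; only deviants carry an extra negativity at 150 ms.
    times_ms = [0, 50, 100, 150, 200, 250, 300]
    standards = [[0, 0, -2, 0, 0, 0, 0], [0, 0, -4, 0, 0, 0, 0]]
    deviants = [[0, 0, -3, -2, 0, 0, 0], [0, 0, -3, -4, 0, 0, 0]]
    wave = difference_wave(deviants, standards)
    print(wave)
    print(peak_negativity(wave, times_ms))
```

In the toy data the shared deflection at 100 ms cancels in the subtraction, which is the point of the procedure; the separability discussion below concerns cases where N1 differences between deviants and standards do not cancel so cleanly.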
It is possibly the frontal MMN subcomponent (Deouell, 2007; Giard, Perrin, Pernier, & Bouchet, 1990; Gomot, Giard, Roux, Barthelemy, & Bruneau, 2000; Jääskeläinen, Alho, Escera, Winkler, Sillanaukee, & Näätänen, 1996b; Jääskeläinen, Pekkonen, Hirvonen, Sillanaukee, & Näätänen, 1996a; Jääskeläinen, Varonen, Näätänen, & Pekkonen, 1999; Molholm, Martinez, Ritter, Javitt, & Foxe, 2005; Rinne, Alho, Ilmoniemi, Virtanen, & Näätänen, 2000; Tse & Penney, 2008), with an onset that follows that of the auditory-cortical MMN subcomponent by 10–20 ms (Rinne et al., 2000; Tse & Penney, 2008), that is generated by the attention-call process (Öhman, 1979) to auditory deviance, as suggested by Giard et al. (1990). This is supported by, among other things, the fact that one of the important frontal-lobe functions is the control of the direction of attention (Fuster, 1989; Knight, 1991; Stuss & Benson, 1986). Furthermore, the involvement of the frontal cortex in MMN generation is also supported by results that show that lesions of dorsolateral prefrontal cortex attenuate the MMN amplitude (Alain, Woods, & Knight, 1998; Alho, Woods, Algazi, Knight, Näätänen, et al., 1994). Jääskeläinen, Alho, et al. (1996), Jääskeläinen, Pekkonen, et al. (1996), and Jääskeläinen et al. (1999) also demonstrated the role of the generator process of the frontal MMN component in attention switching. They found that even a moderate dose of alcohol selectively eliminated this frontal component, leaving the auditory-cortex component intact, and simultaneously abolished the distracting effect of noise on the hit rate in the primary task that was observed in the absence of alcohol. Hence, ethanol blocks the route of auditory distraction to the involuntary attention-switching system reflected by the frontal MMN component. Further evidence implicating the role of the frontal MMN subcomponent in attention switching was provided by data obtained from closed head injury patients. These data show an association between a pathologically strong frontal MMN-generator process and a pathologically sensitized involuntary attention switching (Kaipio et al., 2000). However, the auditory-cortical component was unaffected (Kaipio et al., 2000).

The memory trace that encodes sensory information of the preceding stimuli assumed to be involved in MMN generation usually lasts for a few seconds (Böttscher-Gandor & Ullsperger, et al., 1992; Cheour et al., 2002; Cooper, Todd, McGill, & Michie, 2006; Glass, Sachse, & von Suchodoletz, 2008a, 2008b; Gomes et al., 1999; Grau, Escera, Yago, & Polo, 1998; Pekkonen et al., 1996; Ritter, Deacon, Gomes, Javitt, & Vaughan, 1995; Sams, Hari, Rif, & Knuutila, 1993). Thereafter no MMN can be elicited, unless the trace is reactivated by a “reminder” stimulus (Winkler & Cowan, 2005). Very importantly, no MMN can be elicited before this trace has been developed, that is, before the regular aspects of the auditory input have been extracted from the sound sequence (Bendixen, Roeber, & Schröger, 2007; Bendixen & Schröger, 2008; Cowan, Winkler, Teder, & Näätänen, 1993; Sams et al., 1985). Further, deviance based on any feature difference or combination of feature differences that the listener is able to discriminate elicits the MMN (Deacon, Nousak, Pilotti, Ritter, & Yang, 1998; Näätänen & Alho, 1995, 1997). This suggests that the memory trace in question encodes the results of the full analysis of the acoustic features, including their integration into a unitary sensory-memory representation. In contrast, the refractoriness patterns of similar duration that account for the N1 adaptation effects probably encode acoustic features separately, and thus serve as buffers to the sensory data provided by the different feature detectors as a necessary prerequisite for auditory feature integration (Näätänen & Winkler, 1999).
Separability of N1 and MMN
Whether the MMN and the N1 are separable has been discussed since the discovery of the MMN. Recently, within the framework of this debate, it has been suggested (Jääskeläinen et al., 2004; Jääskeläinen, Ahveninen, Belliveau, Raij, & Sams, 2007; May & Tiitinen, 2009) that the deviant-standard difference wave can be fully explained by the N1 difference between deviants and standards. In contrast, new computational modeling results clearly separate N1- and MMN-related neural activity within the deviant-minus-standard difference wave (Friston & Kiebel, 2009; Garrido et al., 2008; Garrido, Kilner, Kiebel, & Friston, 2009a). The issue of separability raises two questions: (1) Can the observable deviant-minus-control subtraction waveform be explained by differences in the N1 components elicited by the two stimulus events? (2) Does one need to assume the existence of a memory trace to account for the MMN results obtained during the past ca. 30 years? In the following, we shall show that the N1 and MMN ERP responses can be separated. Further, we shall show that the two discrete responses reflect different types of memory traces, both of which are important for understanding preconscious and conscious central auditory processing in the human brain. These will be described in the model we present. Furthermore, these processes can be separated from voluntary (conscious) operations on auditory information, as shown by a third important ERP response, the PN.

We shall start by reviewing ERP studies that indicate that the MMN can be observed under conditions in which there can be no systematic N1 differences between deviants and standards. This is the case when deviants differ from standards in terms of higher-order categories, or when deviants violate higher-order sequential contingency rules. In the next section, we will review evidence that shows that MMN is elicited or enhanced with no systematic acoustic difference. For instance, this occurs when speech-sound deviants are presented to listeners who speak or do not speak the language involved, or when deviant sounds violate linguistic or music-related rules.

Language-specific MMNs. Several studies compared MMNs elicited by acoustically identical speech stimuli between native speakers of a language and control subjects who did not speak that language (Cheour et al., 1998; Dehaene-Lambertz, 1997; Dehaene-Lambertz, Dupoux, & Gout, 2000; Näätänen et al., 1997; Pulvermüller et al., 2001; Sharma & Dorman, 2000; Winkler et al., 1999). The MMN difference obtained between the two groups cannot be explained by acoustic N1-related factors that were the same for the two groups. For instance, when Pulvermüller et al. (2001) instructed Finnish subjects to ignore sounds and to watch a silent movie, they found that the MMNm to the same spoken Finnish syllable as a deviant stimulus was larger in amplitude when it ended a Finnish word than when it ended a pseudoword. In contrast, this effect did not occur in foreign participants who understood no Finnish. Moreover, the major intracranial source of this word-related MMNm was located in the left superior temporal lobe and it was clearly separable from the N1m locus, which demonstrates an MMN (MMNm) that could not be derived from the N1 response. For further MMNm data supporting this conclusion, see Shestakova et al. (2002) who found a left-hemispheric vowel-category MMNm with 150 randomized, acoustically varying exemplars in each vowel category.

MMN to syntactic and semantic violations. The N1-independent generation of the MMN is also shown by studies that demonstrate the automatic processing of grammar.
For instance, in Pulvermüller and Shtyrov’s (2003) study that used grammatical and ungrammatical items as deviant stimuli, the MMNm was enhanced in amplitude for grammatical violations as compared with that elicited by grammatically correct deviants. This MMNm, with its main source in the left frontal cortex, indicated that the MMN mechanism was engaged when these grammar effects were elicited. The authors related this syntactic MMNm to the differential activation of neuronal memory traces for grammatical word sequences called “sequence detectors” (Bonte, Mitterer, Zellagui, Poelmans, & Blomert, 2005; Mitterer & Blomert, 2003; Pulvermüller & Shtyrov, 2003). Subsequent studies confirmed and extended this initial finding to different kinds of syntactic and even to semantic violations (Gunter, Friederici, & Hahne, 1999; Hasting, Kotz, & Friederici, 2007; Menning et al., 2005; Pulvermüller & Assadollahi, 2007; Shtyrov, Pulvermüller, Näätänen, & Ilmoniemi, 2003; for reviews, see Pulvermüller, 2001; Pulvermüller & Knoblauch, 2009; Pulvermüller & Shtyrov, 2006; Pulvermüller, Shtyrov, & Hauk, 2009). See also the “early left anterior negativity” (ELAN) described by Friederici and her colleagues (Eckstein & Friederici, 2006; Friederici, 1995, 2002, 2004; Friederici et al. 1993, 1996, 2004; Rossi et al., 2006), which was elicited by syntactic violations as early as at 100–150 ms from the violation onset and which was not affected by attentional factors (Hahne & Friederici, 1999), hence closely resembling the MMNs to syntactic violations reviewed above.
In addition, evidence converging with results from language studies was obtained in the research on the automatic processing of musical syntax (Koelsch, Gunter, Schröger, & Friederici, 2003; Koelsch, Grossman, et al., 2003; Leino et al., 2007; Loui et al., 2005). In these studies, chords with an irregular harmonic function that violated the rules of Western music, presented within sequences of in-key chords, elicited the “early right anterior negativity” (ERAN). This, in turn, has been denoted as the “music-syntactic MMN” (Koelsch, Gunter, et al., 2003; Koelsch, Grossman, et al., 2003; Koelsch & Siebel, 2005; Münte, Altenmüller, & Jäncke, 2002; Tervaniemi & Brattico, 2004). In many further studies, the MMN was elicited by violating some musical regularity, while acoustic deviance and its related N1 response difference were controlled. For example, in a study by Tervaniemi, Rytkönen, Schröger, Ilmoniemi, and Näätänen (2001), subjects were presented with standard stimuli that consisted of melodic patterns that randomly occurred at very different frequency levels, which simulates a melody transposed to different keys. Nevertheless, occasional slight contour changes in patterns that widely varied in frequency also elicited the MMN but only in “musical” subjects. This finding was subsequently confirmed and extended to different types of musical material (Fujioka, Trainor, Ross, Kakigi, & Pantev, 2004; Tervaniemi, Castaneda, Knoll, & Uther, 2006; Trainor, McDonald, & Alain, 2002) and was even observed in newborn babies (Stefanics et al., 2009). In addition, violating a rhythm elicits an earlier MMN response in adults when the rhythm is violated at a more salient position of the metric hierarchy even when the acoustic deviation is equalized (Ladinig, Honing, Háden, & Winkler, 2009).
In newborn infants, rhythmic violations occurring at salient metric positions elicited the MMN, whereas violations at non-salient positions did not (Winkler, Háden, Ladinig, Sziller, & Honing, 2009b).

MMN to violations of complex sequential stimulus-contingency rules. In this section, we review evidence that shows that MMN is elicited by violating complex sequential rules when no acoustic change is associated with the deviance. In the first of these studies, Paavilainen, Arajärvi, and Takegata (2007) presented their subjects, instructed to ignore sounds, with sounds that varied in two dimensions, duration and frequency: stimuli were short (50 ms) or long (150 ms), and low (1000 Hz) or high (1500 Hz). All combinations (short-low, short-high, long-low, long-high) were equiprobably presented with a silent inter-stimulus interval (ISI) of 300 ms. The stimulus sequences were constructed so that the duration of each stimulus, which was randomly either short or long, predicted the frequency of the next stimulus so that: (1) if the present stimulus is short in duration, then the subsequent stimulus will be low in frequency; and (2) if the present stimulus is long in duration, then the subsequent stimulus will be high in frequency. Occasional deviant events broke these rules, for example, a high-pitched stimulus following a short stimulus. In this design, all four stimulus combinations could occur as either a standard or a deviant event, depending on the duration of the preceding stimulus. Only the deviant events elicited the MMN. This MMN reversed its polarity at the mastoids, which suggested a source in the auditory cortex. Corroborating results were obtained from studies that used even more complex stimulus-sequence rules (Bendixen, Roeber, & Schröger, 2007; Bendixen, Prinz, Horváth, Trujillo-Barreto, & Schröger, 2008; Bendixen & Schröger, 2008; Schröger, Bendixen, Trujillo-Barreto, & Roeber, 2007).
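The contingency rule in the Paavilainen et al. (2007) design, in which the duration of each stimulus predicts the frequency of the next one, can be made concrete with a small sequence generator. This is a hedged sketch rather than the authors' stimulation script: the durations and frequencies come from the text above, while the deviant probability, the handling of the first stimulus, the seeding, and all function names are assumptions introduced for illustration.

```python
import random

# Stimulus values reported in the text.
SHORT_MS, LONG_MS = 50, 150
LOW_HZ, HIGH_HZ = 1000, 1500


def predicted_frequency(previous_duration_ms):
    """Rule: short predicts a low next tone, long predicts a high next tone."""
    return LOW_HZ if previous_duration_ms == SHORT_MS else HIGH_HZ


def make_sequence(n_stimuli, deviant_probability=0.1, seed=0):
    """Generate a list of stimuli following the duration-to-frequency rule.

    deviant_probability is an assumed value, not taken from the study.
    """
    rng = random.Random(seed)
    sequence = []
    previous_duration = None
    for _ in range(n_stimuli):
        # Duration is random on every trial, as described in the text.
        duration = rng.choice((SHORT_MS, LONG_MS))
        if previous_duration is None:
            # The first stimulus has no predecessor, so no rule applies.
            frequency = rng.choice((LOW_HZ, HIGH_HZ))
            is_deviant = False
        else:
            expected = predicted_frequency(previous_duration)
            is_deviant = rng.random() < deviant_probability
            if is_deviant:
                # A deviant carries the frequency the rule did not predict.
                frequency = HIGH_HZ if expected == LOW_HZ else LOW_HZ
            else:
                frequency = expected
        sequence.append(
            {"duration_ms": duration, "frequency_hz": frequency, "deviant": is_deviant}
        )
        previous_duration = duration
    return sequence


if __name__ == "__main__":
    for stimulus in make_sequence(10, seed=1):
        print(stimulus)
```

Note that a deviant here is acoustically identical to some standard elsewhere in the sequence; only its relation to the preceding stimulus marks it as a rule violation, which is why N1 differences cannot account for the MMN in this design.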
In the subsequent attend condition described by Paavilainen et al. (2007), subjects who had received no prior information regarding the rules used in constructing the sound sequences were asked to press a button upon hearing any sound they judged to be “strange” or “deviant.” Although they could detect only about 15% of the deviant events, and none of them could verbally express the rules in the later interviews, the MMN was nevertheless elicited. Hence, these results suggest that the neural mechanism that models the auditory environment may automatically learn the co-variation between the features of the successive events and make predictions of the properties of the forthcoming stimuli. If the predictions are not fulfilled, then the MMN is generated. Very recent additional MMN evidence for the predictive nature of the central auditory system has been reported by Bendixen, Schröger, and Winkler (2009) and Todd, Myers, Pirillo, and Drysdale (2010). Furthermore, Sculthorpe, Ouellet, and Campbell (2009) found that sound patterns are extracted from acoustically varying stimuli and their violations detected even in REM sleep. Consequently, the information extracted by the sensory-memory mechanisms often is in an implicit form that is not directly available to conscious processes and is difficult to express verbally, which was also confirmed by van Zuijen, Simoens, Paavilainen, Näätänen, and Tervaniemi (2006). Hence, these results are consistent with the framework originally outlined by Winkler, Karmos, and Näätänen (1996), according to which the main function of the MMN process is to adjust a neural model to the various regularities of the auditory environment. This enables the central auditory system to manage a large part of its subsequent input automatically, i.e., without requiring the limited resources of the controlled-processing system.
(For recent reviews, see Winkler, 2007, and Winkler, Denham, & Nelken, 2009a; for a very recent review of automatic sensory cognition in audition, see Näätänen et al., 2010.)

The MMN-N1 generator loci differences in humans. Evidence for the separability of the MMN and the N1 is also provided by localization data that suggest that the N1 and MMN generator loci, although adjacent, are clearly separable from one another. A large number of studies (Alho et al., 1993, 1998; Csépe, Pantev, Hoke, Hampson, & Ross, 1992; Korzyukov et al., 1999; Kropotov et al., 1995; Levänen, Hari, McEvoy, & Sams, 1993; Levänen, Ahonen, Hari, McEvoy, & Sams, 1996; Rosburg, 2003; Rosburg et al., 2004; Sams et al., 1985; Sams, Kaukoranta, Hämäläinen, & Näätänen, 1991; Scherg, Vajsar, & Picton, 1989; Tiitinen et al., 1993) obtained separable generation loci for the N1 and MMN, with the MMN (or MMNm) equivalent current dipole (ECD) in several studies being located about 1 cm anterior to that for the N1 (for a review, see Alho, 1995). For example, Scherg et al. (1989) found that the difference-wave negativity to a small frequency change could be modeled with one dipole source in the supratemporal auditory cortex of each hemisphere, whereas two dipoles in each hemisphere were needed to explain the negativity elicited by a large frequency change. According to the authors, one of these two dipoles was probably the genuine MMN generator, which was activated even by small frequency changes. The other dipole, which was activated somewhat earlier, was located posterior to the MMN generator and appeared to indicate enhanced activity of the supratemporal N1 generator. Thus, these results suggest that in the deviant-standard difference waves the early negativity may be composed of, or enhanced by, the release-from-refractoriness of the N1 neurons, whereas the later part represents a genuine MMN. Consistent with these results, optical-imaging data showed separate generators for the N1 and MMN, which corresponded to the source locations previously found with the MEG (Rinne et al., 1999; Tse & Penney, 2008). Furthermore, corroborating fMRI results were obtained by Opitz, Schröger, and von Cramon (2005) and MEG results by Maess, Jacobsen, Schröger, and Friederici (2007). Opitz et al. (2005) combined the event-related fMRI with an experimental protocol that controlled for the refractoriness effects (Campbell, Winkler, & Kujala, 2007; Jacobsen & Schröger, 2001; Jacobsen, Schröger, Horenkamp, & Winkler, 2003), and found N1-related activity in the primary auditory cortex, whereas MMN-related activity originated in nonprimary auditory areas in the anterior part of Heschl’s gyrus. The authors concluded that their experiment succeeded in separating the cognitive mechanism (the memory-comparison processes that generate a genuine MMN for frequency change and are subserved by nonprimary auditory areas in the anterior part of Heschl’s gyrus) from the contribution of the sensory mechanism associated with a differential state of refractoriness in the primary auditory cortex. These results were corroborated by Maess et al. (2007) on the basis of their MEG data that showed opposite orientations of the early and late effects. These authors concluded that the early part of the deviant-minus-standard difference for frequency change is mainly due to the sensorial, N1m-related mechanism, whereas the later part of the difference wave is mainly due to the cognitive MMNm-related mechanism. Inverse modeling revealed that sources for both contributions were bilaterally located in the gyrus temporales transversus.
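The early-versus-late distinction discussed above rests on the deviant-minus-standard difference wave. The following minimal sketch shows how such a wave and its mean amplitude in two latency windows might be computed. The function names, the toy amplitudes, and the window boundaries are all hypothetical; the studies cited here do not prescribe these values.

```python
def difference_wave(deviant_erp, standard_erp):
    """Subtract the standard ERP from the deviant ERP, sample by sample."""
    if len(deviant_erp) != len(standard_erp):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the wave over samples with start_ms <= t < end_ms."""
    samples = [v for v, t in zip(wave, times_ms) if start_ms <= t < end_ms]
    if not samples:
        raise ValueError("no samples fall inside the window")
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Toy averaged ERPs in microvolts, one sample every 50 ms; not real data.
    times_ms = [0, 50, 100, 150, 200, 250]
    standard = [0.0, -0.5, -2.0, -0.5, 0.0, 0.0]
    deviant = [0.0, -0.5, -3.0, -2.5, -1.0, 0.0]

    diff = difference_wave(deviant, standard)
    # Hypothetical windows: an early one around the N1 latency and a later one.
    early = mean_amplitude(diff, times_ms, 75, 125)
    late = mean_amplitude(diff, times_ms, 125, 225)
    print("difference wave:", diff)
    print("early window mean:", early)
    print("late window mean:", late)
```

A negativity confined to the early window would be compatible with an N1 refractoriness contribution, whereas a negativity in the later window is what the text attributes to the genuine MMN. Separating the two in practice requires the control protocols and source modeling described above, which this sketch does not attempt.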
These MEG results suggested distinct but temporally and spatially partially overlapping activities of sensorial (non-comparator-based) and cognitive (comparator-based) mechanisms of automatic frequency-change detection in the auditory cortex (as also reported by Rosburg et al., 2004). According to Schröger (1997), the function of those non-primary areas that generate the genuine MMN might be to establish a sparse representation of simple and complex invariants inherent in the recent stimulation, thereby providing the neural basis for memory comparison. Both sensorial and cognitive mechanisms contribute to pitch-change detection in the classic oddball paradigm. According to Schröger (1997), this parallel use of two different mechanisms in the service of the same function underlines the biological importance of preattentively detecting changes in the auditory environment. Furthermore, Alho et al. (1996) observed the MMNm response to a change within a sound pattern, in addition to that elicited by a change in one frequency element of a music chord, with supratemporal sources anterior to the N1 source. Moreover, a number of further studies (Alain, Achim, & Woods, 1999; Escera et al., 2002; Frodl-Bauch, Kathman, Möller, & Hegerl, 1997; Giard et al., 1995; Levänen et al., 1996; Paavilainen et al., 1991; Rosburg, 2003; Sysoeva, Takegata, & Näätänen, 2006; Takegata et al., 2001) also showed that the MMNs (MMNms) and their fMRI equivalents (Molholm et al., 2005) for different auditory features are generated in separate loci of the auditory cortex, which necessarily dissociates at least some of the MMN loci from that of the N1 to the (common) standard. Thus, it appears that, in the deviant-minus-standard difference waves, the early negativity may be enhanced by the release-from-refractoriness of the N1 neurons, whereas its later part is fully accounted for by the “genuine” MMN. This is also evident in the study of Tiitinen, May, Reinikainen, and Näätänen (1994), which showed the deviant-minus-standard difference wave as a function of the magnitude of frequency change. With very small changes, the MMN is clearly separate from the N1 enhancement, whereas with an increasing frequency difference the MMN commences earlier, and there is an increasing temporal overlap between the two responses. (For a delineation of the genuine MMN part of the deviant-minus-standard difference, see also Horváth et al., 2008.) In addition, an analogous data pattern was obtained with intensity increments, whereas intensity decrements appeared to elicit a genuine MMN only, because the N1 amplitude decreases with decreasing sound intensities (Näätänen, 1992; Näätänen, Paavilainen, Alho, Reinikainen, & Sams, 1989). Importantly, Tiitinen et al. (1993) also found that the MMNm generator mechanism seemed to be tonotopically organized but differed from the tonotopy of the N1m generator by its anterior locus. Tiitinen et al. (1993) therefore concluded that this MMNm tonotopy “is presumably that of the neuronal population(s) underlying frequency representation in sensory memory,” and that “the separability of this memory tonotopy from the afferent tonotopy of the neuronal population underlying N1m generation is suggested by the clearly different loci of the two responses” (Tiitinen et al., 1993, p. 539). Furthermore, the frontal subcomponent of the MMN is predominantly right-hemispheric (Giard et al., 1990), whereas the frontal N1 subcomponent is bilaterally generated (Giard et al., 1994). In addition, there also appears to exist a parietal MMN subcomponent (Gomot et al., 2000; Levänen et al., 1996), which is probably generated in the posterior parietal cortex (Gomot et al., 2000), where no N1 generator seems to exist (Näätänen & Picton, 1987).

Intracranial animal recordings.
The stimulus-specific adaptation (SSA) found in animal recordings from different levels of the auditory pathway (Moore, 2003; Nelken & Ulanovsky, 2007; Ulanovsky, Las, & Nelken, 2003) has been suggested to fully explain the generation of the MMN. This view has been recently rejected, however. In their single-unit, multi-unit, and evoked local field potential recordings obtained from the primary auditory cortex of the awake rat, von der Behrens et al. (2009) found that both neurons and evoked local field potentials adapted in a stimulus-specific manner. However, no MMN-like responses, with characteristics matching those of the human MMN or those of the MMNs demonstrated in a cat (Csépe, Karmos, & Molnár, 1987; Pincze, Lakatos, Rajkai, Ulbert, & Karmos, 2001, 2002) or monkey (Javitt, Schroeder, Steinschneider, Arezzo, & Vaughan, 1992; Javitt, Steinschneider, Schroeder, Vaughan, & Arezzo, 1994; Javitt, Steinschneider, Schroeder, & Arezzo, 1996), were found. Instead, the researchers concluded that the stimulus-specific adaptation of isolated units in the rat primary auditory cortex profoundly contributed to changes in the P1-N1 complex. Furthermore, in a recent review, Winkler, Denham, and Nelken (2009a) suggested that the SSA-exhibiting neurons observed in all previous experiments lie upstream from those generating the MMN and, further, that the SSA alone cannot fully explain the MMN response.

Pharmacological effects on the MMN and N1. There are also opposite effects of psychopharmacological manipulations on the MMN and N1 that implicate separate mechanisms of these two components. For instance, Umbricht et al. (2000) found that ketamine, an NMDA receptor antagonist, diminished the MMN amplitude but enhanced the N1 amplitude. Moreover, in their study on monkeys, Javitt et al. (1996) observed that the NMDA-receptor antagonist MK-801 had no effect on the N1, whereas the MMN was abolished.

MMN elicitation to sound omission. As further evidence for the N1-MMN separability, of particular importance are results that show MMN elicitation even with no afferent input. This occurs when a stimulus is omitted from a stimulus sequence presented at short constant (Yabe, Tervaniemi, Reinikainen, & Näätänen, 1997; Yabe, Tervaniemi, Sinkkonen, Huotilainen, Ilmoniemi, & Näätänen, 1998; Yabe, Koyama, Kakigi, Gunji, Tervaniemi, Sato, & Kaneko, 2001; Yabe, Matsuoka, Sato, Hiruma, Sutoh, & Koyama, 2005; Yabe, Winkler, Czigler, Koyama, Kakigi, Suto, et al., 2001; Yabe, Sutoh, Matsuoka, Asai, Hiruma, Sato, et al., 2005) or varied SOAs (Oceák et al., 2006), when the second of two closely paced paired tones is occasionally omitted (Tervaniemi, Saarinen, Paavilainen, Danilova, & Näätänen, 1994), or when a stimulus is partially omitted (Winkler & Näätänen, 1993). These results suggest an N1-independent elicitation of the MMN, as no afferent elements could be involved in the generation of the MMN to stimulus omission.

Different developmental time courses of the MMN and N1. The MMN generator process is recordable even in the fetus (Draganova et al., 2005, 2007; Huotilainen et al., 2005), whereas N1 shows a considerably later developmental time course (Csépe, 1995; Pasman, Rotteveel, Maassen, & Visco, 1999; Ponton, Eggermont, Kwong, & Don, 2000; Ponton, Eggermont, Khosla, Kwong, & Don, 2002; Sharma, Kraus, McGee, & Nicol, 1997).

MMN-N1 dissociations in patients. In some patient groups, MMN can be present with no N1. This can be the case for comatose patients (Fischer et al., 1999) or subjects who have cochlear implants (Ponton, Don, et al., 2000).

The MMN with acoustically identical standards and deviants.
The MMN process can be elicited by an auditory phoneme stimulus paired with an occasional incongruent visual stimulus in a sequence of identical auditory stimuli paired with congruent visual stimuli (Möttönen, Krause, Tiippana, & Sams, 2002; Möttönen, Schurman, & Sams, 2004; Sams et al., 1991; Tiippana et al., 2004). This result was even obtained in 5-month-old infants (Kushnerenko, Teinonen, Volein, & Csibra, 2008).

Different sensory or perceptual correlates of the MMN and N1. The memory trace reflected by the MMN corresponds to the feature- and temporally integrated auditory event, whereas the sensory information that is encoded by the N1 generator does not appear to correspond to the subjective contents of perception (Butler, 1972; Parasuraman & Beatty, 1980; Winkler, Tervaniemi, & Näätänen, 1997) but rather to its attention-catching properties (Rinne et al., 2006). Consistent with this, the N1 seems better at indexing detection than discrimination, judging from the result that the N1 amplitude correlated with the detection of the occurrence of a faint signal but did not correlate with its recognition (Parasuraman & Beatty, 1980), whereas the MMN appears to be the best objective index of auditory discrimination currently available (Kraus et al., 1995, 1996; Lang et al., 1990; Näätänen & Alho, 1997; Näätänen et al., 2007). Furthermore, the N1 generator encodes stimulus information over the first 40–50 ms from stimulus onset only; therefore, it is unable to integrate stimulus energy long enough for perceived loudness to emerge (Gage & Roberts, 2000; Scharf, 1978; Scharf & Houtsma, 1986). This, in turn, results in a clear dissociation between the sensory magnitude and the N1 amplitude (Picton, Goodman, & Bryce, 1970; Picton, Woods, & Proulx, 1978; Pratt & Sohmer, 1977). In a similar vein, Woods and Elmasian (1986) observed that the strong attenuation of the N1 amplitude at the beginning of a stimulus block is not directly related to loudness (see also Donald, 1979), but rather to its attention-catching properties or disruptiveness (Campbell, 2005; Campbell et al., 2003, 2005; Rinne et al., 2006; Valtonen et al., 2003). For the same reason, the N1 generator process does not seem to be involved in feature integration. Moreover, in contrast to the traces reflected by N1, those used in MMN elicitation can even encode long-duration auditory stimulus patterns that last for several hundreds of ms (Schröger, Näätänen, & Paavilainen, 1992), although the first 300 ms from stimulus onset seem to be the most accurately represented (Grimm & Schröger, 2005).
Memory Reflected by the MMN and the N1

In this section, we compare the kinds of memories reflected by the MMN and N1 with each other and show that these two responses are associated with very different kinds of sensory-memory information. As described in the introduction, the MMN is traditionally interpreted in terms of a memory-dependent effect. In the literature, the closest correspondence can be found in the so-called echoic memory, a form of auditory sensory memory with perception-like vividness lasting ca. 10 s in young adult participants (Cowan, 1984, 1988; Kallman & Massaro, 1979; Massaro, 1970, 1976). Several studies (Winkler et al., 1992, 1995; Winkler & Näätänen, 1994) show that the subjective contents of the memory involved in the MMN generation indeed correspond to that in perception and sensory memory (for a review, see Näätänen & Winkler, 1999). However, the MMN is not a direct index of sensory-memory traces, as a deviant after a single standard does not elicit the MMN (Cowan et al., 1993), and usually a sequence of 2–3 standards is needed before the MMN can be elicited (Bendixen et al., 2007; Cowan et al., 1993; Winkler, Cowan, Csépe, Czigler, & Näätänen, 1996). Moreover, a very large N1 but no MMN is elicited by the first stimulus in a sequence after a long period of silence (Näätänen et al., 1989; Sams et al., 1985). This is because the elicitation of the MMN is not directly related to the sensory-memory trace of a single sound, but rather to the memory that encodes the regular sensory and higher-order features of a sequence of sounds (Cowan et al., 1993; Winkler, 2007; Winkler, Karmos, et al., 1996; Winkler et al., 2009a). Consequently, rather than forming an index of memory-trace formation, the MMN indexes sensory-memory updating.
For example, when a deviant event suddenly starts to repeat with no intervening standards, it in fact becomes a new standard against which deviants start to elicit the MMN (Winkler, Karmos, et al., 1996; Näätänen & Rinne, 2002; Bendixen et al., 2008). Such data support Näätänen’s (2009) suggestion of the MMN being a universal index of the second of the brain’s two main tasks with regard to environmental information, namely, updating the system of environmental stimulus representations. The first main task of the brain is the initial formation of the stimulus representations. The N1 adaptation reflects the refractoriness of the corresponding feature trace(s), whereas the MMN indicates the presence of feature-integrated stimulus representations that correspond to the subjective contents of perception (Näätänen & Winkler, 1999). On the basis of the afore-reviewed differences between the MMN and N1 responses, we conclude that they are clearly separate and represent different steps in central auditory processing, with the N1 generator process being related to the processing of separate auditory stimulus features. In contrast, the MMN response reflects the representation of inter-sound regularities based on feature- and temporally integrated sensory stimulus information (Näätänen & Winkler, 1999). Consequently, these two responses are associated with very different kinds of sensory-memory information.

Selective Attention Effects on N1: The Separability of the N1 and the Processing Negativity (PN)

According to Näätänen’s (1975) review, the first valid demonstration of “the N1 effect” of selective attention was provided by Hillyard, Hink, Schwent, and Picton (1973). In their selective dichotic-listening task with very short, irregular inter-stimulus intervals (ISIs), the left-ear tones were of a considerably higher pitch than the right-ear tones. In addition, both sequences included occasional, randomly placed, slightly higher tones. The subject’s task was to count these deviants among the standards in the designated ear and to ignore all the input to the opposite ear. Hillyard and his colleagues found that the vertex N1 showed a higher amplitude for the attended than for the ignored stimuli. The authors regarded their effect as an enhancement of the “N1 component” and suggested that it reflected Broadbent’s (1970, 1971) stimulus-set mode of attention.
“A stimulus set preferentially admits all sensory input to an attended channel (stimuli having in common a simple sensory attribute, such as pitch, position in space, receptor surface, or the like) for further perceptual analysis while blocking or attenuating input arriving over irrelevant channels (for example, the unattended ear) at an early stage of processing” (Hillyard et al., 1973, p. 180). The authors stressed the short onset latency of the effect as critical evidence: “The early latency of the attention effects upon N1 (evident at 60–70 ms in most subjects) suggests that the underlying attentional process is a tonically maintained set favoring one ear over the other rather than an active discrimination and recognition of each individual stimulus” (Hillyard et al., 1973, p. 179). Subsequently, by using a considerably longer and constant (800 ms) ISI in an otherwise quite similar experimental condition, Näätänen et al. (1978) found a slow negative shift which they termed the processing negativity (PN). The effect, recorded over the vertex and both left and right auditory cortices, appeared to represent no modulation of any obligatory ERP component but was rather a new component that emerged during selective attention. The peak amplitude of the N1 deflection was not affected. However, the N1 peak was followed by a low-amplitude (1–2 µV) negative displacement of the ERP of the attended standards compared with the unattended standards. Further, this displacement began at 150 ms, during the descending limb of the N1 deflection, and persisted for at least 500 ms. In addition, in their subsequent study, Näätänen, Gaillard, and Mäntysalo (1980) obtained PNs over the temporal areas that were as large in amplitude as those over the vertex, which suggests that at least a part of the PN was generated in the sensory-specific auditory regions. Näätänen and Michie (1979) also proposed that the PN has a frontal generator. Näätänen et al. (1978) suggested that the PN is an endogenous component that was generated by a cerebral mechanism different from that of the N1 component. They also proposed that the N1 effect reported by Hillyard et al. (1973) might have been caused by the PN rather than by an amplification of the generator process of the N1 component. Namely, their considerably shorter ISIs might have shortened the PN latency so that the PN overlapped the N1 component and caused an artificial increase in its measured amplitudes (Hillyard et al., 1973). Subsequently, the existence of the PN was verified by several further studies (Alho, Donauer, Paavilainen, Reinikainen, Sams, & Näätänen, 1987; Alho, Töttölä, Reinikainen, Sams, & Näätänen, 1987; Okita, 1979; Okita, Konishi, & Inamori, 1983; Parasuraman, 1978), and its MEG equivalent was described by Hari et al. (1989). The PN was also observed by Hillyard and his colleagues (Hansen & Hillyard, 1980, 1983, 1984; Hillyard & Hansen, 1986; Hillyard & Kutas, 1983). Nonetheless, Hansen and Hillyard (1980) also suggested that a “genuine” N1 enhancement, too, may have been present in their data. Subsequent studies have indeed shown that, during very strongly focused selective attention, both the PN and the “genuine” enhancement of the N1 component may co-occur, thus supporting Hillyard’s position. Hence, the N1 effect of selective attention cannot be fully explained by the early onset of the PN (Näätänen, 1990, 1992; Näätänen, Schröger, & Alho, 2002). Subsequent studies confirmed the two-component structure of the PN proposed by Näätänen et al. (1978) and Näätänen and Michie (1979), with the sensory-specific auditory-cortex component having a slightly earlier onset than the frontal component. The sensory-specific component was interpreted as being elicited by an on-line comparison between the incoming input and the so-called “attentional trace,” a voluntarily maintained representation of the to-be-attended stimulus (Näätänen, 1982).
It is developed by using fresh sensory-memory data of this stimulus (Donald & Young, 1982; Donald & Nugent, 1986) for tuning it to exactly correspond to its critical features (Näätänen, 1982, 1990). Further, this matching process between the incoming stimulus and the attentional trace, and hence the PN generation, terminates sooner the more the stimulus differs from the to-be-attended one. The PN runs its full course only in the case of a perfect match, with the generating selection process accepting the input to the prepared further-processing stages or for an immediate response (Alho et al., 1987a; Näätänen, 1982, 1990). The frontal component of the PN, in turn, might be related to the maintenance or control processes of the selective-attention state (Näätänen, 1975, 1990, 1992).

Model of Preconscious and Conscious Perceptual Processing in Audition

The afore-reviewed data can be regarded as showing that the N1, MMN, and PN represent separate brain responses, each of which reflects its own auditory processing stages and separate properties of the storage of auditory sensory information. The N1 component is associated with the afferent response to sound onset (transient detection that subserves conscious stimulus perception) and also with feature analysis beyond that accomplished by the lower-level mechanisms (Banai et al., 2005, 2007; Galbraith et al., 1995, 1997; Johnson et al., 2007, 2008; King et al., 2002; Kraus & Nicol, 2005). The MMN, in turn, is elicited by auditory change. In more general terms, it is elicited by the violation of detected auditory regularities, which include the fully integrated auditory sensory information of the stimulus embedded in its sequential context. This violation is usually consciously perceived because of the attention-triggering property of the MMN process (Näätänen, 1990; Winkler, 2007). Finally, the PN is associated with attentional stimulus selection (see Table 1) and is based on the voluntarily maintained memory representation of the critical features of the to-be-attended sound (Alho et al., 1987a; Näätänen, 1982).

Table 1. Auditory ERP Components, Functions Reflected, and Roles of their Generators in Attention

Component   Function                                                       Role in attention
N1          Onset detection and feature encoding                           Conscious stimulus perception
MMN         Sensory-memory updating and change/rule violation detection    Conscious change detection
PN          Template (attentional-trace) matching                          Stimulus selection

The three ERP responses, their MEG and fMRI equivalents, and the related behavioral data constitute the empirical justification of the model presented in Figure 2. This is an updated and considerably elaborated version of that described by Näätänen (1990), and has been developed with the aim of defining the borderline between the automatic and attention-dependent processes in audition and of illustrating the emergence of conscious auditory perception. With a very short latency, sound (S) onset activates the feature-detector neuronal networks that correspond to the different stimulus features, such as the frequency-specific neurons along the afferent pathway. These early stimulus-specific processes mainly occur well before the N1 onset, and generate the auditory brainstem (ABR; Picton, Stapells, & Campbell, 1981; Starr & Don, 1988; Vaughan & Arezzo, 1988) and middle-latency responses (MLR; Picton, Hillyard, Krausz, & Galambos, 1974). Further, even though a large proportion of the N1 neurons are nonspecific (cf. the three N1 components described by Näätänen & Picton, 1987) or relatively nonspecific, because they have wide receptive fields (Woods & Elmasian, 1986), there is also some evidence for the N1 generator containing highly stimulus-specific neuronal populations (Butler, 1968; Näätänen et al., 1988; Picton et al., 1978).
The outputs of the different feature detectors are then automatically integrated in time (the temporal window of integration; TWI) with a duration of approximately 200 ms (Atienza et al., 2003; Näätänen & Winkler, 1999; Nousak, Deacon, Ritter, & Vaughan, 1996; Oceák et al., 2006; Tervaniemi, Saarinen, et al., 1994; Yabe et al., 1997, 1998; Yabe, Koyama, Kakigi, Gunji, Tervaniemi, Sato, & Kaneko, 2001; Yabe, Winkler, Czigler, Koyama, Kakigi, Suto, et al., 2001; Yabe, Matsuoka, Sato, Hiruma, Sutoh, Koyama, et al., 2005; Yabe, Sutoh, Matsuoka, Asai, Hiruma, Sato, et al., 2005) and across the different features (Gomes et al., 1995, 1997; Ritter et al., 2000; Takegata & Morotomi, 1999; Takegata et al., 1999, 2001, 2005; Winkler et al., 2005). For instance, loudness integration continues for 200 ms from stimulus onset, with the outcome of this process determining the loudness, the perceived intensity of the sound (Moore, 1989; Scharf & Houtsma, 1986; Zwicker & Fastl, 1990), which provides an estimate of the duration of the TWI. During the TWI, masking may also occur, with a subsequent stimulus often preventing the accurate perception of the preceding stimulus (Bazana & Stelmack, 2002; Cowan, 1984; Foyle & Watson, 1984; Hawkins & Presson, 1977, 1986; Massaro, 1970). This can also be shown by using the MMN, which is abolished when a masking stimulus follows each stimulus of the oddball paradigm with a very short interval. Therefore, the MMN can also be used for determining the TWI duration (Winkler & Näätänen, 1994; Winkler, Reinikainen, & Näätänen, 1993).

[Figure 2 (block diagram; image not reproduced). Stage labels: Sensory Analysis (0–100 ms); Sensory Memory (100–200 ms); Post-sensory analysis and control (200 ms +). Element labels: S; Feature Detectors; Transient Detectors; Att. Call (N1); EXCITABILITY; TWI; Sensory Memory; CONSC. PERC.; REHEARSAL; Temporary Feature Recognizers (attentional trace); Matching (AC-PN); ATTENTION CONTROL; EXEC. MECHANISMS + LTM.]

Figure 2. A model of conscious and unconscious processes in audition. The sound stimulus is first very rapidly analyzed by the different feature detectors. Thereafter, the outputs from the different feature detectors are integrated in time and with each other in the Temporal Window of Integration. The accumulation of this integrated sensory information in the mechanisms of Sensory Memory that evolves in time provides the sensory data of subjective contents of percepts, i.e., the central sound representation (Näätänen & Winkler, 1999). This central representation becomes consciously experienced, depending on the strength of the attention-call signal elicited by the dynamogenic stimulus features indexed by the N1 amplitude. Further, if some discernible change in auditory stimulation occurs, then this change results in the updating of auditory representations in Sensory Memory, eliciting the auditory-cortex MMN component. This, in turn, activates the frontal-cortex mechanisms generating the frontal MMN component (representing an attention-call signal to auditory change). During selective attention, the Executive Mechanisms use fresh sensory-memory data to set up and tune the Attentional Trace, a temporary template for the rapid selection of the to-be-attended input for further processing or response. This selection mechanism continuously depends on the active maintenance and rehearsal of the aspects of sensory input that very rapidly enable the listener to distinguish the relevant sensory input stream among the concurrent stimulus streams.

The outputs from the TWI process accumulate in the neural populations that subserve sensory memory. Further, the phase of the rapid accumulation of this stimulus-specific information underlies the stimulus perception. This, in turn, becomes conscious if the N1 transient-detector system, activated by the same stimulus, generates a signal (attention call) that is strong enough to exceed some temporally varying threshold. This mainly depends, if stimuli are ignored, on the following: the strength of attentional focus elsewhere, the rise time of the stimulus (Kodera, Hink, Yamada, & Suzuki, 1979; Onishi & Davis, 1968; Ostroff, McDonald, Schneider, & Alain, 2003; Pedersen & Salomon, 1977), and the degree of refractoriness of the neuronal population involved in the generation of this signal (Escera, Alho, Schröger, & Winkler, 2000). The N1 amplitude apparently reflects the magnitude of the sensory “refreshment” of the feature trace involved (Näätänen, 1984). The biological significance of the long duration of these N1 refractoriness patterns might lie in “optimizing” the strength and frequency of the attention-call signals elicited. This transient-detector system (Graham, 1979; Loveless, 1983; MacMillan, 1973;
Newstead & Dennis, 1979; Phillips, 2001; Walter, 1964) is mainly composed of N1 neurons of the non-specific or relatively non-specific type, and probably also includes the neurons that generate the frontal-cortical N1 component (Alcaini, Giard, Echallier, & Pernier, 1995; Giard et al., 1994). Consistent with this notion, Walter (1964) previously suggested that the “vertex potential” notified the brain that something was happening while the specific sensory areas determined what it was (see also Davis & Zerlin, 1966; Gersuni, 1971; Näätänen, 1975). A second major cerebral route to attention switch/conscious perception is provided for violations of the automatic predictions that are based on the regularities extracted from the preceding sequence. As in the case of the route from the transient detectors (N1), the MMN generator process causes an attention switch to the eliciting auditory event when its signal exceeds some momentary threshold. This is mediated by the auditory-cortex MMN process activating the frontal-cortex MMN process. Furthermore, in many cases, a deviant stimulus may, in addition to activating the MMN attention-call mechanisms, also enhance the generator process of the N1 component. This strengthens the attention-call signal triggered by the stimulus onset, thus increasing the probability of the conscious perception of stimulus change (Rinne et al., 2006). In either case, exceeding the threshold results in the conscious perception of the parallel feature- and temporally integrated sensory contents incorporated in the
memory trace (for a description of the parallel processing of features and the integrated stimulus representation, see Ritter et al., 1995). Hence, as with the N1, depending on its strength, this MMN attention-call signal may lead to an attention switch to, and conscious perception of, the auditory deviation or regularity violation, eliciting the P3a (Escera et al., 1998, 2001; Friedman, Cycowicz, & Gaeta, 2001; Squires et al., 1975) or the N2b-P3a responses (Näätänen et al., 1982; for a review, see Näätänen, 1992). It is also possible that even some P3 (Sutton, Braren, Zubin, & John, 1965) and slower positivity are elicited when the stimulus is recognized as a target. In addition, autonomic nervous system (ANS) responses may also be observed (Lyytinen, Blomberg, & Näätänen, 1992; Lyytinen & Näätänen, 1987). This attention switch is also manifested by the transient deteriorations in primary-task performance that accompany the MMN, as already reviewed (Escera et al., 1998). The conscious perception/experience of auditory stimulus representations (perception or rehearsal and imagination) is indicated by the yellow coloring in Figure 2. The conscious perception/awareness of the contents of sensory memory occurs either when one of the attentional-call processes is strong enough to exceed some momentarily varying threshold (Näätänen, 1990, 1992) or when the stimulus features of the to-be-attended stimulus are maintained in the attentional trace. The presence of the attentional trace continuously depends on its conscious, voluntary maintenance by the attentional control mechanisms reflected by the frontal PN component (FR-PN) (Hansen & Hillyard, 1980, 1983, 1984; Näätänen, 1982; Okita et al., 1983). During the lifetime of the attentional trace, each stimulus initiates a comparison process that is reflected by the auditory-cortex PN.
The more discernible the stimulus is from that represented by the trace, the sooner the comparison process terminates. This comparison process runs its full time course only when the input fully matches the stimulus represented by the attentional trace (Alho, Töttölä, Reinikainen, Sams, & Näätänen, 1987; Alho, Donauer, Paavilainen, Reinikainen, Sams, & Näätänen, 1987). This is illustrated by the arrow at the bottom of Figure 2. See also Table 1. Also illustrated in the figure is another type of attention effect (EXCITABILITY) that is channel- rather than stimulus-specific. This is supported by the very early Hillyard type of N1 effect found in the condition in which the subject attends to stimuli presented to the designated ear at a very rapid rate (Hillyard et al., 1973). In this case, this effect expresses “a tonically maintained set rather than an active discrimination and recognition of each individual stimulus” (Hillyard et al., 1973, p. 179). Furthermore, even earlier selective-attention effects of this type were subsequently reported (Hackley, Woldorff, & Hillyard, 1987, 1990; McCallum et al., 1983; Michie et al., 1993; Rinne et al., 2008; Woldorff, Hansen, & Hillyard, 1987; Woldorff, Hackley, & Hillyard, 1991; Woldorff & Hillyard, 1991), supporting the presence of attentional control over the input-channel excitability (exogenous attention effects; EXOG. AE; see Figure 2).
The part of the model in which perception can become conscious closely corresponds to Näätänen and Winkler’s (1999) distinction between the representational and pre-representational systems. According to these authors, the representational system contrasts with the pre-representational system in that the stimulus code (a) is stable, even though it is subject to decay and interference; (b) contains the outcome of complete sensory analysis, has temporal properties, and corresponds to the percept; (c) can be brought into conscious experience by an attentional-call process or subject-initiated attention, imagination, or rehearsal, so that these codes are accessible to top-down operations; and (d) depending on the outcome of (c), can contact the LTM, which may result in the recognition of the stimulus and semantic activation (Massaro, 1976; Posner & Snyder, 1975; Pulvermüller & Shtyrov, 2006; Pulvermüller et al., 2009). The present model is consistent with these suggestions, but it can also accommodate the very early attention effects on auditory processing. Even though these top-down selective-attention effects are manifested peripherally to the borderline between the representational and pre-representational systems, they are nevertheless channel-specific rather than stimulus-specific, which is in agreement with the suggested borderline between the two systems. Finally, the general vigilance state of the organism is also illustrated. The excitability of the Transient Detectors depends on the subject’s state (Eason et al., 1964; Eason & Dudley, 1971; Fruhstorfer & Bergström, 1969; Hermanutz, Cohen, & Sommer, 1981; Näätänen, 1970, 1975; Näätänen & Picton, 1987), but such effects might involve Feature Detectors, too (for reviews, see Näätänen & Picton, 1987; Sokolov et al., 2002). Furthermore, the Transient-Detector activation probably also contributes to the increased vigilance of the subject (Lindsley, 1960).
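The routes to conscious perception described in this section can be summarized as a simple gating rule. The following is only an illustrative sketch, not part of the model's formal specification: the class and function names, the 0-to-1 scaling of the attention-call strengths, and the use of a single shared threshold for both routes are all assumptions introduced here for illustration. The model itself states only that awareness follows when an attention-call signal (from the N1 transient detectors or from the MMN change-detection process) exceeds a momentarily varying threshold, or when the stimulus is selected via the voluntarily maintained attentional trace.

```python
from dataclasses import dataclass


@dataclass
class AuditoryEvent:
    """Hypothetical summary of one sound (arbitrary 0..1 units, assumed here)."""
    n1_call: float        # attention-call strength from the transient detectors (N1 route)
    mmn_call: float       # attention-call strength from change detection (MMN route); 0.0 if no deviance
    matches_trace: bool   # True if the sound matches a voluntarily maintained attentional trace


def reaches_awareness(event: AuditoryEvent, threshold: float) -> bool:
    """Return True if the sensory-memory contents would become conscious.

    Involuntary routes: either attention-call signal exceeds the momentary
    threshold. Voluntary route: the stimulus matches the attentional trace.
    """
    call_exceeds_threshold = max(event.n1_call, event.mmn_call) > threshold
    return call_exceeds_threshold or event.matches_trace
```

Under these assumptions, a weak, non-deviant, unattended sound stays below awareness, whereas the same sound becomes conscious if it is a sufficiently salient deviant or if it belongs to the attended stream. A higher `threshold` would stand for a stronger attentional focus elsewhere.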
Concluding Discussion
In the foregoing, an updated version of Näätänen’s (1990) model of attention and automaticity in central auditory processing, which was developed more than 20 years ago, has been introduced. First, the present model focuses on the dynamics of stimulus perception by incorporating the temporal and feature integration mechanism called the Temporal Window of Integration (TWI). The TWI integrates Feature-Detector outputs that form the neural basis for auditory event perception. Second, the present model acknowledges the very early selective-attention effects reported during the last two decades, starting from the now classic study of Hillyard et al. (1973). Consequently, it is now endowed with mechanisms of general centrifugal sensory excitability control of the Transient-Detector and Feature-Detector systems. These modulate all inputs through these channels in the same way rather than in a stimulus-specific manner. Third, the model also specifies brain events associated with change detection by suggesting that the attention-call signal elicited by auditory deviance specifically originates from the frontal mechanisms of MMN generation, which are triggered by the auditory-cortex MMN generator process with a slightly earlier time course (Rinne et al., 2000; Tse & Penney, 2008). Fourth, the present model also separates the frontal Attention Control mechanisms within the Executive Mechanisms, which can also be commanded by internal attention-call signals that are generated during the automatic processing of auditory input. Fifth, state factors are now also represented in the model. Sixth, and most importantly, the present model explicitly illustrates the stages or aspects of central auditory processing that can be consciously experienced. Consequently, the present revised model can contribute to the reconciliation between the two major competing lines of behavioral and ERP evidence pertinent to the role of attention in
auditory processing. On the one hand, a large body of results suggests automaticity even at the highest levels of central auditory processing (Deutsch & Deutsch, 1963; Holender, 1986; Kahneman & Treisman, 1984; Norman, 1968). On the other hand, a number of more recent studies (Alcaini et al., 1995; Hackley et al., 1987, 1990; McCallum et al., 1983; Woldorff et al., 1991; Woldorff & Hillyard, 1991) point to selective-attention effects even at the peripheral levels of auditory processing. Hence, results that stress high-level automaticity might be, at least partially, accounted for by the powerful automatic attention-switching mechanisms that are controlled by stimulus onsets, offsets, and changes. These transient and change (regularity violation) detectors cause the release of fully analyzed and integrated sensory information from sensory memory to the LTM system. This in turn leads to semantic activation (Escera et al., 2003; Näätänen, 1990, 1992). In this way, these automatic mechanisms could account for the data interpreted in terms of the “break-through of the unattended” that is found in selective-attention experiments (Broadbent, 1982; Kahneman & Treisman, 1984; Moray, 1959; Treisman, 1960) even under the strict control of the attentional focus. The presence of such powerful attention-switching mechanisms serves the vital biological function of securing the rapid conscious evaluation of the significance of the eliciting event and a prompt response to it. It is to be stressed that, for most of the time, far-reaching automaticity of stimulus processing, endowed with powerful mechanisms for switching attention to potentially significant events, is absolutely necessary in the auditory domain, in view of the presence of multiple concurrent auditory (Winkler, 2007; Winkler et al., 2009a) and other sensory-modality input streams. Moreover, the focus of attention is often directed to the visual domain.
Therefore, it is of vital importance that auditory stimulation can alert one to potentially significant events that occur outside the focus of attention. The present model can also account for the experimental demonstrations of very early selective-attention effects on auditory ERPs by postulating centrifugal gain (excitability) control mechanisms through which the Executive Mechanisms can extend the attentional inflow control far down towards the periphery. In contrast to the Attentional-Trace mechanism, this very early selection process does not use specific stimulus representations in input selection but is rather based on selective input-channel facilitation, with the stimulus set described as a tonic set of facilitation of inputs that arrive from a designated ear (Hillyard et al., 1973). Therefore, it appears possible to draw the borderline between representational and non-representational central auditory processing (Näätänen & Winkler, 1999) at the level of sensory-memory representations. More specifically, the borderline can be drawn at the input to this stage, where the memory-trace formation that underlies the emergence of auditory percepts occurs. This borderline also constrains the locus or loci of the possible conscious processes in the central auditory system. Depending on attentional factors, neural events that
subserve such conscious processes may occur at the entry of temporally and feature-integrated feature-detector inputs to the sensory memory. This occurs as a build-up phase of the central auditory representation for perception (Näätänen & Winkler, 1999), but does not occur more peripherally. Furthermore, this borderline is also essential for understanding the relationship between the MMN and the N1. These two responses are probably generated by neural events on opposite sides of this critical borderline, which implies that even the highest afferent mechanisms, reflected by obligatory afferent ERP components such as the N1 and P2, do not encode feature- and temporally integrated stimulus information. Consequently, it appears that the sufficient immediate, direct neural basis of feature-integrated auditory event perception is only formed at the level of sensory-memory mechanisms, where the MMN is generated as an expression of memory updating and associated alarm functions. Finally, it might also be possible to develop a more general information-processing model by using these principles to explain the interplay between the voluntary (top-down) and involuntary (bottom-up) factors that compete for the moment-to-moment control of the direction of attention. N1-type responses to stimulus onset are found in other modalities, including the visual (Vogel & Luck, 2000) and somatosensory (Kekoni et al., 1997) modalities. In addition, recent studies conducted in the visual modality (Czigler, 2007; Pazo-Alvarez, Cadaveira, & Amenedo, 2003) demonstrated the presence of a visual MMN (vMMN).
Like the auditory MMN, this vMMN is generated in the modality-specific cortex (Astikainen, Ruusuvirta, Wikgren, & Korhonen, 2004; Astikainen & Hietanen, 2009; Kremlacek et al., 2004, 2006; Pazo-Alvarez, Amenedo, & Cadaveira, 2004), is memory-dependent (Astikainen, Lillstrang, & Ruusuvirta, 2008; Czigler, Balazs, & Winkler, 2002; Czigler, Winkler, Pató, Várnagy, Weisz, & Balázs, 2006; Pazo-Alvarez et al., 2004), and is independent of attention (Müller et al., 2010; Kremlacek et al., 2006). On the basis of their vMMN data, Kremlacek et al. (2006) concluded that the sensory information extracted by the magnocellular system undergoes processing capable of detecting changes in sequences of unattended peripheral motion stimuli. Moreover, the presence of the MMN with a sensory-specific topography has also been demonstrated in the somatosensory (Kekoni et al., 1997) and olfactory (Krauel et al., 1999) modalities, and also for integrated audio–visual stimuli (Widman et al., 2004; Winkler et al., 2009c). Furthermore, very recent vMMN data (Müller et al., 2010; Winkler et al., 2009c) showed, analogously to the auditory modality, the occurrence of attention-independent feature integration in visual object formation. Hence, these results suggest that, by forming object representations early on, our perceptual system sets the stage for higher cognitive processes and, generally, for successful adaptation to the ever-changing environment.
REFERENCES Alain, C., Achim, A., & Woods, D. L. (1999). Separate memory-related processing for auditory frequency and patterns. Psychophysiology, 36, 737–744. Alain, C., Woods, D. L., & Knight, R. T. (1998). A distributed cortical network for auditory sensory memory in humans. Brain Research, 812, 23–37.
Alcaini, M., Giard, M. H., Echallier, J. F., & Pernier, J. (1995). Selective auditory attention effects in tonotopically organized cortical areas: A topographic ERP study. Human Brain Mapping, 2, 159–169. Alcaini, M., Giard, M. H., Thevenet, M., & Pernier, J. (1994). Two separate frontal components in the N1 wave of the human auditory evoked response. Psychophysiology, 31, 611–615.
Alho, K. (1995). Cerebral generators of mismatch negativity (MMN) and its magnetic counterpart (MMNm) elicited by sound change. Ear and Hearing, 16, 38–51. Alho, K., Donauer, N., Paavilainen, P., Reinikainen, K., Sams, M., & Näätänen, R. (1987). Stimulus selection during auditory spatial attention as expressed by event-related potentials. Biological Psychology, 24, 153–162. Alho, K., Huotilainen, M., Tiitinen, H., Ilmoniemi, R. J., Knuutila, J., & Näätänen, R. (1993). Memory-related processing of complex sound patterns in human auditory cortex: A MEG study. NeuroReport, 4, 391–394. Alho, K., Tervaniemi, M., Huotilainen, M., Lavikainen, J., Tiitinen, H., Ilmoniemi, R. J., et al. (1996). Processing of complex sounds in the human auditory cortex as revealed by magnetic brain responses. Psychophysiology, 33, 369–375. Alho, K., Töttölä, K., Reinikainen, K., Sams, M., & Näätänen, R. (1987). Brain mechanism of selective listening reflected by event-related potentials. Electroencephalography and Clinical Neurophysiology, 68, 458–470. Alho, K., Winkler, I., Escera, C., Huotilainen, M., Virtanen, J., Jääskeläinen, et al. (1998). Processing of novel sounds and frequency changes in the human auditory cortex: Magnetoencephalographic recordings. Psychophysiology, 35, 211–224. Alho, K., Woods, D. L., Algazi, A., Knight, R. T., & Näätänen, R. (1994). Lesions of frontal cortex diminish the auditory mismatch negativity. Electroencephalography and Clinical Neurophysiology, 82, 356–368. Astikainen, P., & Hietanen, J. K. (2009). Event-related potentials to task-irrelevant changes in facial expressions. Behavioural and Brain Functions, 5, 30. doi:10.1186/1744-9081-5-30. Astikainen, P., Lillstrang, E., & Ruusuvirta, T. (2008). Visual mismatch negativity for changes in orientation – sensory memory-dependent response. European Journal of Neuroscience, 28, 2319–2324.
Astikainen, P., Ruusuvirta, T., Wikgren, J., & Korhonen, T. (2004). The human brain processes visual changes that are not cued by attended auditory stimulation. Neuroscience Letters, 368, 231–234. Atienza, M., Cantero, J. L., Grau, C., Gomez, C., Dominguez-Marin, E., & Escera, C. (2003). Effects of temporal encoding on auditory object formation: A mismatch negativity study. Cognitive Brain Research, 16, 359–371. Banai, K., Abrams, D., & Kraus, N. (2007). Sensory-based learning disability: Insights from brainstem processing of speech sounds. International Journal of Audiology, 46, 524–532. Banai, K., Nicol, T., Zecker, S. G., & Kraus, N. (2005). Brainstem timing: Implications for cortical processing and literacy. Journal of Neuroscience, 25, 9850–9857. Bazana, P. G., & Stelmack, R. M. (2002). Intelligence and information processing during an auditory discrimination task with backward masking: An event-related potential analysis. Journal of Personality and Social Psychology, 4, 998–1008. Bendixen, A., Prinz, W., Horváth, J., Trujillo-Barreto, N. J., & Schröger, E. (2008). Rapid extraction of auditory feature contingencies. NeuroImage, 41, 1111–1119. Bendixen, A., Roeber, U., & Schröger, E. (2007). Regularity extraction and application in dynamic auditory stimulus sequences. Journal of Cognitive Neuroscience, 19, 1664–1677. Bendixen, A., & Schröger, E. (2008). Memory trace formation for abstract auditory features and its consequences in different attentional contexts. Biological Psychology, 78, 231–241. Bendixen, A., Schröger, E., & Winkler, I. (2009). I heard that coming: Event-related potential evidence for stimulus-driven prediction in the auditory system. The Journal of Neuroscience, 29, 8447–8451. Bonte, M., Mitterer, H., Zellagui, N., Poelmans, H., & Blomert, L. (2005). Auditory cortical tuning to statistical regularities in phonology. Clinical Neurophysiology, 116, 2765–2774. Böttcher-Gandor, C., & Ullsperger, P. (1992).
Mismatch negativity in event-related potentials of auditory stimuli as a function of varying interstimulus interval. Psychophysiology, 29, 546–550. Broadbent, D. E. (1970). Stimulus set and response set: Two kinds of selective attention. In D. I. Mostofsky (Ed.), Attention: Contemporary theory and analysis (pp. 51–60). Appleton-Century-Crofts, New York. Broadbent, D. E. (1971). Decision and stress. Academic Press, London. Broadbent, D. E. (1982). Task combination and selective intake of information. Acta Psychologica, 50, 253–290.
Butler, R. A. (1968). Effect of changes in stimulus frequency and intensity on habituation of the human vertex potential. Journal of the Acoustical Society of America, 44, 945–950. Butler, R. A. (1972). The auditory evoked response to stimuli producing periodicity pitch. Psychophysiology, 9, 233–237. Campbell, T. A. (2005). The cognitive neuroscience of auditory distraction. Trends in Cognitive Sciences, 9, 3–5. Campbell, T., Winkler, I., & Kujala, T. (2005). Disruption of immediate memory and brain processes: An auditory ERP protocol. Brain Research Protocols, 14, 77–86. Campbell, T., Winkler, I., & Kujala, T. (2007). N1 and the mismatch negativity are spatiotemporally distinct ERP components: Disruption of immediate memory by auditory distraction can be related to N1. Psychophysiology, 44, 530–540. Campbell, T., Winkler, I., Kujala, T., & Näätänen, R. (2003). The N1 hypothesis and irrelevant sound: Evidence from token set size effects. Cognitive Brain Research, 18, 39–47. Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., & Näätänen, R. (1998). Development of language-specific phoneme representations in the infant brain. Nature Neuroscience, 1, 351–353. Cheour, M., Ceponiene, R., Leppänen, P., Alho, K., Kujala, T., Renlund, M., et al. (2002). The auditory sensory memory trace decays rapidly in newborns. Scandinavian Journal of Psychology, 43, 33–39. Cooper, R. J., Todd, J., McGill, K., & Michie, P. T. (2006). Auditory sensory memory and the aging brain: A mismatch negativity study. Neurobiology of Aging, 27, 752–762. Cowan, N. (1984). On short and long auditory stores. Psychological Bulletin, 96, 341–370. Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163–191. Cowan, N., Winkler, I., Teder, W., & Näätänen, R. (1993).
Memory prerequisites of the mismatch negativity in the auditory event-related potential (ERP). Journal of Experimental Psychology: Human Perception and Performance, 19, 909–921. Csépe, V. (1995). On the origin and development of the mismatch negativity. Ear and Hearing, 16, 91–104. Csépe, V., Karmos, G., & Molnár, M. (1987). Evoked potential correlates of stimulus deviance during wakefulness and sleep in cat: Animal model of mismatch negativity. Electroencephalography and Clinical Neurophysiology, 66, 571–578. Csépe, V., Pantev, C., Hoke, M., Hampson, S., & Ross, B. (1992). Evoked magnetic responses of the human auditory cortex to minor pitch changes: Localization of the mismatch field. Electroencephalography and Clinical Neurophysiology, 84, 538–548. Czigler, I. (2007). Visual mismatch negativity. Violation of nonattended environmental regularities. Journal of Psychophysiology, 21, 224–230. Czigler, I., Balazs, L., & Winkler, I. (2002). Memory-based detection of task-irrelevant visual changes. Psychophysiology, 39, 869–873. Czigler, I., Winkler, I., Pató, L., Várnagy, A., Weisz, J., & Balázs, L. (2006). Visual temporal window of integration as revealed by the visual mismatch negativity event-related potential to stimulus omission. Brain Research, 1104, 129–140. Davis, H., & Zerlin, S. (1966). Acoustic relations of the human vertex potential. Journal of the Acoustical Society of America, 39, 109–116. Deacon, D., Nousak, J. M., Pilotti, M., Ritter, W., & Yang, C.-M. (1998). Automatic change detection: Does the auditory system use representations of individual stimulus features or gestalts? Psychophysiology, 35, 413–419. Dehaene-Lambertz, G. (1997). Electrophysiological correlates of categorical phoneme perception in adults. NeuroReport, 8, 919–924. Dehaene-Lambertz, G., Dupoux, E., & Gout, A. (2000). Electrophysiological correlates of phonological processing: A cross-linguistic study. Journal of Cognitive Neuroscience, 12, 635–647. Deouell, L. Y. (2007).
The frontal generator of the mismatch negativity revisited. Journal of Psychophysiology, 21, 188–203. Deutsch, J. A., & Deutsch, D. (1963). Attention: Some theoretical considerations. Psychological Review, 70, 80–90. Donald, M. W. (1979). Limits on current theories of transient evoked potentials. In J. Desmedt (Ed.), Cognitive components in cerebral event-related potentials and selective attention. Progress in Clinical Neurophysiology (Vol. 6, pp. 187–199). Basel, Switzerland: Karger. Donald, M. W., & Nugent, P. M. (1986). Attentional tuning of a spatially specific auditory component. In W. C. McCallum, R. Zappoli, & F.
Denoth (Eds.), Cerebral psychophysiology: Studies in event-related potentials (pp. 38–42; Suppl 38 Electroencephalography and Clinical Neurophysiology). Amsterdam: Elsevier. Donald, M. W., & Young, M. J. (1982). The time course of selective neural tuning in auditory attention. Experimental Brain Research, 46, 357–367. Draganova, R., Eswaran, H., Murphy, P., Huotilainen, M., Lowery, C., & Preissl, H. (2005). Sound frequency change detection in fetuses and newborns, a magnetoencephalographic study. NeuroImage, 28, 354–361. Draganova, R., Eswaran, H., Murphy, P., Lowery, C., & Preissl, H. (2007). Serial magnetoencephalographic study of fetal and newborn auditory discriminative evoked responses. Early Human Development, 83, 199–207. Eason, R. G., Aiken, L. R. Jr., White, C. T., & Lichtenstein, M. (1964). Activation and behavior: II. Visually evoked cortical potentials in man as indicants of activation level. Perceptual and Motor Skills, 19, 875–895. Eason, R. G., & Dudley, L. M. (1971). Physiological and behavioral indicants of activation. Psychophysiology, 7, 223–232. Eckstein, K., & Friederici, A. D. (2006). It’s early: Event-related potential evidence for initial interaction of syntax and prosody in speech comprehension. Journal of Cognitive Neuroscience, 18, 1696–1711. Escera, C., Alho, K., Schröger, E., & Winkler, I. (2000). Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiology & Neuro-Otology, 5, 151–166. Escera, C., Alho, K., Winkler, I., & Näätänen, R. (1998). Neural mechanisms of involuntary attention switching to novelty and change in the acoustic environment. Journal of Cognitive Neuroscience, 10, 590–604. Escera, C., Corral, M.-J., & Yago, H. (2002). An electrophysiological and behavioral investigation of involuntary attention towards auditory frequency, duration and intensity changes. Cognitive Brain Research, 14, 325–332. Escera, C., Yago, E., & Alho, K. (2001).
Electrical responses reveal the temporal dynamics of brain events during involuntary attention switching. European Journal of Neuroscience, 14, 877–883. Escera, C., Yago, E., Corral, M.-J., Corbera, S., & Nuñez, M. I. (2003). Attention capture by auditory significant stimuli: Semantic analysis follows attention switching. European Journal of Neuroscience, 18, 2408–2412. Fischer, C., Morlet, D., Bouchet, P., Luaute, J., Jourdan, C., & Salord, F. (1999). Mismatch negativity and late auditory evoked potentials in comatose patients. Clinical Neurophysiology, 110, 1601–1610. Ford, J. M., Roth, W. T., & Kopell, B. S. (1976a). Auditory evoked potentials to unpredictable shifts in pitch. Psychophysiology, 13, 32–39. Ford, J. M., Roth, W. T., & Kopell, B. S. (1976b). Attention effects on auditory evoked potentials to infrequent events. Biological Psychology, 4, 65–77. Foyle, D. C., & Watson, C. S. (1984). Stimulus-based versus performance-based measurement of auditory backward recognition masking. Perception and Psychophysics, 36, 515–522. Friederici, A. D. (1995). The time course of syntactic activation during language processing: A model based on neuropsychological and neurophysiological data. Brain and Language, 50, 259–281. Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6, 78–84. Friederici, A. D. (2004). Event-related brain potential studies in language. Current Neurology and Neuroscience Reports, 4, 466–470. Friederici, A. D., Gunter, T. C., Hahne, A., & Mauth, K. (2004). The relative timing of syntactic and semantic processes in sentence comprehension. NeuroReport, 15, 165–169. Friederici, A. D., Hahne, A., & Mecklinger, A. (1996). The temporal structure of syntactic parsing: Early and late event-related brain potential effects elicited by syntactic anomalies. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1219–1248. Friederici, A. D., Pfeifer, E., & Hahne, A. (1993).
Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1, 183–192. Friedman, D., Cycowicz, Y. M., & Gaeta, H. (2001). The novelty P3: An event-related brain potential (ERP) sign of the brain’s evaluation of novelty. Neuroscience and Biobehavioral Reviews, 25, 355–373.
Friston, K., & Kiebel, S. (2009). Cortical circuits for perceptual inference. Neural Networks, 22, 1093–1104. Frodl-Bauch, T., Kathmann, N., Möller, H. J., & Hegerl, U. (1997). Dipole localization and test-retest reliability of frequency and duration mismatch negativity generator processes. Brain Topography, 10, 3–8. Fruhstorfer, H., & Bergström, R. M. (1969). Human vigilance and auditory evoked responses. Electroencephalography & Clinical Neurophysiology, 27, 346–355. Fujioka, T., Trainor, L. J., Ross, B., Kakigi, R., & Pantev, C. (2004). Musical training enhances automatic encoding of melodic contour and interval structure. Journal of Cognitive Neuroscience, 16, 1010–1021. Fuster, J. (1989). The prefrontal cortex: Anatomy, physiology, and neuropsychology of the frontal lobe. New York: Raven Press. Gage, N. M., & Roberts, T. P. (2000). Temporal integration: Reflections in the M100 of the auditory evoked field. NeuroReport, 11, 2723–2726. Galbraith, G. C., Arbagey, P. W., Branski, R., Comerci, N., & Rector, P. M. (1995). Intelligible speech encoded in the human brain stem frequency-following response. NeuroReport, 6, 2363–2367. Galbraith, G. C., Jhaveri, S. P., & Kuo, J. (1997). Speech-evoked brainstem frequency-following responses during verbal transformations due to word repetition. Electroencephalography & Clinical Neurophysiology, 102, 46–53. Garrido, M. I., Friston, K. J., Kiebel, S. J., Stephan, K. E., Baldeweg, T., & Kilner, J. M. (2008). The functional anatomy of the MMN: A DCM study of the roving paradigm. NeuroImage, 42, 936–944. Garrido, M. I., Kilner, J. M., Kiebel, S. J., & Friston, K. J. (2009a). Dynamic causal modeling of the response to frequency deviants. Journal of Neurophysiology, 101, 2620–2631. Garrido, M., Kilner, J. M., Stephan, K. E., & Friston, K. J. (2009b). The mismatch negativity: A review of underlying mechanisms. Clinical Neurophysiology, 120, 453–463. Gersuni, G. V. (1971).
Temporal organization of the auditory function. In G. V. Gersuni (Ed.), Sensory processes at the neuronal and behavioral levels (pp. 85–114). New York: Academic Press. Giard, M. H., Lavikainen, J., Reinikainen, K., Perrin, F., Bertrand, O., Pernier, J., & Näätänen, R. (1995). Separate representation of stimulus frequency, intensity, and duration in auditory sensory memory: An event-related potential and dipole-model analysis. Journal of Cognitive Neuroscience, 7, 133–143. Giard, M. H., Perrin, F., Echallier, J. F., Thévenet, M., Froment, J. C., & Pernier, J. (1994). Dissociation of temporal and frontal components in the human auditory N1 wave: A scalp current density and dipole model analysis. Electroencephalography and Clinical Neurophysiology, 92, 238–252. Giard, M. H., Perrin, F., Pernier, J., & Bouchet, P. (1990). Brain generators implicated in processing of auditory stimulus deviance: A topographic event-related potential study. Psychophysiology, 27, 627–640. Glass, E., Sachse, S., & von Suchodoletz, W. (2008a). Development of auditory sensory memory from 2 to 6 years: An MMN study. Journal of Neural Transmission, 115, 1221–1229. Glass, E., Sachse, S., & von Suchodoletz, W. (2008b). Auditory sensory memory in 2-year-old children: An event-related potential study. NeuroReport, 19, 569–573. Gomes, H., Bernstein, R., Ritter, W., Vaughan Jr., H. G., & Miller, J. (1997). Storage of feature conjunctions in transient auditory memory. Psychophysiology, 34, 712–716. Gomes, H., Ritter, W., & Vaughan Jr., H. G. (1995). The nature of preattentive storage in the auditory system. Journal of Cognitive Neuroscience, 7, 81–94. Gomes, H., Sussman, E., Ritter, W., Kurtzberg, D., Cowan, N., & Vaughan, H. G. Jr. (1999). Electrophysiological evidence of developmental changes in the duration of auditory sensory memory. Developmental Psychology, 35, 294–302. Gomot, M., Giard, M.-H., Roux, S., Barthelemy, C., & Bruneau, N. (2000).
Maturation of frontal and temporal components of mismatch negativity (MMN) in children. NeuroReport, 11, 3109–3112. Goodin, D. S., Squires, K. C., Henderson, B. H., & Starr, A. (1978). An early event-related cortical potential. Psychophysiology, 15, 360–365. Graham, F. K. (1979). Distinguishing among orienting, defence and startle reflexes. In H. D. Kimmel, E. H. van Olst, & J. F. Orlebeke
Auditory processing that leads to conscious perception (Eds.), The orienting reflex in humans (pp. 137–167). Hillsdale, NJ: Erlbaum. Grau, C., Escera, C., Yago, E., & Polo, D. (1998). Mismatch negativity and auditory memory evaluation: A new faster paradigm. NeuroReport, 9, 2451–2456. Grimm, S., & Schro¨ger, E. (2005). Preattentive and attentive processing of temporal and frequency characteristics within long sounds. Cognitive Brain Research, 25, 711–721. Grimm, S., & Schro¨ger, E. (2007). The processing of frequency deviations within sounds: Evidence for the predictive nature of the mismatch negativity (MMN) system. Restorative Neurology and Neuroscience, 25, 241–249. Gunter, T. C., Friederici, A. D., & Hahne, A. (1999). Brain responses during sentence reading: Visual input affects central processing. NeuroReport, 10, 3175–3178. Hackley, S. A., Woldorff, M., & Hillyard, S. A. (1987). Combined use of microreflexes and event-related brain potentials as measures of auditory selective attention. Psychophysiology, 24, 632–647. Hackley, S. A., Woldorff, M., & Hillyard, S. A. (1990). Cross-model selective attention effects on retinal, myogenic, brainstem, and cerebral evoked potentials. Psychophysiology, 27, 195–208. Hahne, A., & Friederici, A. D. (1999). Electrophysiological evidence for two steps in syntactic analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience, 11, 194–205. Hansen, J. C., & Hillyard, S. A. (1980). Endogenous brain potentials associated with selective auditory attention. Electroencephalography & Clinical Neurophysiology, 49, 277–290. Hansen, J. C., & Hillyard, S. A. (1983). Selective attention to multidimensional auditory stimuli in man. Journal of Experimental Psychology: Human Perception and Performance, 9, 1–9. Hansen, J. C., & Hillyard, S. A. (1984). Effects of stimulation rate and attribute cuing on event-related potentials during selective auditory attention. Psychophysiology, 21, 394–405. 
Hari, R., Hämäläinen, M., Ilmoniemi, R. J., Kaukoranta, E., Reinikainen, K., Salminen, J., et al. (1984). Responses of the primary auditory cortex to pitch changes in a sequence of tone pips: Neuromagnetic recordings in man. Neuroscience Letters, 50, 127–132.
Hari, R., Hämäläinen, M., Kaukoranta, E., Mäkelä, J., Joutsiniemi, S. L., & Tiihonen, J. (1989). Selective listening modifies activity of the human auditory cortex. Experimental Brain Research, 74, 463–470.
Haroush, K., Hochstein, S., & Deouell, L. Y. (2010). Momentary fluctuations in allocation of attention: Cross-modal effects of visual task load on auditory discrimination. Journal of Cognitive Neuroscience, 22, 1440–1451.
Hasting, A. S., Kotz, S. A., & Friederici, A. D. (2007). Setting the stage for automatic syntax processing: The mismatch negativity as an indicator of syntactic priming. Journal of Cognitive Neuroscience, 19, 386–400.
Hawkins, H. L., & Presson, J. C. (1977). Masking and perceptual selectivity in auditory recognition. In S. Dornic (Ed.), Attention and performance VI (pp. 195–211). Hillsdale, NJ: Erlbaum.
Hawkins, H. L., & Presson, J. C. (1986). Auditory information processing. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of Perception and Human Performance (Vol. 2, pp. 261–264). New York: Wiley.
Hermanutz, M., Cohen, R., & Sommer, W. (1981). The effects of serial order in long sequences of auditory stimuli on event-related potentials. Psychophysiology, 18, 415–423.
Hillyard, S. A., & Hansen, J. C. (1986). Attention: Electrophysiological approaches. In M. G. H. Coles, E. Donchin, & S. W. Porges (Eds.), Psychophysiology: Systems, processes, and applications (pp. 227–243). New York: The Guilford Press.
Hillyard, S. A., Hink, R. F., Schwent, V. L., & Picton, T. W. (1973). Electrical signs of selective attention in the human brain. Science, 182, 177–180.
Hillyard, S. A., & Kutas, M. (1983). Electrophysiology of cognitive processing. Annual Review of Psychology, 34, 33–61.
Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: A survey and appraisal. Behavioral and Brain Sciences, 9, 1–66.
Horváth, J., Czigler, I., Jacobsen, T., Maess, B., Schröger, E., & Winkler, I. (2008). MMN or no MMN: No magnitude of deviance effect on the MMN amplitude. Psychophysiology, 45, 60–69.
Huotilainen, M., Kujala, A., Hotakainen, M., Parkkonen, L., Taulu, S., Simola, J., et al. (2005). Short-term memory functions of the human fetus recorded with magnetoencephalography. NeuroReport, 16, 81–84.
Jääskeläinen, I. P., Ahveninen, J., Belliveau, J. W., Raij, T., & Sams, M. (2007). Short-term plasticity in auditory cognition. Trends in Neurosciences, 30, 653–661.
Jääskeläinen, I. P., Ahveninen, J., Bonmassar, G., Dale, A. M., Ilmoniemi, R. J., Levänen, S., et al. (2004). Human posterior auditory cortex gates novel sounds to consciousness. Proceedings of the National Academy of Sciences, 101, 6809–6814.
Jääskeläinen, I. P., Alho, K., Escera, C., Winkler, I., Sillanaukee, P., & Näätänen, R. (1996b). Effects of ethanol and auditory distraction on forced choice reaction time. Alcohol, 13, 153–156.
Jääskeläinen, I. P., Pekkonen, E., Hirvonen, J., Sillanaukee, P., & Näätänen, R. (1996a). Mismatch negativity subcomponents and ethyl alcohol. Biological Psychology, 43, 13–25.
Jääskeläinen, I. P., Varonen, R., Näätänen, R., & Pekkonen, E. (1999). Decay of cortical pre-attentive sound discrimination in middle-age. NeuroReport, 10, 123–126.
Jacobsen, T., & Schröger, E. (2001). Is there pre-attentive memory-based comparison of pitch? Psychophysiology, 38, 723–727.
Jacobsen, T., Schröger, E., Horenkamp, T., & Winkler, I. (2003). Mismatch negativity to pitch change: Varied stimulus proportions in controlling effects of neural refractoriness on human auditory event related brain potentials. Neuroscience Letters, 344, 79–82.
Javitt, D. C., Schroeder, C. E., Steinschneider, M., Arezzo, J. C., & Vaughan Jr., H. G. (1992). Demonstration of mismatch negativity in the monkey. Electroencephalography and Clinical Neurophysiology, 83, 87–90.
Javitt, D. C., Steinschneider, M., Schroeder, C. E., & Arezzo, J. C. (1996). Role of cortical N-methyl-D-aspartate receptors in auditory sensory memory and mismatch negativity generation: Implications for schizophrenia. Proceedings of the National Academy of Sciences of the USA, 93, 11962–11967.
Javitt, D. C., Steinschneider, M., Schroeder, C. E., Vaughan Jr., H. G., & Arezzo, J. C. (1994). Intracortical mechanisms of mismatch negativity (MMN) generation. Brain Research, 667, 192–200.
Johnson, K. L., Nicol, T., Zecker, S. G., Bradlow, A. R., Skoe, E., & Kraus, N. (2008). Brainstem encoding of voiced consonant-vowel stop syllables. Clinical Neurophysiology, 119, 2613–2635.
Johnson, K. L., Nicol, T., Zecker, S. G., & Kraus, N. (2007). Auditory brainstem correlates of perceptual timing deficits. Journal of Cognitive Neuroscience, 19, 376–385.
Kahneman, D., & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman & D. R. Davies (Eds.), Varieties of attention (pp. 29–61). New York: Academic Press.
Kaipio, M.-L., Cheour, M., Ceponiene, R., Öhman, J., Alku, P., & Näätänen, R. (2000). Increased distractibility in closed head injury as revealed by event-related potentials. NeuroReport, 11, 1463–1468.
Kallman, H. J., & Massaro, D. W. (1979). Similarity effects in backward recognition masking. Journal of Experimental Psychology: Human Perception and Performance, 5, 110–128.
Kekoni, J., Hämäläinen, H., Saarinen, M., Gröhn, J., Reinikainen, K., Lehtokoski, A., & Näätänen, R. (1997). Rate effect and mismatch responses in the somatosensory system: ERP-recordings in humans. Biological Psychology, 46, 125–142.
King, C., Warrier, C. M., Hayes, E., & Kraus, N. (2002). Deficits in auditory brainstem pathway encoding of speech sounds in children with learning problems. Neuroscience Letters, 319, 111–115.
Knight, R. T. (1991). Evoked potential studies of attention capacity in human frontal lobe lesions. In H. Levin, H. Eisenberg, & F. Benton (Eds.), Frontal lobe function and dysfunction (pp. 139–153). Oxford: Oxford University Press.
Kodera, K., Hink, R. F., Yamada, O., & Suzuki, J. (1979). Effects of rise time on simultaneously recorded auditory-evoked potentials from the early, middle and late ranges. Audiology, 18, 395–402.
Koelsch, S., Gunter, T. C., Schröger, E., & Friederici, A. D. (2003). Processing tonal modulations: An ERP study. Journal of Cognitive Neuroscience, 15, 1149–1159.
Koelsch, S., Grossmann, T., Gunter, T. C., Hahne, A., Schröger, E., & Friederici, A. D. (2003). Children processing music: Electric brain responses reveal musical competence and gender differences. Journal of Cognitive Neuroscience, 15, 683–693.
Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends in Cognitive Sciences, 9, 578–584.
Korzyukov, O., Alho, K., Kujala, A., Gumenyuk, V., Ilmoniemi, R. J., Virtanen, J., et al. (1999). Electromagnetic responses of the human auditory cortex generated by sensory-memory based processing of tone-frequency changes. Neuroscience Letters, 276, 169–172.
Krauel, K., Schott, P., Sojka, B., Pause, B. M., & Ferstl, R. (1999). Is there a mismatch negativity analogue in the olfactory event-related potential? Journal of Psychophysiology, 13, 49–55.
Kraus, N., McGee, T. J., Carrell, T. D., King, C., Tremblay, K., & Nicol, T. G. (1995). Central auditory plasticity associated with speech discrimination training. Journal of Cognitive Neuroscience, 7, 25–32.
Kraus, N., McGee, T. J., Carrell, T. D., Zecker, S. G., Nicol, T. G., & Koch, D. B. (1996). Auditory neurophysiologic response and discrimination deficits in children with learning problems. Science, 273, 971–973.
Kraus, N., & Nicol, T. G. (2005). Brainstem origins for cortical “what” and “where” pathways in the auditory system. Trends in Neurosciences, 28, 176–181.
Kremlacek, J., Kuba, M., Chlubnova, J., & Kubova, Z. (2004). Effect of stimulus localization on motion-onset VEP. Vision Research, 44, 2989–3000.
Kremlacek, J., Kuba, M., Kubova, Z., & Langrova, J. (2006). Visual mismatch negativity elicited by magnocellular system activation. Vision Research, 46, 485–490.
Kropotov, J. D., Näätänen, R., Sevostianov, A. V., Alho, K., Reinikainen, K., & Kropotova, O. V. (1995). Mismatch negativity to auditory stimulus change recorded directly from the human temporal cortex. Psychophysiology, 32, 418–422.
Kujala, T., Tervaniemi, M., & Schröger, E. (2007). The mismatch negativity in cognitive and clinical neuroscience: Theoretical and methodological considerations. Biological Psychology, 74, 1–19.
Kushnerenko, E., Teinonen, T., Volein, A., & Csibra, G. (2008). Electrophysiological evidence of illusory audiovisual speech percept in human infants. Proceedings of the National Academy of Sciences USA, 105, 11442–11445.
Ladinig, O., Honing, H., Háden, G., & Winkler, I. (2009). Probing attentive and pre-attentive emergent meter in adult listeners without extensive music training. Music Perception, 26, 377–386.
Lang, H. A., Nyrke, T., Ek, M., Aaltonen, O., Raimo, I., & Näätänen, R. (1990). Pitch discrimination performance and auditory event-related potentials. In C. H. M. Brunia, A. W. K. Gaillard, A. Kok, G. Mulder, & M. N. Verbaten (Eds.), Psychophysiological brain research (Vol. 1, pp. 294–298). Tilburg, The Netherlands: Tilburg University Press.
Leino, S., Brattico, E., Tervaniemi, M., & Vuust, P. (2007). Representation of harmony rules in the human brain: Further evidence from event-related potentials. Brain Research, 1142, 169–177.
Levänen, S., Ahonen, A., Hari, R., McEvoy, L., & Sams, M. (1996). Deviant auditory stimuli activate human left and right auditory cortex differently. Cerebral Cortex, 6, 288–296.
Levänen, S., Hari, R., McEvoy, L., & Sams, M. (1993). Responses of the human auditory cortex to changes in one versus two stimulus features. Experimental Brain Research, 97, 177–183.
Lindsley, D. B. (1960). Attention, consciousness, sleep and wakefulness. In J. Field, et al. (Eds.), Handbook of Physiology, Neurophysiology (Vol. III, Section 1, pp. 1553–1593). Washington, D.C.: American Physiological Society.
Loui, P., Grent-t-Jong, T., Torpey, D., & Woldorff, M. (2005). Effects of attention on the neural processing of harmonic syntax in Western music. Cognitive Brain Research, 25, 678–687.
Loveless, N. (1983). The orienting response and evoked potentials in man. In D. Siddle (Ed.), Orienting and habituation: Perspectives in human research (pp. 71–108). New York: John Wiley & Sons.
Lyytinen, H., Blomberg, A. P., & Näätänen, R. (1992). Event-related potentials and autonomic responses to a change in unattended auditory stimuli. Psychophysiology, 29, 523–534.
Lyytinen, H., & Näätänen, R. (1987). Autonomic and ERP responses to deviant stimuli: Analysis of covariation. In R. Johnson, J. W. Rohrbaugh, & R. Parasuraman (Eds.), Current Trends in Event-Related Brain Potential Research. Electroencephalography and Clinical Neurophysiology, Suppl., 40, 108–117.
MacMillan, N. A. (1973). Detection and recognition of intensity changes in tone and noise: The detection-recognition disparity. Perception and Psychophysics, 13, 65–75.
Maess, B., Jacobsen, T., Schröger, E., & Friederici, A. D. (2007). Localizing pre-attentive auditory memory-based comparison: Magnetic mismatch negativity to pitch change. NeuroImage, 37, 561–571.
Massaro, D. W. (1970). Retroactive interference in short-term recognition memory for pitch. Journal of Experimental Psychology, 83, 32–39.
Massaro, D. W. (1976). Auditory information processing. In W. K. Estes (Ed.), Handbook of learning and cognitive processes (pp. 275–320). Hillsdale, NJ: Lawrence Erlbaum Associates.
May, P. J. C., & Tiitinen, H. (2009). Mismatch negativity (MMN), the deviance-elicited auditory deflection, explained. Psychophysiology, 46, 1–57.
McCallum, W. C., Curry, S. H., Cooper, R., Pocock, P. V., & Papakostopoulos, D. (1983). Brain event-related potentials as indicators of early selective processes in auditory target localization. Psychophysiology, 20, 1–17.
Menning, H., Zwitserlood, P., Schöning, S., Hihn, H., Bölte, J., Dobel, C., et al. (2005). Pre-attentive detection of syntactic and semantic errors. NeuroReport, 16, 77–80.
Michie, P. T., Solowij, N., Crawford, J. M., & Glue, L. C. (1993). The effects of between-source discriminability on attended and unattended auditory ERPs. Psychophysiology, 30, 205–220.
Mitterer, H., & Blomert, L. (2003). Coping with phonological assimilation in speech perception: Evidence for early compensation. Perception & Psychophysics, 65, 956–969.
Molholm, S., Martinez, A., Ritter, W., Javitt, D. C., & Foxe, J. J. (2005). The neural circuitry of pre-attentive auditory change-detection: An fMRI study of pitch and duration mismatch negativity generators. Cerebral Cortex, 15, 545–551.
Moore, B. C. J. (1989). An Introduction to the Psychology of Hearing. London: Academic Press.
Moore, D. R. (2003). Cortical neurons signal sound novelty. Nature Neuroscience, 6, 330–332.
Moray, N. (1959). Attention in dichotic listening: Affective cues and the influence of instruction. The Quarterly Journal of Experimental Psychology, 9, 56–60.
Möttönen, R., Krause, C. M., Tiippana, K., & Sams, M. (2002). Processing of changes in visual speech in the human auditory cortex. Cognitive Brain Research, 13, 417–425.
Möttönen, R., Schurman, M., & Sams, M. (2004). Time course of multisensory interactions during audiovisual speech perception in humans: A magnetoencephalographic study. Neuroscience Letters, 363, 112–115.
Müller, D., Winkler, I., Roeber, U., Schaffer, S., Czigler, I., & Schröger, E. (2010). Visual object representations can be formed outside the focus of voluntary attention: Evidence from event-related potentials. Journal of Cognitive Neuroscience, 22, 1179–1188.
Münte, T. F., Altenmüller, E., & Jäncke, L. (2002). The musician’s brain as a model of plasticity. Nature Reviews Neuroscience, 3, 473–478.
Näätänen, R. (1970). EEG, slow potential and evoked potential correlates of selective attention. In A. F. Sanders (Ed.), Attention and performance III. Acta Psychologica (33, 178–192).
Näätänen, R. (1975). Selective attention and evoked potentials in humans - a critical review. Biological Psychology, 2, 237–307.
Näätänen, R. (1982). Processing negativity: An evoked-potential reflection of selective attention. Psychological Bulletin, 92, 605–640.
Näätänen, R. (1984). In search of a short-duration memory trace of a stimulus in the human brain. In L. Pulkkinen & P. Lyytinen (Eds.), Human action and personality. Essays in honour of Martti Takala (pp. 29–43). Jyväskylä: University of Jyväskylä.
Näätänen, R. (1986). N2 and automatic vs. controlled processes: A classification of N2 kinds of ERP components. Electroencephalography and Clinical Neurophysiology, Suppl., 38, 169–171.
Näätänen, R. (1990). The role of attention in auditory information processing as revealed by event-related potentials and other brain measures of cognitive function. The Behavioral and Brain Sciences, 13, 201–288.
Näätänen, R. (1992). Attention and brain function. Hillsdale, NJ: Lawrence Erlbaum Associates.
Näätänen, R. (2009). Somatosensory mismatch negativity: A new clinical tool for developmental neurological research? Developmental Medicine & Child Neurology, 51, 927–931.
Näätänen, R., & Alho, K. (1995). Mismatch negativity: A unique measure of sensory processing in audition. International Journal of Neuroscience, 80, 317–337.
Näätänen, R., & Alho, K. (1997). Mismatch negativity (MMN) - the measure for central sound representation accuracy. Audiology & Neuro-Otology, 2, 341–353.
Näätänen, R., Astikainen, P., Ruusuvirta, T., & Huotilainen, M. (2010). Automatic auditory intelligence: An expression of the sensory-cognitive core of cognitive processes. Brain Research Reviews, 64, 123–136.
Näätänen, R., & Gaillard, A. W. K. (1983). The orienting reflex and the N2 deflection of the event-related potential (ERP). In A. W. K. Gaillard & W. Ritter (Eds.), Tutorials in Event Related Potential Research: Endogenous components (pp. 119–141). Amsterdam: North Holland Publishing.
Näätänen, R., Gaillard, A. W. K., & Mäntysalo, S. (1978). Early selective-attention effect on evoked potential reinterpreted. Acta Psychologica, 42, 313–329.
Näätänen, R., Gaillard, A. W. K., & Mäntysalo, S. (1980). Brain potential correlates of voluntary and involuntary attention. In H. H. Kornhuber & L. Deecke (Eds.), Motivation, motor and sensory processes of the brain: Electrical potentials, behavior and clinical use. Progress in Brain Research, 54, 343–348. Amsterdam: Elsevier.
Näätänen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., et al. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385, 432–434.
Näätänen, R., & Michie, P. T. (1979). Early selective attention effects on the evoked potential: A critical review and reinterpretation. Biological Psychology, 8, 81–136.
Näätänen, R., Paavilainen, P., Alho, K., Reinikainen, K., & Sams, M. (1989). Do event-related potentials reveal the mechanism of the auditory sensory memory in the human brain? Neuroscience Letters, 98, 217–221.
Näätänen, R., Paavilainen, P., Rinne, T., & Alho, K. (2007). The mismatch negativity (MMN) in basic research of central auditory processing: A review. Clinical Neurophysiology, 118, 2544–2590.
Näätänen, R., & Picton, T. (1987). The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure. Psychophysiology, 24, 375–425.
Näätänen, R., & Rinne, T. (2002). Electric brain response to sound repetition in humans: An index of long-term-memory-trace formation? Neuroscience Letters, 318, 49–51.
Näätänen, R., Sams, M., Alho, K., Paavilainen, P., Reinikainen, K., & Sokolov, E. N. (1988). Frequency and location specificity of the human vertex N1 wave. Electroencephalography and Clinical Neurophysiology, 69, 523–531.
Näätänen, R., Schröger, E., & Alho, K. (2002). Electrophysiology of attention. In J. Wixted (Ed.), Stevens’ Handbook of Experimental Psychology (3rd edition, pp. 601–653). New York: John Wiley & Sons, Inc.
Näätänen, R., Simpson, M., & Loveless, N. E. (1982). Stimulus deviance and evoked potentials. Biological Psychology, 14, 53–98.
Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in cognitive neuroscience. Psychological Bulletin, 6, 826–859.
Nelken, I., & Ulanovsky, N. (2007). Mismatch negativity and stimulus-specific adaptation in animal models. Journal of Psychophysiology, 21, 214–223.
Newstead, S. E., & Dennis, I. (1979). Lexical and grammatical processing of unshadowed messages: A reexamination of the McKay effect. Quarterly Journal of Experimental Psychology, 31, 477–488.
Norman, D. A. (1968). Toward a theory of memory and attention. Psychological Review, 75, 522–536.
Nousak, J. M. K., Deacon, D., Ritter, W., & Vaughan Jr., H. G. (1996). Storage of information in transient auditory memory. Cognitive Brain Research, 4, 305–317.
Oceák, A., Winkler, I., Sussman, E., & Alho, K. (2006). Loudness summation and the mismatch negativity event related brain potential in humans. Psychophysiology, 43, 13–20.
Öhman, A. (1979). The orienting response, attention and learning: An information-processing perspective. In H. D. Kimmel, E. H. van Olst, & J. F. Orlebeke (Eds.), The orienting reflex in humans (pp. 443–471). Hillsdale, NJ: Lawrence Erlbaum Associates.
Okita, T. (1979). Event-related potentials and selective attention to auditory stimuli varying in pitch and localization. Biological Psychology, 9, 271–284.
Okita, T., Konishi, K., & Inamori, R. (1983). Attention-related negative brain potential for speech words and pure tones. Biological Psychology, 16, 29–47.
Onishi, S., & Davis, H. (1968). Effects of duration and rise time of tonebursts on evoked V-potentials. Journal of the Acoustical Society of America, 44, 582–591.
Opitz, B., Schröger, E., & von Cramon, D. Y. (2005). Sensory and cognitive mechanisms for preattentive change detection in auditory cortex. European Journal of Neuroscience, 21, 531–535.
Ostroff, J. M., McDonald, K. L., Schneider, B. A., & Alain, C. (2003). Aging and the processing of sound duration in human auditory cortex. Hearing Research, 181, 1–7.
Paavilainen, P., Alho, K., Reinikainen, K., Sams, M., & Näätänen, R. (1991). Right-hemisphere dominance of different mismatch negativities. Electroencephalography and Clinical Neurophysiology, 78, 466–479.
Paavilainen, P., Arajärvi, P., & Takegata, R. (2007). Preattentive detection of nonsalient contingencies between auditory features. NeuroReport, 18, 159–163.
Parasuraman, R. (1978). Auditory potentials and divided attention. Psychophysiology, 15, 460–465.
Parasuraman, R., & Beatty, J. (1980). Brain events underlying detection and recognition of weak sensory signals. Science, 210, 80–83.
Pasman, J. W., Rotteveel, J. J., Maassen, B., & Visco, Y. M. (1999). The maturation of auditory cortical evoked responses between (preterm) birth and 14 years of age. European Journal of Neurology, 3, 79–82.
Pazo-Alvarez, P., Amenedo, E., & Cadaveira, F. (2004). Automatic detection of motion direction changes in the human brain. European Journal of Neuroscience, 19, 1978–1986.
Pazo-Alvarez, P., Cadaveira, F., & Amenedo, E. (2003). MMN in visual modality: A review. Biological Psychology, 63, 199–236.
Pedersen, C. B., & Salomon, G. (1977). Temporal integration of acoustic energy. Acta Otolaryngology, 83, 417–423.
Pekkonen, E., Rinne, T., Reinikainen, K., Kujala, T., Alho, K., & Näätänen, R. (1996). Aging effects on auditory processing: An event-related potential study. Experimental Aging Research, 22, 171–184.
Phillips, C. (2001). Levels of representation in the electrophysiology of speech perception. Cognitive Science, 25, 711–731.
Picton, T. W., Goodman, W. S., & Bryce, D. P. (1970). Amplitude of evoked responses to tones of high intensity. Acta Otolaryngology, 70, 77–82.
Picton, T. W., Hillyard, S. A., Krausz, H. I., & Galambos, R. (1974). Human auditory evoked potentials: I. Evaluation of components. Electroencephalography & Clinical Neurophysiology, 36, 179–190.
Picton, T. W., Stapells, D. R., & Campbell, K. N. (1981). Auditory evoked potentials from the human cochlea and brainstem. Journal of Otolaryngology, 9(Suppl.), 1–41.
Picton, T. W., Woods, D. L., & Proulx, G. B. (1978). Human auditory sustained potentials. II. Stimulus relationships. Electroencephalography & Clinical Neurophysiology, 45, 198–210.
Pincze, Z., Lakatos, P., Rajkai, C., Ulbert, I., & Karmos, G. (2001). Separation of mismatch negativity and the N1 wave in the auditory cortex of the cat: A topographic study. Clinical Neurophysiology, 112, 778–784.
Pincze, Z., Lakatos, P., Rajkai, C., Ulbert, I., & Karmos, G. (2002). Effect of deviant probability and interstimulus/interdeviant interval on the auditory N1 and mismatch negativity in the cat auditory cortex. Cognitive Brain Research, 13, 249–253.
Ponton, C. W., Don, M., Eggermont, J. J., Waring, M. D., Kwong, B., Cunningham, J., & Trautwein, P. (2000). Maturation of the mismatch negativity: Effects of profound deafness and cochlear implant use. Audiology & Neuro-Otology, 5, 167–185.
Ponton, C., Eggermont, J. J., Khosla, D., Kwong, B., & Don, M. (2002). Maturation of human central auditory system activity: Separating auditory evoked potentials by dipole source modeling. Clinical Neurophysiology, 113, 407–420.
Ponton, C., Eggermont, J. J., Kwong, B., & Don, M. (2000a). Maturation of human central auditory system activity: Evidence from multi-channel evoked potentials. Clinical Neurophysiology, 111, 220–236.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium (pp. 205–223). New Jersey: Erlbaum.
Pratt, H., & Sohmer, H. (1977). Correlations between psychophysical magnitude estimates and simultaneously obtained auditory nerve, brain stem and cortical responses to click stimuli in man. Electroencephalography & Clinical Neurophysiology, 43, 802–812.
Pulvermüller, F. (2001). Brain reflections of words and their meaning. Trends in Cognitive Sciences, 5, 517–524.
Pulvermüller, F., & Assadollahi, R. (2007). Grammar or serial order?: Discrete combinatorial brain mechanisms reflected by the syntactic mismatch negativity. Journal of Cognitive Neuroscience, 19, 971–980.
Pulvermüller, F., & Knoblauch, A. (2009). Discrete combinatorial circuits emerging in neural networks: A mechanism for rules grammar in the human brain? Neural Networks, 22, 161–172.
Pulvermüller, F., Kujala, T., Shtyrov, Y., Simola, J., Tiitinen, H., Alku, P., et al. (2001). Memory traces for words as revealed by the mismatch negativity. NeuroImage, 14, 607–616.
Pulvermüller, F., & Shtyrov, Y. (2003). Automatic processing of grammar in the human brain as revealed by the mismatch negativity. NeuroImage, 20, 159–172.
Pulvermüller, F., & Shtyrov, Y. (2006). Language outside the focus of attention: The mismatch negativity as a tool for studying higher cognitive processes. Progress in Neurobiology, 79, 49–71.
Pulvermüller, F., Shtyrov, Y., & Hauk, O. (2009). Understanding in an instant: Neurophysiological evidence for mechanistic language circuits in the brain. Brain & Language, 110, 81–94.
Rinne, T., Alho, K., Ilmoniemi, R. J., Virtanen, J., & Näätänen, R. (2000). Separate time behaviors of the temporal and frontal MMN sources. NeuroImage, 12, 14–19.
Rinne, T., Balk, M. H., Koistinen, S., Autti, T., Alho, K., & Sams, M. (2008). Auditory selective attention modulates activation of human inferior colliculus. Journal of Neurophysiology, 100, 3323–3327.
Rinne, T., Gratton, G., Fabiani, M., Cowan, N., Maclin, E., Stinard, A., et al. (1999). Scalp-recorded optical signals make sound processing in the auditory cortex visible. NeuroImage, 10, 620–624.
Rinne, T., Särkkä, A., Degerman, A., Schröger, E., & Alho, K. (2006). Two separate mechanisms underlie auditory change detection and involuntary control of attention. Brain Research, 1077, 135–143.
Ritter, W., Deacon, D., Gomes, H., Javitt, D. C., & Vaughan Jr., H. G. (1995). The mismatch negativity of event-related potentials as a probe of transient auditory memory: A review. Ear and Hearing, 16, 52–67.
Ritter, W., Sussman, E., & Molholm, S. (2000). Evidence that the mismatch negativity system works on the basis of objects. NeuroReport, 11, 61–63.
Rosburg, T. (2003). Left hemispheric dipole locations of the neuromagnetic mismatch negativity to frequency, intensity and duration deviants. Cognitive Brain Research, 16, 83–90.
Rosburg, T., Haueisen, J., & Kreitschmann-Andermahr, I. (2004). The dipole location shift within the auditory evoked neuromagnetic field components N100m and mismatch negativity (MMNm). Clinical Neurophysiology, 115, 906–913.
Rossi, S., Gugler, M. F., Friederici, A. D., & Hahne, A. (2006). The impact of proficiency on syntactic second-language processing of German and Italian: Evidence from event-related potentials. Journal of Cognitive Neuroscience, 18, 2030–2048.
Sams, M., Hämäläinen, M., Antervo, A., Kaukoranta, E., Reinikainen, K., & Hari, R. (1985). Cerebral neuromagnetic responses evoked by short auditory stimuli. Electroencephalography and Clinical Neurophysiology, 61, 254–266.
Sams, M., Hari, R., Rif, J., & Knuutila, J. (1993). The human auditory sensory memory trace persists about 10 sec: Neuromagnetic evidence. Journal of Cognitive Neuroscience, 5, 363–370.
Sams, M., Kaukoranta, E., Hämäläinen, M., & Näätänen, R. (1991). Cortical activity elicited by changes in auditory stimuli: Different sources for the magnetic N100m and mismatch responses. Psychophysiology, 28, 21–29.
Scharf, B. (1978). Loudness. In E. C. Carterette & N. P. Friedman (Eds.), Handbook of perception, vol. IV, Hearing (pp. 187–242). New York: Academic Press.
Scharf, B., & Houtsma, A. J. (1986). Audition II. Loudness, pitch, localization, aural distortion, pathology. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance: vol. 1. Sensory processes and perception (pp. 15.1–60). New York: Wiley.
Scherg, M., Vajsar, J., & Picton, T. W. (1989). A source analysis of the late human auditory evoked potentials. Journal of Cognitive Neuroscience, 1, 336–355.
Schröger, E. (1996). A neural mechanism for involuntary attention shifts to changes in auditory stimulation. Journal of Cognitive Neuroscience, 8, 527–539.
Schröger, E. (1997). On the detection of auditory deviants: A pre-attentive activation model. Psychophysiology, 34, 245–257.
Schröger, E., Bendixen, A., Trujillo-Barreto, N. J., & Roeber, U. (2007). Processing of abstract rule violations in audition. PLoS One, 11, e1131.
Schröger, E., Näätänen, R., & Paavilainen, P. (1992). Event-related potentials reveal how non-attended complex sound patterns are represented by the human brain. Neuroscience Letters, 146, 183–186.
Sculthorpe, L. D., Ouellet, D. R., & Campbell, K. B. (2009). MMN elicitation during natural sleep to violations of an auditory pattern. Brain Research, 1290, 52–62.
Sharma, A., & Dorman, M. F. (2000). Neurophysiologic correlates of cross-language phonetic perception. Journal of the Acoustical Society of America, 107, 2697–2703.
Sharma, A., Kraus, N., McGee, T. J., & Nicol, T. G. (1997). Developmental changes in P1 and N1 central auditory responses elicited by consonant-vowel syllables. Electroencephalography and Clinical Neurophysiology, 104, 540–545.
Shestakova, A., Brattico, E., Huotilainen, M., Galunov, V., Soloviev, A., Sams, M., et al. (2002). Abstract phoneme representations in the left temporal cortex: Magnetic mismatch negativity study. NeuroReport, 13, 1813–1816.
Shinozaki, N., Yabe, H., Sato, Y., Hiruma, T., Sutoh, T., Matsuoka, T., & Kaneko, S. (2003). Spectrotemporal window of integration of auditory information in the human brain. Cognitive Brain Research, 17, 563–571.
Shtyrov, Y., Pulvermüller, F., Näätänen, R., & Ilmoniemi, R. J. (2003). Grammar processing outside the focus of attention: An MEG study. Journal of Cognitive Neuroscience, 15, 1195–1206.
Simson, R., Vaughan Jr., H. G., & Ritter, W. (1976). The scalp topography of potentials associated with missing visual or auditory stimuli. Electroencephalography and Clinical Neurophysiology, 40, 33–42.
Simson, R., Vaughan Jr., H. G., & Ritter, W. (1977). The scalp topography of potentials in auditory and visual discrimination tasks. Electroencephalography and Clinical Neurophysiology, 42, 528–535.
Snyder, E., & Hillyard, S. A. (1976). Long latency evoked potentials to irrelevant, deviant stimuli. Behavioural Biology, 16, 319–331.
Sokolov, E. N., Spinks, J. A., Näätänen, R., & Lyytinen, H. (2002). The orienting response in information processing. Mahwah, NJ: Erlbaum.
Squires, K. C., Wickens, C., Squires, N. K., & Donchin, E. (1976). The effect of stimulus sequence on the waveform of the cortical event-related potential. Science, 193, 1142–1146.
Squires, N. K., Squires, K. C., & Hillyard, S. A. (1975). Two varieties of long-latency positive waves evoked by unpredictable auditory stimuli in man. Electroencephalography and Clinical Neurophysiology, 38, 387–401.
Starr, A., & Don, M. (1988). Brain potentials evoked by acoustic stimuli. In T. W. Picton (Ed.), Human event-related potentials (Vol. 3, pp. 97–157). Amsterdam: Elsevier.
Stefanics, G., Háden, G. P., Sziller, I., Balázs, L., Beke, A., & Winkler, I. (2009). Newborn infants process pitch intervals. Clinical Neurophysiology, 120, 304–308.
Stuss, D. T., & Benson, D. F. (1986). The frontal lobes. New York: Raven Press.
Sussman, E. S. (2007). A new view on the MMN and attention debate: The role of context in processing auditory events. Journal of Psychophysiology, 21, 164–175.
Sutton, S., Braren, M., Zubin, J., & John, E. R. (1965). Evoked-potential correlates of stimulus uncertainty. Science, 150, 1187–1188.
Sysoeva, O., Takegata, R., & Näätänen, R. (2006). Pre-attentive representation of sound duration in the human brain. Psychophysiology, 43, 272–276.
Takegata, R., Brattico, E., Tervaniemi, M., Varyagina, O., Na¨a¨ta¨nen, R., & Winkler, I. (2005). Preattentive representation of feature conjunctions for simultaneous, spatially distributed auditory objects. Cognitive Brain Research, 25, 169–179. Takegata, R., Huotilainen, M., Rinne, T., Na¨a¨ta¨nen, R., & Winkler, I. (2001). Changes in acoustic features and their conjunctions are processed by separate neuronal populations. NeuroReport, 12, 525–529.
Auditory processing that leads to conscious perception Takegata, R., & Morotomi, T. (1999). Integrated neural representation of sound and temporal features in human auditory sensory memory: An event-related potential study. Neuroscience Letters, 274, 29 207– 29 210. Takegata, R., Paavilainen, P., Na¨a¨ta¨nen, R., & Winkler, I. (1999). Independent processing of changes in auditory single features and feature conjunctions in humans as indexed by the mismatch negativity (MMN). Neuroscience Letters, 266, 109–112. Tervaniemi, M., & Brattico, E. (2004). From sounds to music. Towards understanding the neurocognition of musical sound perception. Journal of Consciousness Studies, 11, 9–27. Tervaniemi, M., Castaneda, A., Knoll, M., & Uther, M. (2006). Sound processing in amateur musicians and nonmusicians: Event-related potential and behavioral indices. NeuroReport, 17, 1225–1228. Tervaniemi, M., Maury, S., & Na¨a¨ta¨nen, R. (1994). Neural representations of abstract stimulus features in the human brain as reflected by the mismatch negativity. NeuroReport, 5, 844–846. Tervaniemi, M., Rytko¨nen, M., Schro¨ger, E., Ilmoniemi, R. J., & Na¨a¨ta¨nen, R. (2001). Superior formation of cortical memory traces for melodic patterns in musicians. Learning & Memory, 8, 295–300. Tervaniemi, M., Saarinen, J., Paavilainen, P., Danilova, N., & Na¨a¨ta¨nen, R. (1994). Temporal integration of auditory information in sensory memory as reflected by the mismatch negativity. Biological Psychology, 38, 157–167. Tiippana, K., Andersen, T. S., & Sams, M. (2004). Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology, 16, 457–472. Tiitinen, H., Alho, K., Huotilainen, M., Ilmoniemi, R. J., Simola, J., & Na¨a¨ta¨nen, R. (1993). Tonotopic auditory cortex and the magnetoencephalographic (MEG) equivalent of the mismatch negativity. Psychophysiology, 30, 537–540. Tiitinen, H., May, P., Reinikainen, K., & Na¨a¨ta¨nen, R. (1994). 
Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature, 370, 90–92. Todd, J., Myers, R., Pirillo, R., & Drysdale, K. (2010). Neuropsychological correlates of auditory perceptual inference: A mismatch negativity (MMN) study. Brain Research, 1310, 113–123. Trainor, L. J., McDonald, K. L., & Alain, C. (2002). Automatic and controlled processing of melodic contour and interval information measured by electrical brain activity. Journal of Cognitive Neuroscience, 14, 1–13. Treisman, A. M. (1960). Contextual cues in selective listening. The Quarterly Journal of Experimental Psychology, 12, 242–248. Tse, C.-Y., & Penney, T. B. (2008). On the functional role of temporal and frontal cortex activation in passive detection of auditory deviance. NeuroImage, 41, 1462–1470. Ulanovsky, N., Las, L., & Nelken, I. (2003). Processing of low-probability sounds by cortical neurons. Nature Neuroscience, 6, 391–398. Umbricht, D., Schmid, L., Koller, R., Vollenweider, F. X., Hell, D., & Javitt, D. C. (2000). Ketamine-induced deficits in auditory and visual context-dependent processing in healthy volunteers. Archives of General Psychiatry, 57, 1139–1147. Valtonen, J., May, P., Ma¨kinen, V., & Tiitinen, H. (2003). Visual short-term memory load affects sensory processing of irrelevant sounds in human auditory cortex. Cognitive Brain Research, 17, 358–367. van Zuijen, T., Simoens, V. L., Paavilainen, P., Na¨a¨ta¨nen, R., & Tervaniemi, M. (2006). Implicit, intuitive, and explicit knowledge of abstract regularities in a sound sequence: An event-related brain potential study. Journal of Cognitive Neuroscience, 18, 1292–1303. Vaughan, H. G. Jr., & Arezzo, J. C. (1988). The neural basis of eventrelated potentials. In T. W. Picton (Ed.), Human event-related potentials (Vol. 3, pp. 45–96). Amsterdam: Elsevier. Vogel, E. K., & Luck, S. J. (2000). The visual N1 component as an index of a discrimination process. Psychophysiology, 37, 190–203. 
Behrens, von der W., Ba¨uerle, P., Ko¨ssl, M., & Gaese, B. H. (2009). Corrrelating stimulus-specific adaptation of cortical neurons and local field potentials in the awake rat. The Journal of Neuroscience, 29, 13837–13849. Walter, W. G. (1964). The convergence and interaction of visual, auditory and tactile responses in human non-specific cortex. Annals of the New York Academy of Science, 112, 320–361.
21 Wang, W., Datta, H., & Sussman, E. (2005). The development of the length of the temporal window of integration for rapidly presented auditory information as indexed by MMN. Clinical Neurophysiology, 116, 1695–1706. Widmann, A., Kujala, T., Tervaniemi, M., Kujala, A., & Schro¨ger, E. (2004). From symbols to sounds: Visual symbolic information activates sound representations. Psychophysiology, 41, 709–715. Winkler, I. (2007). Interpreting the mismatch negativity. Journal of Psychophysiology, 21, 147–163. Winkler, I., & Cowan, N. (2005). From sensory to long–term memory: Evidence from auditory memory reactivation studies. Experimental Psychology, 52, 3–20. Winkler, I., Cowan, N., Cse´pe, V., Czigler, I., & Na¨a¨ta¨nen, R. (1996). Interactions between transient and long-term auditory memory as reflected by the mismatch negativity. Journal of Cognitive Neuroscience, 8, 403–415. Winkler, I., Czigler, I., Sussman, E., Horva´th, J., & Bala´zs, L. (2005). Preattentive binding of auditory and visual stimulus features. Journal of Cognitive Neuroscience, 17, 320–339. Winkler, I., Denham, S. L., & Nelken, I. (2009a). Modeling the auditory scene: Predictive regularity representations and perceptual objects. Trends in Cognitive Sciences, doi:10.1016/j.tics.2009.09.003. Winkler, I., Ha´den, G. P., Ladinig, O., Sziller, I., & Honing, H. (2009b). Newborn infants detect the beat in music. Proceedings of the National Academy of Sciences USA, 106, 2468–2471. Winkler, I., Horva´th, J., Weisz, J., & Trejo, L. (2009c). Deviance detection in congruent audiovisual speech: Evidence for implicit integrated audiovisual memory representations. Biological Psychology, 82, 281–292. Winkler, I., Karmos, G., & Na¨a¨ta¨nen, R. (1996). Adaptive modeling of the unattended acoustic environment reflected in the mismatch negativity event-related potential. Brain Research, 742, 239–252. Winkler, I., Kujala, T., Tiitinen, H., Sivonen, P., Alku, P., Lehtokoski, A., et al. (1999). 
Brain responses reveal the learning of foreign language phonemes. Psychophysiology, 36, 638–642. Winkler, I., & Na¨a¨ta¨nen, R. (1993). Event-related brain potentials to infrequent partial omissions in series of auditory stimuli. In H.-J. Heinze, G. R. Mangun, & T. F. Mu¨nte (Eds.), New developments in event-related potentials (pp. 219–226). Boston-Basel-Berlin: Birkha¨user. Winkler, I., & Na¨a¨ta¨nen, R. (1994). The effects of auditory backward masking on event-related brain potentials. Electroencephalography and Clinical Neurophysiology, 44, 185–189. Winkler, I., Paavilainen, P., & Na¨a¨ta¨nen, R. (1992). Can echoic memory store two traces simultaneously? A study of event-related brain potentials. Psychophysiology, 29, 337–349. Winkler, I., Reinikainen, K., & Na¨a¨ta¨nen, R. (1993). Event related brain potentials reflect traces of the echoic memory in humans. Perception & Psychophysics, 53, 443–449. Winkler, I., Tervaniemi, M., Huotilainen, M., Ilmoniemi, R., Ahonen, A., Salonen, O., et al. (1995). From objective to subjective: Pitch representation in the human auditory cortex. NeuroReport, 6, 2317–2320. Winkler, I., Tervaniemi, M., & Na¨a¨ta¨nen, R. (1997). Two separate codes for missing-fundamental pitch in the human auditory cortex. Journal of the Acoustical Society of America, 102, 1072–1082. Woldorff, M. G., Hackley, S. A., & Hillyard, S. A. (1991). The effects of channel-selective attention on the mismatch negativity wave elicited by deviant tones. Psychophysiology, 28, 30–42. Woldorff, M. G., Hansen, J. C., & Hillyard, S. A. (1987). Evidence for effects of selective attention in the mid-latency range of the human auditory event-related potential. In R. Johnson Jr., J. W. Rohrbaugh, & R. Parasuraman (Eds.), Current trends in event-related brain potential research (Suppl. 40 to Electroencephalography and Clinical Neurophysiology; pp. 146–154). Amsterdam: Elsevier. Woldorff, M. G., Hillyard, S. A., Gallen, C. C., Hampson, S. R., & Bloom, F. E. (1998). 
Magnetoencephalographic recordings demonstrate attentional modulation of mismatch-related neural activity in human auditory cortex. Psychophysiology, 35, 283–292. Woldorff, M. G., & Hillyard, S. A. (1991). Modulation of early auditory processing during selective listening to rapidly presented tones. Electroencephalography and Clinical Neurophysiology, 79, 170–191.
22 Woods, D. L., & Elmasian, R. (1986). The habituation of event-related potentials to speech sounds and tones. Electroencephalography & Clinical Neurophysiology, 65, 447–459. Yabe, H., Koyoma, S., Kakigi, R., Gunji, A., Tervaniemi, M., Sato, Y., & Kaneko, S. (2001). Automatic discriminative sensitivity inside temporal window of sensory memory as a function of time. Cognitive Brain Research, 12, 39–48. Yabe, H., Matsuoka, T., Sato, Y., Hiruma, T., Sutoh, T., Koyama, S., et al. (2005). Time may be compressed in sound representation as replicated in sensory memory. NeuroReport, 16, 95–98. Yabe, H., Sutoh, T., Matsuoka, T., Asai, R., Hiruma, T., Sato, Y., et al. (2005). Transient gamma-band response is dissociated from sensory memory as reflected by MMN. Neuroscience Letters, 380, 80–82. Yabe, H., Tervaniemi, M., Reinikainen, K., & Na¨a¨ta¨nen, R. (1997). Temporal window of integration revealed by MMN to sound omission. NeuroReport, 8, 1971–1974.
R. Na¨a¨ta¨nen et al. Yabe, H., Tervaniemi, M., Sinkkonen, J., Huotilainen, M., Ilmoniemi, R. J., & Na¨a¨ta¨nen, R. (1998). Temporal window of integration of auditory information in the human brain. Psychophysiology, 35, 615–619. Yabe, H., Winkler, I., Czigler, I., Koyama, S., Kakigi, R., Suto, T., et al. (2001). Organizing sound sequences in the human brain: the interplay of auditory streaming and temporal integration. Brain Research, 897, 222–227. Yago, E., Escera, C., Alho, K., & Giard, M.-H. 2001. Cerebral mechanisms underlying orienting of attention towards auditory frequency changes. NeuroReport, 12, 2583–7. Zwicker, E., & Fastl, H. (1990). Psychoacoustics: Facts and models. Berlin: Springer-Verlag. (Received December 17, 2009; Accepted June 7, 2010)
Psychophysiology, 48 (2011), 23–30. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01038.x
Change-related responses in the human auditory cortex: An MEG study
KOYA YAMASHIRO,a KOJI INUI,a,b NAOFUMI OTSURU,a,b and RYUSUKE KAKIGIa,b
a Department of Integrative Physiology, National Institute for Physiological Sciences, Okazaki, Japan
b Department of Physiological Sciences, School of Life Sciences, Graduate University for Advanced Studies, Hayama, Japan
Abstract
We recorded cortical activity in response to the onset, offset, and frequency change of a pure tone using magnetoencephalograms (MEGs) to clarify the physiological significance of N1m relating to the detection of changes. Four interstimulus intervals (ISIs) (0.5, 1.5, 3, and 6 s) were used for each of the three auditory events. Results showed that (i) all three auditory events elicited N1m with a similar topography and similar temporal profile, (ii) the source of N1m was located in the superior temporal gyrus (STG) for all events under all ISI conditions, (iii) the amplitude of the STG activity as a function of the duration of the steady state preceding the change was similar among the three events, and (iv) there was a significant positive correlation in amplitude between on-N1m and off-N1m and between on-N1m and change-N1m. These results suggested that N1m for the three events has a similar physiological significance relating to the detection of changes.
Descriptors: N1m, ISIs, Auditory change detection
One of the most important functions of sensory processing in animals is to quickly detect and respond to changes or new events in the surrounding environment. Automatic shifts of attention to an event lead to the facilitation of subsequent processes to execute appropriate behavior. In humans, a cortical network sensitive to sensory changes of various modalities is known (Downar, Crawley, Mikulis, & Davis, 2000; Tanaka, Kida, Inui, & Kakigi, 2009). Previous studies (Hari et al., 1987; Noda et al., 1998; Pantev, Eulitz, Hampson, Ross, & Roberts, 1996; Yamashiro, Inui, Otsuru, Kida, & Kakigi, 2009) showed that auditory on- and off-events elicit similar components peaking at around 100 ms, that is, N1m. The N1m (N1) component is elicited by various types of sounds including pure tones (Pantev et al., 1996; Sams, Hari, Rif, & Knuutila, 1993), clicks (Joutsiniemi, Hari, & Vilkman, 1989; Picton, Hillyard, Krausz, & Galambos, 1974), the human voice (Hari & Lounasmaa, 1989), animal sounds (Altmann et al., 2008), and the offset of a sound (Hari et al., 1987; Noda et al., 1998; Pantev et al., 1996; Yamashiro et al., 2009). Therefore, N1m seems to be a result of any abrupt auditory event. In addition to these findings, our previous study (Yamashiro et al., 2009) showed that the latency of N1m for the off-response could be determined precisely using the offset-discriminating point (ODP; the latency of the last pulse plus the interstimulus interval). These results suggested that a similar cortical network for detecting changes was automatically activated by comparing the new event (on or off) with the preceding condition (silent or continuous stimuli) using a memory trace.

Thus, if the N1m components are elicited through a memory trace, by comparing any change with the steady state preceding the change’s occurrence, they will be larger with a longer preceding steady state. In fact, the amplitude of auditory N1m (Hari, Kaila, Katila, Tuomisto, & Varpula, 1982; Sams et al., 1993) and the activity in the superior temporal gyrus (STG) (Howard et al., 2000; Tanaka et al., 2008) are sensitive to the stimulus rate in the auditory modality. That is, the amplitude increases as the stimulus rate decreases. Some previous studies (Hari et al., 1987; Hillyard & Picton, 1978; Pfefferbaum, Buchsbaum, & Gips, 1971) also showed that the amplitude of N1m (N1) in response to both the onset (ON-N1) and offset (OFF-N1) of a sound stimulus was dependent on the duration of the preceding interval (sound or silence). Given that the ON- and OFF-N1 components are a response to auditory changes, this finding is quite natural since a longer preceding interval indicates an abrupt break of a longer silent period (ON) or longer sound (OFF). The idea that N1 is an automatic response to an auditory change can also explain why only a single N1 component of a similar duration is evoked by various types of auditory stimuli (brief, long, continuous, and repetitive).

Previous studies with magnetoencephalograms (MEGs) (Hari et al., 1987; Noda et al., 1998; Pantev et al., 1996; Yamashiro et al., 2009) showed the time course of activation, location, and orientation to be very similar between the activities for ON-N1m and OFF-N1m. Therefore, we speculate that they originate from a similar group of neurons, or even identical neurons, sensitive to auditory changes. Supporting this idea, a study using an abrupt change in frequency from 988 to 1108 Hz found a larger N1m than the control N1m elicited by brief (100 ms) discrete deviant stimuli of 1108 Hz (Lavikainen, Huotilainen, Ilmoniemi, Simola, & Näätänen, 1995). Although results of numerous auditory studies (Altmann et al., 2008; Hari & Lounasmaa, 1989; Hari et al., 1987; Joutsiniemi et al., 1989; Lavikainen et al., 1995; Noda et al., 1998; Pantev et al., 1996; Picton et al., 1974; Yamashiro et al., 2009) appear to support the notion that a similar N1m component is elicited by any abrupt auditory change, whether this component is an indicator of change detection based on sensory memory remains to be elucidated.

The present study aimed to clarify the physiological significance of the N1m component to the detection of changes. For this purpose, we recorded auditory evoked magnetic fields and compared the N1m components elicited by three different auditory events (onset, offset, and frequency change) under four different interstimulus interval (ISI) conditions (0.5–6 s).

We are very grateful to Mr. Y. Takeshima for technical help during this study. Address correspondence to: Koya Yamashiro, Department of Integrative Physiology, National Institute for Physiological Sciences, Okazaki 444-8585, Japan. E-mail: [email protected]
Methods

Subjects
Three experiments were performed on twelve (three females and nine males for Experiments 1 and 2) and eleven (two females and nine males for Experiment 3) normal-hearing, right-handed volunteers (25–45 years). The same subjects took part in all three experiments. The study was in accordance with the Declaration of Helsinki and approved in advance by the Ethics Committee of the National Institute for Physiological Sciences, Okazaki, Japan, and written informed consent was obtained from all the subjects.

Auditory Stimulation and Paradigm
Auditory evoked magnetic fields (AEFs) were elicited with a pure tone presented binaurally through a plastic tube and ear pieces (E-A-Rtone 3A, Aero Company, Indianapolis, IN). The intensity of the pure tone was adjusted to 60 dB above the threshold for each subject. In Experiment 1 (ON), AEFs were elicited by the onset of a 1000 Hz pure tone (duration of 300 ms including 5 ms rise and fall times) presented at four different ISIs: 0.5, 1.5, 3, and 6 s. That is, the onset of the sound was preceded by a silence lasting 0.5, 1.5, 3, or 6 s in each ISI condition. In Experiment 2 (OFF), AEFs elicited by the offset of the sound were recorded by exchanging the tone and silence of Experiment 1 (Figure 1). That is, a 1000 Hz pure tone 0.5, 1.5, 3, or 6 s in duration was presented with a subsequent silent blank of 300 ms, and AEFs were recorded using the offset point as a trigger. In this experiment, therefore, the offset of the sound was preceded by a 0.5, 1.5, 3, or 6 s sound. In Experiment 3 (CHANGE), AEFs elicited by an abrupt change in sound frequency were recorded by replacing the 300 ms of silence in Experiment 2 with a 1100 Hz pure tone 300 ms in duration (including 5 ms rise and fall times). As in Experiments 1 and 2, the duration of the 1000 Hz tone prior to the change to the 1100 Hz tone was either 0.5, 1.5, 3, or 6 s. In all experiments, AEFs for different ISIs were recorded in separate sessions.
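The three paradigms described above can be sketched in a few lines of Python. This is only an illustration of the stimulus timing, not the authors' stimulation software: the 44.1 kHz sample rate, the linear shape of the 5 ms ramps, and the function names `pure_tone` and `trial_segments` are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed for illustration, not stated in the text


def pure_tone(freq_hz, duration_s, ramp_s=0.005, rate=SAMPLE_RATE):
    """Sine tone with linear rise and fall ramps (the ramp shape is an assumption)."""
    n = round(duration_s * rate)
    n_ramp = round(ramp_s * rate)
    samples = []
    for i in range(n):
        # Gain climbs from 0 to 1 over the first ramp and falls back over the last.
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / rate))
    return samples


def trial_segments(experiment, isi_s):
    """Return [(preceding steady state, seconds), (event segment, seconds)].

    The MEG trigger sits at the boundary between the two segments.
    """
    if experiment == "ON":
        return [("silence", isi_s), ("1000 Hz tone", 0.3)]
    if experiment == "OFF":
        return [("1000 Hz tone", isi_s), ("silence", 0.3)]
    if experiment == "CHANGE":
        return [("1000 Hz tone", isi_s), ("1100 Hz tone", 0.3)]
    raise ValueError("experiment must be 'ON', 'OFF', or 'CHANGE'")


tone = pure_tone(1000, 0.3)  # the 300 ms, 1000 Hz tone of Experiment 1
for isi in (0.5, 1.5, 3.0, 6.0):
    print(trial_segments("CHANGE", isi))
```

In every paradigm the manipulated quantity is the duration of the first segment, that is, the steady state (silence or tone) that precedes the trigger.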
The sessions with four different ISIs were performed in random order for each subject. All the experiments were conducted on a single day.

Figure 1. Stimulation paradigm for the ON, OFF, and CHANGE conditions. Trigger points are shown by arrows.

MEG Recording and Analysis
The experiments were carried out in a magnetically shielded room. Subjects were instructed to watch a silent movie throughout the experiment. AEFs were recorded with a helmet-shaped 306-channel MEG system (Vector-view, ELEKTA Neuromag, Helsinki, Finland), which comprised 102 identical triple sensor elements. Each sensor element consisted of two orthogonal planar gradiometers and one magnetometer coupled to a multi-superconducting quantum interference device (SQUID), and thus provided 3 independent measurements of the magnetic fields. In this study, we analyzed MEG signals recorded from 204 planar-type gradiometers. These planar gradiometers can detect the largest signal just over local cerebral sources. The signals were recorded with a bandpass of 0.1–200 Hz and digitized at 997 Hz. The period of analysis for the on-, off-, and change-responses was 350 ms, including a period of 50 ms before the trigger that was used as the baseline. Trials with noise (>2700 fT/cm) were rejected from the analysis automatically. For each on-, off-, and change-response, 100 artifact-free trials were recorded in each ISI condition. The averaged data were filtered with a 1–50 Hz bandpass filter and then used for the analysis (Yabe et al., 2004, 2005). To identify sources of the evoked activities, the equivalent current dipole (ECD) that best explains the measured data was computed by using a least-squares search. A subset of 14–20 channels including the local signal maxima was used to estimate ECDs (Forss & Jousmaki, 1998; Nakata, Inui, Wasaka, Akatsuka, & Kakigi, 2005; Wasaka et al., 2005). These calculations gave the three-dimensional (3D) location, orientation, and strength of the ECD in a spherical conductor model, which was based on each subject’s magnetic resonance imaging (MRI) to show the source’s location. The goodness-of-fit value of an ECD was calculated to indicate in percentage terms how much the dipole accounted for the measured field variance. A single dipole can explain activity from several dipolar sources with a goodness of fit of 80%.
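The goodness-of-fit value can be illustrated with a short sketch. The text does not spell out the formula, so the definition below (one minus the ratio of residual to measured signal power, in percent) is an assumption based on a convention common in dipole-fitting software; the function name and the field values are hypothetical.

```python
def goodness_of_fit(measured, predicted):
    """Percentage of the measured field variance explained by the dipole model.

    Assumed definition: g = 100 * (1 - sum((b - b_hat)^2) / sum(b^2)),
    where b are the measured and b_hat the model-predicted channel signals.
    """
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    total = sum(m ** 2 for m in measured)
    return 100.0 * (1.0 - residual / total)


# Hypothetical gradiometer readings (fT/cm) over a small channel subset.
measured = [10.0, -20.0, 30.0, -5.0]
predicted = [9.0, -18.0, 27.0, -4.5]  # hypothetical forward-model output

g = goodness_of_fit(measured, predicted)
accepted = g > 80.0  # acceptance criterion stated in the text
```

Under this definition a model that reproduces the data exactly scores 100%, and a model that predicts no field at all scores 0%.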
Only ECDs explaining more than 80% of the field variance for selected periods of time were used for further analysis. The period of analysis was extended to the entire time period, and all channels were taken into account when computing a time-varying multi-dipole model. The strength of the previously found ECDs was allowed to change, while locations and orientations were kept fixed. The data acquisition and analysis followed Hämäläinen, Hari, Ilmoniemi, Knuutila, and Lounasmaa (1993). MRI scans were obtained from all subjects with a 3.0-T MRI system (Allegra; Siemens, Erlangen, Germany). T1-weighted coronal, axial, and sagittal image slices obtained every 1.5 mm were used to render the 3D reconstruction
of the brain’s surface. Prior to the recording, a current was fed to four head position indicator (HPI) coils placed at known sites to obtain the exact location of the head with respect to the sensor, and the resulting magnetic fields were measured with the magnetometer, which allowed for aligning the individual head coordinate systems with the magnetometer coordinate system. The four HPI coils attached to the subject’s head were measured with respect to the three anatomical landmarks using a 3D digitizer to allow alignment of the MEG and MRI. The x-axis was fixed with the preauricular points, the positive direction being to the right. The positive y-axis passed through the nasion and the z-axis thus pointed upward. The peak latency and peak amplitude of each cortical activity were subjected to a three-way repeated-measures analysis of variance (ANOVA) (event × hemisphere × ISI) between ON and OFF and between ON and CHANGE. The main focus of this study was ON-N1m. Therefore, we compared ON-N1m and OFF-N1m, and ON-N1m and CH-N1m separately. The Greenhouse-Geisser epsilon was used to correct the degrees of freedom.
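As a rough illustration of the Greenhouse-Geisser correction, the sketch below estimates epsilon from the covariance matrix of the repeated measurements and scales the degrees of freedom by it. It is a minimal sketch of the standard textbook estimator for a single within-subject factor, not the statistics package the authors used; the function names and the data are hypothetical.

```python
def greenhouse_geisser_epsilon(data):
    """Estimate epsilon for one within-subject factor.

    data: one row per subject, one column per level of the factor.
    """
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    # Sample covariance matrix of the k levels.
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Double-centre the covariance matrix (it is symmetric, so row means suffice).
    row_means = [sum(cov[i]) / k for i in range(k)]
    grand = sum(row_means) / k
    centred = [[cov[i][j] - row_means[i] - row_means[j] + grand
                for j in range(k)] for i in range(k)]
    trace = sum(centred[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centred for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


def corrected_df(k, n, epsilon):
    """Scale both degrees of freedom of the within-subject F test by epsilon."""
    return epsilon * (k - 1), epsilon * (k - 1) * (n - 1)


# Hypothetical amplitudes (nAm) for five subjects at three ISI levels.
amplitudes = [[12.0, 20.0, 31.0],
              [15.0, 28.0, 35.0],
              [9.0, 18.0, 40.0],
              [20.0, 30.0, 38.0],
              [11.0, 25.0, 29.0]]
eps = greenhouse_geisser_epsilon(amplitudes)
df1, df2 = corrected_df(k=3, n=5, epsilon=eps)
```

Epsilon lies between 1/(k − 1) and 1, and it equals 1 whenever a factor has only two levels, so the correction only matters for factors with three or more levels, such as ISI.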
The statistical significance of the source’s location was assessed by a discriminant analysis using x, y, and z coordinates as variables for each condition. The relationship of the amplitude of the ECD between ON and OFF across all subjects was assessed under three conditions (ISI: 1.5, 3, and 6 s) by determining a Pearson product-moment correlation coefficient, r. The relationship in amplitude between ON and CHANGE across ten subjects was assessed under four conditions (ISI: 0.5, 1.5, 3, and 6 s), again by determining a Pearson product-moment correlation coefficient, r.
Results

On- and Off-Responses
Figure 2A shows superimposed waveforms of on- and off-responses at an ISI of 6 s in a representative subject. Both the onset and offset of the tone elicited a magnetic component peaking at around 100 ms (ON-N1m and OFF-N1m) in the temporal area of each hemisphere. Clear N1m components were evoked in all the subjects in all four ISI conditions, except for the 0.5 s condition of OFF. The dipoles responsible for these responses were estimated to be located in the STG of both hemispheres. The location of the source did not differ significantly between ON and OFF (Table 1, Figure 2B). In a comparison between ON and OFF, a three-way ANOVA indicated the ISI (F(2,22) = 15.6, p < .01, ε = 0.87) to be a significant factor determining the peak latency of the STG activities. That is, the peak latency increased with an increase in ISI (Table 2). Although the peak latency was shorter for the off-response than the on-response, the difference only tended to be significant (event factor, F(1,11) = 4.6, p = .056). As for peak amplitude, results of the ANOVA indicated event (F(1,11) = 127.3, p < .001, ε = 1), hemisphere (F(1,11) = 7.0, p < .05, ε = 1), and ISI (F(2,22) = 76.4, p < .001, ε = 0.87) to be significant factors (Table 2). The overall peak amplitude was greater for ON than OFF, greater for Rt-STG than Lt-STG, and greater for the longer ISI condition.

Figure 2. Magnetic responses to the onset (ON), offset (OFF), and frequency change (CHANGE) of a pure tone. Data from a representative subject. (A) The top view trace of all sensors. (B) Location of dipoles superimposed on the subject’s own MR images.
On- and Change-Response
Figure 2A shows superimposed waveforms of on- and change-responses at an ISI of 6 s in a representative subject. The abrupt change in tone frequency elicited a clear component (CH-N1m) similar to ON- and OFF-N1m in the temporal area bilaterally except for one subject. The dipoles responsible for CH-N1m were also estimated to be located in the STG bilaterally. The location of the source did not differ significantly between ON and CHANGE (Table 1, Figure 2B). A three-way ANOVA indicated the events (F(1,9) = 13.0, p < .01, ε = 1) and ISI (F(3,27) = 6.4, p < .01, ε = 0.654) to be significant factors determining the peak latency of the STG activity. That is, the peak latency of the change-response was longer than that of the on-response (Table 2). Like the on- and off-responses, the change-response in the STG increased in amplitude with an increase in ISI. As for the peak amplitude of the STG activity, results of the ANOVA indicated that the events (F(1,9) = 5.2, p < .05, ε = 1) and ISI (F(3,27) = 178.7, p < .001, ε = 0.556) were significant factors and hemisphere (F(1,9) = 4.5, p = .062, ε = 1) tended to be a significant factor. The peak amplitude of the activity was significantly greater for ON than CHANGE. Like the on- and off-responses, the amplitude of the change-response increased with an increase in ISI, as described in detail below.

Amplitude of the STG Activity as a Function of the Duration of the Steady State Preceding the Change
Figure 3A shows the effects of the ISI on the amplitude of the STG activity. The activity increased in amplitude as the ISI increased for all three events, suggesting that its amplitude was determined by the duration of the prior state, that is, the silence (ON) or the tone (OFF and CHANGE). As shown in Figure 3B, the amplitude was linearly correlated with the log of the duration of the steady state preceding the change for all three events. In individual subjects, the correlation coefficient, r, was 0.84–0.99, 0.58–0.99, and 0.90–0.98 for ON, OFF, and CHANGE, respectively.
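The log-linear relationship between amplitude and the duration of the preceding steady state can be illustrated with an ordinary least-squares fit. The sketch below uses the group-mean ON amplitudes of the right STG as read from Table 2, so it does not reproduce the individual-subject analysis behind the reported r ranges; the base-10 logarithm and the helper name `linear_fit` are choices made for the example.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Duration of the preceding silence (s) and group-mean ON amplitude (nAm),
# right STG, as read from Table 2.
isi_s = [0.5, 1.5, 3.0, 6.0]
on_rt_stg_nam = [17.0, 41.4, 63.2, 74.2]

log_isi = [math.log10(t) for t in isi_s]
slope, intercept, r = linear_fit(log_isi, on_rt_stg_nam)
print(f"slope = {slope:.1f} nAm per tenfold increase in ISI, r = {r:.3f}")
```

The slope is expressed per decade of ISI because of the base-10 logarithm; a natural logarithm would rescale the slope but leave r unchanged.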
Relationship of the STG Amplitude Between the On- and Off- or Change-Response Among the Subjects
When the relationship of the amplitude of the STG activity between ON and OFF was compared across subjects, the correlation coefficient, r, was 0.79 (p < .0001) (Figure 4A). The slope of the regression line (off/on) was 0.54. Likewise, r was 0.82 (p < .0001) for the relationship between ON and CHANGE with a slope (change/on) of 0.74 (Figure 4B).

Discussion
In the present study, we investigated the auditory N1m component to clarify its physiological significance to the detection of changes. Results showed that (i) a similar N1m was elicited by the on-, off-, and change-events, and the location and orientation of the source in the STG did not differ significantly among the three events, (ii) the amplitude of the activity as a function of the duration of the steady state preceding the change was similar among the three events, and (iii) there was a significant positive correlation in amplitude between ON and OFF, and between ON and CHANGE among subjects. Based on these findings, we consider that N1m is automatically elicited by any abrupt change, and that the N1m components elicited by various types of auditory events have a similar physiological significance relating to the detection of changes.

Similar STG Activity in On-, Off-, and Change-Events
Similar activity in the STG was elicited by the on-, off-, and change-events (Figure 2) in terms of location, orientation, and time course. Therefore, it is possible that the activity in on-, off-, and change-events was elicited by a similar group of neurons and has a similar function related to the detection of changes. In support of this, the source of the activity was estimated to be
Table 1. Locations of the Dipoles for the On- and Off-Responses Under the Three Conditions

ON
                         x              y              z
Lt-STG   ISI 0.5 s    −55 ± 6      11.1 ± 5.7     60.8 ± 4.6
         ISI 1.5 s    −53 ± 4      12.6 ± 5.6     58.3 ± 4.1
         ISI 3.0 s    −55 ± 4      14.3 ± 6.4     58.1 ± 4.7
         ISI 6.0 s    −54 ± 4      13.2 ± 6.6     58.4 ± 4.1
Rt-STG   ISI 0.5 s     53 ± 5      16.0 ± 4.7     56.6 ± 5.8
         ISI 1.5 s     53 ± 6      19.7 ± 5.1     56.4 ± 4.7
         ISI 3.0 s     51 ± 5      19.6 ± 4.5     56.2 ± 5.6
         ISI 6.0 s     50 ± 4      21.4 ± 4.7     56.9 ± 4.0

OFF
                         x              y              z
Lt-STG   ISI 1.5 s   −54.2 ± 6.8   15.1 ± 8.2     57.6 ± 5.5
         ISI 3.0 s   −57.1 ± 5.4   13.5 ± 6.8     60.0 ± 4.3
         ISI 6.0 s   −54.2 ± 3.3   13.4 ± 6.2     57.7 ± 5.1
Rt-STG   ISI 1.5 s    52.9 ± 5.4   18.5 ± 6.3     57.4 ± 5.0
         ISI 3.0 s    51.9 ± 5.9   18.3 ± 5.1     57.1 ± 5.4
         ISI 6.0 s    50.5 ± 5.1   19.0 ± 4.5     58.5 ± 4.3

CHANGE
                         x              y              z
Lt-STG   ISI 0.5 s   −57.5 ± 7.7   14.2 ± 5.0     56.7 ± 4.0
         ISI 1.5 s   −53.2 ± 5.1   18.5 ± 6.3     59.1 ± 3.4
         ISI 3.0 s   −55.1 ± .4    16.4 ± 8.2     57.6 ± 4.0
         ISI 6.0 s   −52.6 ± 4.9   14.6 ± 8.0     56.1 ± 4.5
Rt-STG   ISI 0.5 s    51.3 ± 4.9   20.0 ± 6.1     55.6 ± 3.6
         ISI 1.5 s    51.2 ± 4.8   23.5 ± 4.2     57.1 ± 3.4
         ISI 3.0 s    50.9 ± 5.2   22.2 ± 7.4     57.3 ± 2.7
         ISI 6.0 s    49.8 ± 4.7   20.7 ± 5.3     55.2 ± 4.0

Note: The x-axis was fixed with the preauricular points, the positive direction being to the right. The positive y-axis passed through the nasion, and the z-axis thus pointed upward. Rt-STG, right superior temporal gyrus; Lt-STG, left superior temporal gyrus.
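Table 2. The Peak Latency and Amplitude of Each Cortical Source Under the Four Conditions

| Event | Source | ISI | Latency (ms) | Amplitude (nAm) |
|---|---|---|---|---|
| ON (n = 12) | Lt-STG | 0.5 s | 90.5 ± 10.2 | 12.6 ± 7.2 |
| ON (n = 12) | Lt-STG | 1.5 s | 92.0 ± 7.9 | 35.1 ± 15.9 |
| ON (n = 12) | Lt-STG | 3.0 s | 93.5 ± 7.9 | 47.1 ± 17.5 |
| ON (n = 12) | Lt-STG | 6.0 s | 95.8 ± 6.5 | 59.0 ± 22.1 |
| ON (n = 12) | Rt-STG | 0.5 s | 89.3 ± 7.9 | 17.0 ± 9.1 |
| ON (n = 12) | Rt-STG | 1.5 s | 92.6 ± 9.2 | 41.4 ± 10 |
| ON (n = 12) | Rt-STG | 3.0 s | 94.8 ± 5.8 | 63.2 ± 14.1 |
| ON (n = 12) | Rt-STG | 6.0 s | 95.4 ± 6.2 | 74.2 ± 19.5 |
| OFF (n = 12) | Lt-STG | (1) | 83.3 ± 10.6 | 12.4 ± 8.0 |
| OFF (n = 12) | Lt-STG | (2) | 90.3 ± 10.8 | 15.4 ± 8.0 |
| OFF (n = 12) | Lt-STG | (3) | 91.9 ± 12.3 | 23.8 ± 10.5 |
| OFF (n = 12) | Rt-STG | (1) | 85.0 ± 8.8 | 14.3 ± 7.2 |
| OFF (n = 12) | Rt-STG | (2) | 84.7 ± 8.6 | 27.7 ± 12.7 |
| OFF (n = 12) | Rt-STG | (3) | 89.4 ± 9.4 | 41.6 ± 14.7 |
| CHANGE (n = 10) | Lt-STG | 0.5 s | 99.2 ± 15.5 | 10.6 ± 7.6 |
| CHANGE (n = 10) | Lt-STG | 1.5 s | 98.7 ± 10.3 | 30 ± 9.8 |
| CHANGE (n = 10) | Lt-STG | 3.0 s | 102.2 ± 10.4 | 41.9 ± 18.0 |
| CHANGE (n = 10) | Lt-STG | 6.0 s | 106 ± 7.2 | 49.9 ± 21.0 |
| CHANGE (n = 10) | Rt-STG | 0.5 s | 98 ± 14.8 | 16.8 ± 12.7 |
| CHANGE (n = 10) | Rt-STG | 1.5 s | 97 ± 13.2 | 42.6 ± 15.6 |
| CHANGE (n = 10) | Rt-STG | 3.0 s | 99.9 ± 9.1 | 59.7 ± 18.7 |
| CHANGE (n = 10) | Rt-STG | 6.0 s | 105.8 ± 11.8 | 63.1 ± 17.4 |

Note: Data are expressed as the mean ± SD. For OFF, three conditions are reported; they are numbered (1) to (3) in the order listed. Rt-STG, right superior temporal gyrus; Lt-STG, left superior temporal gyrus.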
within the change-sensitive area in an fMRI study (Downar, Crawley, Mikulis, & Davis, 2000). Downar et al. alternated continuous sounds of running water and croaking frogs and suggested that the primary and secondary auditory cortices are activated in response to changes in continuous auditory input. The idea that the N1 component is involved in the detection of changes is consistent with a report (Hari et al., 1987) that N1m seems to reflect cortical activities related to any abrupt change in the auditory environment. In fact, for the on-response, a similar N1m component was elicited by various types of sounds other than a pure tone, including a click (Joutsiniemi et al., 1989; Picton et al., 1974), a human voice (Hari & Lounasmaa, 1989), or an animal sound (Altmann et al., 2008).
However, the difference in the latency of the STG activity among the three events or among the four ISI conditions needs some discussion. First, the peak latency tended to be shorter for OFF than for ON. This phenomenon seems to be caused by the
physical difference in stimuli between ON and OFF. That is, the ON-STG activity involves at least two factors, the on-event and the physical features of the stimulus, while the OFF-STG activity involves only the off-event. The N1 component is known to consist of a number of different cerebral processes with a peak latency of between 50 and 150 ms (Näätänen & Picton, 1987). The ON-STG activity was considered to reflect a complex process including the frequency, intensity, and location of the sound, while the OFF-STG activity reflects a simpler process of the off-event. Therefore, the peak latency of the OFF-STG activity may be shorter than that of the ON-STG activity because it lacks the latter part of the ON-STG activity, which relates to processing of the sound features. Second, the latency of the CH-STG activity was significantly longer than that of the ON-STG activity, perhaps due to the period necessary for processing the change in frequency from 1000 Hz to 1100 Hz. Given that the CH-STG activity in this
Figure 3. Effects of the interstimulus interval (ISI) on the amplitude of the evoked response. (A) Superimposed waveforms of the activity in the right superior temporal gyrus (Rt-STG) for all the subjects (gray) and respective grand-averaged waveforms (black). (B) The mean amplitude of the Rt-STG activity plotted against the ISI. [Panel labels: in (A), ISI columns of 0.5, 1.5, 3.0, and 6.0 s and rows ON, CHANGE, and OFF, with a 40 nAm scale bar and a time axis from 0 to 300; in (B), separate ON, CHANGE, and OFF plots with the ISI (s) on the x-axis.]
Figure 4. Positive correlation of the amplitude of the STG activity between on- and off-responses (A) and between on- and change-responses (B). A correlation coefficient, r, and p values are indicated. [Panel labels: (A) OFF amplitude (nAm) against ON amplitude (nAm), r = 0.79, p < 0.0001; (B) CH amplitude (nAm) against ON amplitude (nAm), r = 0.82, p < 0.0001.]
study mainly reflects the response to the change event, it should appear after the comparison of the frequency of the present (1100 Hz) and previous (1000 Hz) sounds is completed. This phenomenon is compatible with findings in studies on mismatch negativity (MMN). The latency of MMN is dependent on the magnitude of the change in stimulus (for a review, see Näätänen, Jacobsen, & Winkler, 2005). For example, MMN peaked at about 200 ms with a very small change in frequency and at 100–150 ms with larger changes (Tiitinen, May, Reinikainen, & Näätänen, 1994). Since the change in frequency in the present study (10%) was relatively large, the peak latency of the CH-STG activity (100–106 ms) seems to be a reasonable value. Third, the latency of the STG activity increased with an increase in ISI. Since the peak amplitude also increased as the ISI increased, this finding is probably due to the fact that larger numbers of neurons were recruited as the ISI increased and, therefore, a longer time was necessary to reach the maximum amplitude.
The STG Activity Relating to Detection of Changes
We would like to discuss briefly the ongoing debate regarding whether the onset and offset vs. change responses reflect different underlying mechanisms. MMN is interpreted as a memory-based automatic response to any distinguishable changes (e.g., frequency, duration, sound pressure, sound localization, and even omission) in regular auditory inputs, which usually peaks 100–200 ms from the change onset and is elicited even in the absence of attention. The prevailing hypothesis is that the infrequent stimulus triggers more of N1 and that it does (see, e.g., Näätänen et al., 2005) or does not (see, e.g., May & Tiitinen, 2009) elicit additional MMN. In the former hypothesis, the mechanisms for the detection of changes involve a dissociation of comparator- and non-comparator-based processes (Schröger, 1997, 2007).
In fact, there is evidence that these two mechanisms involve different areas in the brain (Opitz, Schröger, & von Cramon, 2005) and therefore the STG activity may reflect non-comparator-based change detection for abrupt event changes in the present study. However, a unique finding here was the similar behavior of the ON-, OFF-, and CH-STG activities against the ISI; that is, the amplitude of STG activity was dependent on the duration of the steady state preceding the change occurrence. These results imply the possibility that the STG activity is an indicator of change detection related to sensory memory. On the other hand, there has been a report that the change response is due to differential adaptation of N1 subcomponents. Jääskeläinen et al. (2004) suggested that MMN is generated as a result of the differential adaptation of anterior and posterior components of N1, that is, the posterior N1 component is rapidly
suppressed with decreasing sound novelty, whereas the anterior N1 component is less affected. Several researchers using MEG have supported the idea that N1 comprises an anterior and a posterior component with different time courses and possibly different significance (Loveless, Levanen, Jousmaki, Sams, & Hari, 1996; Lu, Williamson, & Kaufman, 1992b; Sams et al., 1993). The present results showed that the amplitude of the OFF-STG activity was dependent on the duration of the steady state preceding the change, as was that of the ON- and CH-STG activity. Therefore, the mechanism generating this STG activity also seems to be attributable to a group of change-sensitive neurons. Taking the present results and previous findings into consideration, we consider that any kind of auditory stimulus change, including the onset and offset, activates STG neurons, and this activity could be involved in both adaptation and memory mechanisms.
Provided that the STG activity reflects memory of an auditory stimulus and, therefore, its amplitude at various ISI conditions reflects how much auditory memory for the previous stimulus decays, one can expect a correlation between the amplitude of the STG activity and a behavioral index of the detection of changes. In fact, in a study using N1m and a two-alternative forced choice task, Lu et al. (1992a) found that the recovery of the N1m response was related to the decay of an auditory sensory memory trace that underlies perception and provides a basis for voluntary discrimination. On the other hand, Cowan (1995) argued that in Lu et al.’s study, attention focused on the stimuli could have affected both the discrimination performance and the amplitude of N1 at long ISIs. Cowan suggested that the correlation between N1 and behavior might not be the same as what is commonly referred to as sensory memory. Näätänen and Winkler (1999) also showed evidence that N1 is part of the pre-representational stage of acoustic processing.
In the present study, the behavior of the ON-, OFF-, and CH-STG activities as a function of ISI was very similar under a condition where subjects ignored the stimuli, implying the possibility that the recovery of the STG activities was more or less related to the decay of sensory memory. It seems difficult to record both the behavioral data and the STG activity without attention effects. To solve this problem, a new experimental paradigm will be necessary.
The Dominance of the Right Hemisphere
The present results showed that the peak amplitude of the Rt-STG activity was larger than that of the Lt-STG activity in all events and conditions (Table 2). A previous MEG-based study (Zouridakis, Simos, & Papanicolaou, 1998) also showed asymmetry in the distribution of N1m sources between the two hemispheres, with those in the right hemisphere covering more of the anterior-posterior and medial-lateral axes, suggesting that the surface of the temporal lobe involved in the generation of the N1m component is larger in the right hemisphere. This finding appears consistent with the present results. Deouell, Bentin, and Soroker (2000) showed that the MMN elicited by deviance occurring in the left ear of a patient with damage to the right hemisphere was reduced relative to that elicited by deviance occurring in the right ear. They suggested that a pre-attentive deficit contributed to neglect and that this deficit may be related to a failure to link perceived sensory events to the attentional system. In addition, unilateral neglect is a frequent symptom of damage to the right hemisphere, and is much less frequent in patients with damage in the left hemisphere (for a review, see Mesulam, 1990). Therefore, the right hemisphere may play an important role in the automatic process of detecting changes or the cognition of objects. These findings are consistent with the dominant Rt-STG activity for change-events. Interestingly, Downar et al. (2000, 2002) showed a right-lateralized multimodal network that responds to change or salience, including the anterior insula, inferior frontal gyrus, and temporoparietal junction. They suggested that this network serves in mediating attention to, and awareness of, salience in the sensory environment.
Amplitude of the STG Activity as a Function of the Duration of the Steady State Preceding the Change
As shown in Figure 3, the amplitude of the STG activity was dependent on the duration of the steady state preceding the change in all events (on, off, and change). Previous studies revealed that the amplitude of N1 or N1m (Davis, Mast, Yoshie, & Zerlin, 1966; Fruhstorfer, Soveri, & Jarvilehto, 1970; Hari et al., 1982; Sams et al., 1993) was dependent on the ISI.
The amplitude of OFF-N1 or OFF-N1m also increased with an increase in the interval preceding a sound (Hari et al., 1987; Pfefferbaum et al., 1971). Moreover, Nelson and Lassman (1968) found that the amplitude of N1-P2 was a linear function of the log10 ISI. The present results were consistent with these previous findings. The present study also showed that the amplitude of the STG activity was linearly
correlated with the log of the steady state duration preceding the change (Figure 3B) for the on, off, and change events. These effects of ISI have been interpreted as habituation or refractoriness, which reflects a psychologically relevant process or a more basic neurophysiological process (Budd, Barry, Gordon, Rennie, & Michie, 1998; Ritter, Vaughan, & Costa, 1968; Rosburg, 2004; Rosburg et al., 2006; Sokolov, 1963). Budd et al. (1998) examined whether the reduction in N1 with the repetition of auditory stimuli results from the process of habituation or from the refractory period of the neural elements using repetition (seven trains) in three ISI conditions (1, 3, and 10 s). Although a significant decrease in N1 with repetition was observed under the 1 and 3 s conditions, this decline was complete by the second stimulus in the short trains. They considered that the result was consistent with the view that the weakening of N1 reflects a refractory period (Ritter et al., 1968). However, the theory of refractoriness does not explain the off-response in the present study. If the decrease in N1 is due to refractoriness, the OFF-STG activity should be independent of the ISI. In the current study, the OFF-STG activity increased in amplitude with an increase in ISI. Nor does the refractoriness theory explain the delay in latency of the CH-STG activity relative to the ON-STG activity. If the CH-STG activity originates from neurons sensitive to 1100 Hz and the decrease in its amplitude with shorter ISIs is due to refractoriness, the latency of the activity should be the same between ON-STG and CH-STG. Although the present study did not provide direct evidence that the ISI-dependent behavior of the on-response does not come from refractoriness, we consider that the N1 component of all the events reflects the activity of a group of neurons sensitive to new events.
This notion is consistent with the proposal by Sokolov (1963) that an orienting response is generated whenever a mismatch occurs between the established neuronal model and a new stimulus. The positive correlation of the amplitude of the STG activity among the three events (Figure 4) supports this idea. This finding indicates that a subject with an On-N1m of large amplitude also has a large amplitude OFF-N1m or CH-N1m, suggesting a common physiological significance or generating mechanism among them.
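The log-linear relationship between amplitude and steady-state duration discussed above can be checked against the group means reported in Table 2. The Python sketch below correlates the mean Rt-STG amplitudes for ON and CHANGE with log10 of the ISI. The function name `pearson_r` and the dictionary layout are illustrative assumptions; because the sketch uses group means only, its output need not match the R values annotated in Figure 3B, whose exact computation is not described here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# ISI conditions (s) and mean Rt-STG peak amplitudes (nAm) taken from Table 2.
isi_s = [0.5, 1.5, 3.0, 6.0]
rt_stg_mean_nam = {
    "ON": [17.0, 41.4, 63.2, 74.2],
    "CHANGE": [16.8, 42.6, 59.7, 63.1],
}

# Correlate amplitude with the logarithm of the steady-state duration.
log_isi = [math.log10(t) for t in isi_s]
r_by_event = {
    event: pearson_r(log_isi, amps) for event, amps in rt_stg_mean_nam.items()
}

for event, r in r_by_event.items():
    print(f"{event}: r(log10 ISI, mean amplitude) = {r:.3f}")
```

The choice of logarithm base does not affect the correlation, so log10 is used only to mirror the wording of Nelson and Lassman (1968).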
REFERENCES
Altmann, C. F., Nakata, H., Noguchi, Y., Inui, K., Hoshiyama, M., Kaneoke, Y., & Kakigi, R. (2008). Temporal dynamics of adaptation to natural sounds in the human auditory cortex. Cerebral Cortex, 18, 1350–1360.
Budd, T. W., Barry, R. J., Gordon, E., Rennie, C., & Michie, P. T. (1998). Decrement of the N1 auditory event-related potential with stimulus repetition: Habituation vs. refractoriness. International Journal of Psychophysiology, 31, 51–68.
Cowan, N. (1995). Attention and memory. An integrated framework. Oxford, England: Oxford University Press.
Davis, H., Mast, T., Yoshie, N., & Zerlin, S. (1966). The slow response of the human cortex to auditory stimuli: Recovery process. Electroencephalography and Clinical Neurophysiology, 21, 105–113.
Deouell, L. Y., Bentin, S., & Soroker, N. (2000). Electrophysiological evidence for an early (pre-attentive) information processing deficit in patients with right hemisphere damage and unilateral neglect. Brain, 123(Pt 2), 353–365.
Downar, J., Crawley, A. P., Mikulis, D. J., & Davis, K. D. (2000). A multimodal cortical network for the detection of changes in the sensory environment. Nature Neuroscience, 3, 277–283.
Downar, J., Crawley, A. P., Mikulis, D. J., & Davis, K. D. (2002). A cortical network sensitive to stimulus salience in a neutral behavioral context across multiple sensory modalities. Journal of Neurophysiology, 87, 615–620.
Forss, N., & Jousmaki, V. (1998). Sensorimotor integration in human primary and secondary somatosensory cortices. Brain Research, 781, 259–267.
Fruhstorfer, H., Soveri, P., & Jarvilehto, T. (1970). Short-term habituation of the auditory evoked response in man. Electroencephalography and Clinical Neurophysiology, 28, 153–161.
Hämäläinen, M., Hari, R., Ilmoniemi, R. J., Knuutila, J., & Lounasmaa, O. V. (1993). Magnetoencephalography–theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65, 413.
Hari, R., Kaila, K., Katila, T., Tuomisto, T., & Varpula, T. (1982). Interstimulus interval dependence of the auditory vertex response and its magnetic counterpart: Implications for their neural generation. Electroencephalography and Clinical Neurophysiology, 54, 561–569.
Hari, R., & Lounasmaa, O. V. (1989). Recording and interpretation of cerebral magnetic fields. Science, 244, 432–436.
Hari, R., Pelizzone, M., Makela, J. P., Hallstrom, J., Leinonen, L., & Lounasmaa, O. V. (1987). Neuromagnetic responses of the human auditory cortex to on- and offsets of noise bursts. Audiology, 26, 31–43.
Hillyard, S. A., & Picton, T. W. (1978). On and off components in the auditory evoked potential. Perception and Psychophysics, 24, 391–398.
Howard, M. A., Volkov, I. O., Mirsky, R., Garell, P. C., Noh, M. D., Granner, M., et al. (2000). Auditory cortex on the human posterior superior temporal gyrus. Journal of Comparative Neurology, 416, 79–92.
Jääskeläinen, I. P., Ahveninen, J., Bonmassar, G., Dale, A. M., Ilmoniemi, R. J., Levanen, S., et al. (2004). Human posterior auditory cortex gates novel sounds to consciousness. Proceedings of the National Academy of Sciences U S A, 101, 6809–6814.
Joutsiniemi, S. L., Hari, R., & Vilkman, V. (1989). Cerebral magnetic responses to noise bursts and pauses of different durations. Audiology, 28, 325–333.
Lavikainen, J., Huotilainen, M., Ilmoniemi, R. J., Simola, J. T., & Näätänen, R. (1995). Pitch change of a continuous tone activates two distinct processes in human auditory cortex: A study with whole-head magnetometer. Electroencephalography and Clinical Neurophysiology, 96, 93–96.
Loveless, N., Levanen, S., Jousmaki, V., Sams, M., & Hari, R. (1996). Temporal integration in auditory sensory memory: Neuromagnetic evidence. Electroencephalography and Clinical Neurophysiology, 100, 220–228.
Lu, Z. L., Williamson, S. J., & Kaufman, L. (1992a). Behavioral lifetime of human auditory sensory memory predicted by physiological measures. Science, 258, 1668–1670.
Lu, Z. L., Williamson, S. J., & Kaufman, L. (1992b). Human auditory primary and association cortex have differing lifetimes for activation traces. Brain Research, 572, 236–241.
May, P. J., & Tiitinen, H. (2009). Mismatch negativity (MMN), the deviance-elicited auditory deflection, explained. Psychophysiology, 46, 1–57.
Mesulam, M. M. (1990). Large-scale neurocognitive networks and distributed processing for attention, language, and memory. Annals of Neurology, 28, 597–613.
Näätänen, R., Jacobsen, T., & Winkler, I. (2005). Memory-based or afferent processes in mismatch negativity (MMN): A review of the evidence. Psychophysiology, 42, 25–32.
Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in cognitive neuroscience. Psychological Bulletin, 125, 826–859.
Näätänen, R., & Picton, T. (1987). The N1 wave of the human electric and magnetic response to sound: A review and an analysis of the component structure. Psychophysiology, 24, 375–425.
Nakata, H., Inui, K., Wasaka, T., Akatsuka, K., & Kakigi, R. (2005). Somato-motor inhibitory processing in humans: A study with MEG and ERP. European Journal of Neuroscience, 22, 1784–1792.
Nelson, D. A., & Lassman, F. M. (1968). Effects of intersignal interval on the human auditory evoked response. Journal of the Acoustical Society of America, 44, 1529–1532.
Noda, K., Tonoike, M., Doi, K., Koizuka, I., Yamaguchi, M., Seo, R., et al. (1998). Auditory evoked off-response: Its source distribution is different from that of on-response. NeuroReport, 9, 2621–2625.
Opitz, B., Schröger, E., & von Cramon, D. Y. (2005). Sensory and cognitive mechanisms for preattentive change detection in auditory cortex. European Journal of Neuroscience, 21, 531–535.
Pantev, C., Eulitz, C., Hampson, S., Ross, B., & Roberts, L. E. (1996). The auditory evoked “off” response: Sources and comparison with the “on” and the “sustained” responses. Ear and Hearing, 17, 255–265.
Pfefferbaum, A., Buchsbaum, M., & Gips, J. (1971). Enhancement of the average evoked response to tone onset and cessation. Psychophysiology, 8, 332–339.
Picton, T. W., Hillyard, S. A., Krausz, H. I., & Galambos, R. (1974). Human auditory evoked potentials. I. Evaluation of components. Electroencephalography and Clinical Neurophysiology, 36, 179–190.
Ritter, W., Vaughan, H. G. Jr., & Costa, L. D. (1968). Orienting and habituation to auditory stimuli: A study of short term changes in average evoked responses. Electroencephalography and Clinical Neurophysiology, 25, 550–556.
Rosburg, T. (2004). Effects of tone repetition on auditory evoked neuromagnetic fields. Clinical Neurophysiology, 115, 898–905.
Rosburg, T., Trautner, P., Boutros, N. N., Korzyukov, O. A., Schaller, C., Elger, C. E., & Kurthen, M. (2006). Habituation of auditory evoked potentials in intracranial and extracranial recordings. Psychophysiology, 43, 137–144.
Sams, M., Hari, R., Rif, J., & Knuutila, J. (1993). The human auditory sensory memory trace persists about 10 sec: Neuromagnetic evidence. Journal of Cognitive Neuroscience, 53, 363–370.
Schröger, E. (1997). On the detection of auditory deviations: A pre-attentive activation model. Psychophysiology, 34, 245–257.
Schröger, E. (2007). Mismatch negativity: A microphone into auditory memory. Journal of Psychophysiology, 21, 138–146.
Sokolov, E. N. (1963). Higher nervous functions: The orienting reflex. Annual Review of Physiology, 25, 545–580.
Tanaka, E., Inui, K., Kida, T., Miyazaki, T., Takeshima, Y., & Kakigi, R. (2008). A transition from unimodal to multimodal activations in four sensory modalities in humans: An electrophysiological study. BMC Neuroscience, 9, 116.
Tanaka, E., Kida, T., Inui, K., & Kakigi, R. (2009). Change-driven cortical activation in multisensory environments: An MEG study. NeuroImage, 48, 464–474.
Tiitinen, H., May, P., Reinikainen, K., & Näätänen, R. (1994). Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature, 372, 90–92.
Wasaka, T., Nakata, H., Akatsuka, K., Kida, T., Inui, K., & Kakigi, R. (2005). Differential modulation in human primary and secondary somatosensory cortices during the preparatory period of self-initiated finger movement. European Journal of Neuroscience, 22, 1239–1247.
Yabe, H., Asai, R., Hiruma, T., Sutoh, T., Koyama, S., Kakigi, R., et al. (2004). Sound perception affected by nonlinear variation of accuracy in memory trace. NeuroReport, 15, 2813–2817.
Yabe, H., Matsuoka, T., Sato, Y., Hiruma, T., Sutoh, T., Koyama, S., et al. (2005). Time may be compressed in sound representation as replicated in sensory memory. NeuroReport, 16, 95–98.
Yamashiro, K., Inui, K., Otsuru, N., Kida, T., & Kakigi, R. (2009). Automatic auditory off-response in humans: An MEG study. European Journal of Neuroscience, 30, 125–131.
Zouridakis, G., Simos, P. G., & Papanicolaou, A. C. (1998). Multiple bilaterally asymmetric cortical sources account for the auditory N1m component. Brain Topography, 10, 183–189.
(Received October 28, 2009; Accepted January 18, 2010)
Psychophysiology, 48 (2011), 31–43. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01037.x
Effects of concurrent working memory load on distractor and conflict processing in a name-face Stroop task
ELLEN M. M. JONGEN and LISA M. JONKMAN Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, The Netherlands
Abstract
To examine the time course of effects of working memory (WM) load on interference control, ERPs were measured in a combined WM and Stroop task. A WM load of 0, 2, or 4 letters was imposed, and during the maintenance interval Stroop trials were presented that required participants to classify names of famous people while ignoring faces that were either congruent or incongruent with the names. Behavioral interference was not modulated by WM load, but WM load led to an overall reduction of Stroop stimulus encoding as reflected by reduced N170 and N250 amplitudes independent of congruency. Incongruent distractor faces induced interference as shown by a delayed and reduced positivity between 480 and 600 ms (N450) and an enhanced positivity between 760 and 1000 ms (P600), indicating longer stimulus evaluation, conflict detection, and conflict resolution, respectively. WM load led to an increase of the P600 at frontal and parietal sites, possibly reflecting PFC-driven top-down control of posterior sites, necessary for conflict resolution.
Descriptors: Stroop interference, WM load, ERPs
We would like to thank Judith Peters and Valerie Goffaux for comments on an earlier version of this manuscript. Furthermore, we thank Ron Hellenbrand and Erik Bongaerts for technical assistance and Lies Vos for her help in data collection.
Address correspondence to: Ellen M. M. Jongen, Department of Work and Social Psychology, Faculty of Psychology and Neuroscience, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands. E-mail: [email protected]
Attentional mechanisms are important to prioritize and select relevant information in the face of distracting information. Interference from distracting information occurs when inhibition of it fails. Recent studies have shown that working memory (WM) capacity is an important predictor of performance in tasks that place high demands on selective attention, such as flanker or Stroop paradigms (Heitz & Engle, 2007; Kane & Engle, 2003; Redick & Engle, 2006). These findings support WM models that have defined WM capacity as an executive construct (Baddeley, 1993; Conway, Cowan, & Bunting, 2001; Kane, Conway, Bleckley, & Engle, 2001; Kane & Engle, 2003), responsible for active maintenance of goal-relevant information in the face of concurrent processing, interference, and conflict. Further evidence for the interrelatedness of WM and attention comes from different disciplines, such as studies showing overlap in neural substrates (e.g., Corbetta & Shulman, 2002; LaBar, Gitelman, Parrish, & Mesulam, 1999; Mayer, Bittner, Nikolic, Bledowski, Goebel, & Linden, 2007; McNab, Leroux, Strand, Thorell, Bergman, & Klingberg, 2008; Pessoa & Ungerleider, 2004; Pollmann & von Cramon, 2000), and studies showing behavioral and neurobiological evidence for enhanced distractor processing when WM resources were reduced by experimental manipulations of WM load (for a review, see Lavie, 2005).
For example, in the fMRI study by De Fockert, Rees, Frith, and Lavie (2001), a face-name Stroop task was conducted in the maintenance interval of a concurrent WM task while WM load was either low or high. In the Stroop task, written names of famous politicians and pop stars were superimposed on pictures of faces from the same set of people. Participants were asked to categorize names as politicians or pop stars while ignoring distractor faces that were either identity-congruent (e.g., the name and face of “Bill Clinton,” the politician), or category-incongruent (e.g., the name “Bill Clinton” as a politician superimposed on the face of “Elvis Presley,” the singer). Distractor interference, representing the delay in reaction time to classify names superimposed on incongruent faces relative to congruent faces, was larger during high (73 ms) than low (32 ms) concurrent WM load. Furthermore, higher WM load was related to enhanced activity in frontal WM-related brain areas and in face-processing areas. The latter was suggested to indicate enhanced distractor processing. The authors concluded that high WM load had consumed the resources necessary for interference control in the Stroop task, leading to enhanced distractor processing in face-processing areas and larger interference.
This conclusion is in line with Lavie’s WM load theory of selective attention (Lavie, 1995; Lavie, Hirst, De Fockert, & Viding, 2004). According to Load theory, an active top-down mechanism of attentional control mediated by prefrontal cortical areas depends on WM and plays an important role in the maintenance of goal-directed behavior in the presence of interference (Lavie et al., 2004). When WM is loaded, distractor interference in a Stroop task is suggested to increase because resources necessary for goal maintenance are consumed by concurrent WM processes. Evidence for this theory has been shown, mainly by Lavie and coworkers, in a number of studies (for a review, see Lavie, 2005).
In addition to evidence in line with Load theory (De Fockert et al., 2001; Lavie & De Fockert, 2005; Pecchinenda & Heil, 2007), there are also behavioral studies that did not replicate WM load effects on interference control when using other paradigms (Kim, Kim, & Chun, 2005; Park, Chun, & Kim, 2007; Woodman, Vogel, & Luck, 2001). Furthermore, there is still a lack of clarity about the brain mechanisms involved in such interactions of WM and interference control. Whereas De Fockert et al. (2001) showed higher frontal and face-processing activation during Stroop processing under high as compared to low WM load, these effects were based on a condition comparison of face-absent and face-present Stroop stimuli. Effects of WM load on brain activation related to distractor interference processing can, however, only be investigated directly by comparing name-face congruent and name-face incongruent stimuli. To the best of our knowledge, there are no studies that directly investigated brain mechanisms involved in WM-load effects on distractor interference processing by comparing brain activity in congruent and incongruent Stroop conditions, in which target and distractors are presented simultaneously. Finally, due to poor temporal resolution, fMRI studies have not provided information about when in time processes of Stroop interference are affected by processes of WM. An increase in WM load and the resulting reduction in cognitive control may affect early processes of stimulus encoding and recognition, or later processes of stimulus identification and response planning.
The aim of the present study is to gain more insight into the brain mechanisms involved in the effects of WM on interference control that have been reported in the behavioral literature as reviewed above. More specifically, the temporal locus of the effects of WM on interference control will be examined using event-related potential (ERP) measures. To this aim, a manual name-face Stroop task similar to that used in other studies that have reported reliable distractor interference effects will be used (De Fockert et al., 2001; Egner & Hirsch, 2005; Pecchinenda & Heil, 2007). The advantage of face distractors is that they are hard to ignore compared to other types of distractors (e.g., Lavie et al., 2003). Furthermore, specific face-processing components in the ERP literature have been related to the different stages of face processing, enabling the investigation of load effects on different stages of face-distractor processing. To investigate effects of WM load, Stroop stimuli will be presented in the maintenance interval of a concurrent WM task, and WM load will be manipulated parametrically by asking participants to memorize no letters (0-load condition), 2 letters (2-load condition), or 4 letters (4-load condition). Some hypotheses can be derived from the ERP face-processing literature and the ERP Stroop literature and will be outlined below.
The time course of name-face interference. In the name-face Stroop task, interference only occurs when the identity of distractor face stimuli interferes with the superimposed target name stimuli that participants are asked to classify. Therefore, distractor interference in the ERP signal is hypothesized to occur when or after face identity is processed. For face recognition to take place, individual face features and their spatial relation are analyzed first during an encoding stage before faces are recognized, with recognition occurring in separate subsequent stages: whereas visually derived semantic information such as gender becomes available regardless of face familiarity, semantic information such as occupation is retrieved in the next processing stage (Bruce & Young, 1986). In the ERP, the earliest component associated with face processing is the N170, a negative component with a latency of about 170 ms that is elicited at lateral posterior temporal sites (e.g., Allison, Ginter, McCarthy, Nobre, Puce, et al., 1994; Bentin, Allison, Puce, Perez, & McCarthy, 1996; Bötzel, Schulze, & Stodieck, 1995). As it is elicited by the perception of a face regardless of face familiarity or task relevance (Bentin & Deouell, 2000; Eimer, 2000a, b; Rossion, Campanella, Gomez, Delinte, Debatisse, et al., 1999; Tanaka, Curran, Porterfield, & Collins, 2006), it is assumed to reflect processes of structural encoding that occur before face identification (Bentin et al., 1996; Itier & Taylor, 2004; Latinus & Taylor, 2006). The earliest face-processing ERP component related to face recognition occurs at temporal-occipital sites around 250 ms after stimulus onset (Pfütze, Sommer, & Schweinberger, 2002; Schweinberger, Pfütze, & Sommer, 1995). This N250 component is associated with face recognition because of its sensitivity to manipulations of face familiarity (Begleiter, Porjesz, & Wang, 1995; Herzmann, Schweinberger, Sommer, & Jentzsch, 2004; Itier & Taylor, 2004; Paller, Ranganath, Gonsalves, LaBar, Parrish, et al., 2003; Schweinberger et al., 1995; Tanaka et al., 2006). ERP differences occurring after 400 ms have been associated with the final stage of face recognition and identification that requires the retrieval of face-associated information from semantic memory, such as in our task (Bentin & Deouell, 2000; Eimer, 2000a, b; Paller, Gonsalves, Grabowecky, Bozic, & Yamada, 2000).
Recognition-related activity at this stage generally has a broader scalp distribution (Boehm & Sommer, 2005; Paller, Bozic, Ranganath, Grabowecky, & Yamada, 1999; Paller et al., 2000, 2003) that is suggested to result from the rapid interactions between a broad network of frontal and temporal cortical areas linked to each other directly and via hippocampal networks (Paller et al., 2003). Based on the fact that interference in the name-face Stroop task used in the current study can only occur after a face has been identified as a pop star or politician, distractor interference effects are expected in the time range of the broadly distributed ERP components that start about 400 to 600 ms after face onset and not on the early occipital N170 and N250 components. This time range for name-face interference is also in line with the color-word Stroop ERP literature that consistently describes two interference components: the ‘‘N450’’ and the ‘‘P600’’ (e.g., Lansbergen, van Hell, & Kenemans, 2007; Liotti, Woldorff, Perez III, & Mayberg, 2000; Markela-Lerenc, Ille, Kaiser, Fiedler, Mundt, & Weisbrod, 2004; Qiu, Luo, Wang, Zhang, & Zhang, 2006; West, 2003; West, Bowry, & McConville, 2004). The N450 represents a broadly distributed reduced positive component in the incongruent condition relative to the congruent condition between 350–500 ms and has been associated with the process of conflict monitoring. The P600 represents a broadly distributed enhanced positive component in the incongruent condition relative to the congruent condition starting around 600 ms and has been attributed to conflict resolution. The time course of effects of WM on name-face interference. According to load theory, interference from incongruent distractor faces will increase when concurrent WM load increases
Working memory load effects on Stroop interference
due to depletion of resources that are necessary for goal maintenance in the Stroop task (Lavie et al., 2004). With respect to the time course of these effects, interference from incongruent distractor faces is expected to occur on the N450 and P600 Stroop interference ERP components, and increases are expected at the same latency. Since the N170 and N250 components represent early face encoding stages that are thought to originate from secondary visual areas (Schweinberger, Huddy, & Burton, 2004; Schweinberger, Pickering, Jentzsch, Burton, & Kaufman, 2002a) and precede the stage of face identification, no interference effects or interaction of WM and interference are expected around this time of stimulus processing. Still, an increase in WM load may have a main effect on early face encoding stages (N170 and N250) in the Stroop task, independent of whether faces are congruent or incongruent. As mentioned above, using a similar paradigm, De Fockert et al. (2001) reported enhanced fMRI activation in face-processing areas when concurrent WM load was imposed. This was, however, based on a comparison of face-absent (word only) and face-present (word superimposed on face) stimuli. Whereas ERP studies have also provided evidence for top-down influences of WM on the latency or amplitude of the N170 and N250 components (e.g., Gazzaley, Cooney, McEvoy, Knight, & D’Esposito, 2005; Morgan, Klein, Boehm, Shapiro, & Linden, 2008; Sreenivasan & Jha, 2007), these studies did not investigate effects of parametric WM-load manipulations and/or did not present target and distractor stimuli simultaneously, creating semantic conflict, such as in the name-face Stroop task. Based on Load theory and results by De Fockert et al. (2001), it can be predicted that, with an increase of WM load, there will be fewer resources available for target maintenance, leaving more room for distractor (face) processing.
This might be reflected by enhanced amplitudes of the face-sensitive N170 and N250 components and reductions in N170 latency in lateral-occipital cortex in response to distracting face stimuli, independent of face-name congruency.
Method
Participants
Thirty volunteers (age 18.4–33.3, mean age 22.2, 23 female), all students from Maastricht University, participated in the study. All gave informed consent and received course credits or were paid (€22.50) for participation. The experimental methods had ethical approval from the institutional ethics committee. An estimation of full-scale IQ was derived from the individual scores on two subtests (vocabulary and block design) of the Dutch version of the Wechsler Adult Intelligence Scale (WAIS-III); mean IQ score was 123.1 (range 103–139, SD 10.2). As a measure of WM capacity, participants performed the digit span test (forward and backward), which is part of the WAIS-III. The standardized mean digit span score was 12.7 (range 6–18, SD 3.1).
Stimuli
Memory stimuli. The memory set consisted of a row of 2 (2-load condition) or 4 (4-load condition) consonant letters (excluding Y and Q), or 4 star stimuli (*; 0-load condition). Letter size was 0.37 cm horizontally × 0.48 cm vertically, and star size was 0.2 cm horizontally × 0.3 cm vertically.¹ The size of the memory set was 3.8 cm horizontally × 0.3 cm vertically in the 0-load condition, 1.5 cm horizontally × 0.48 cm vertically in the 2-load condition, and 3.8 cm horizontally × 0.48 cm vertically in the 4-load condition. Memory probe stimuli consisted of one letter in the 2-load and 4-load condition, or an arrow to the left (<<) or to the right (>>) in the 0-load condition. Arrow size was 1 cm horizontally × 0.4 cm vertically.
¹ At a viewing distance of 57 cm, 1 cm on the display subtends 1° of visual angle.
Stroop stimuli. Stroop stimuli consisted of photographs and names of famous pop stars and well-known politicians. The photographs were derived from the World Wide Web. The selection of pop stars and politicians to be included in the name-face Stroop task was based on a screening. University students (n = 66; Dutch native speakers) were presented with photographs of faces of 30 politicians and 38 pop stars and asked to write down the correct name below each photo. Based on recognition rates, 4 famous pop stars (between brackets: percentage of students that correctly recognized the face): Michael Jackson (95%), Elvis Presley (92%), Justin Timberlake (97%), and Robbie Williams (97%), and 4 well-known politicians: George Bush (98%), Bill Clinton (94%), Jan Peter Balkenende (82%; Dutch prime minister), Geert Wilders (82%; Dutch minister) were selected. Participants from the screening familiar with at least 7 of the 8 selected faces were invited to participate in the main study. All photographs were software-edited using Adobe Photoshop. Faces were first converted to grayscale, the background was set to gray (RGB; 131, 131, 131), and image height was adjusted to 250 pixels. In Matlab (version R2007a), average face luminance was adjusted to the average background luminance. In the task, the size of the faces ranged from 4.6–5.4 cm horizontally × 7.3 cm vertically. Name stimuli consisted of first names and surnames, presented next to each other and below the eyes of the face, in dark gray color (RGB; 64, 64, 64), with size ranging from 3.0–5.3 cm horizontally × 0.5 cm vertically.
Figure 1. (A) Schematic illustration of a trial. In this example of a 2-load memory trial, 2 letters are followed by a category-incongruent Stroop stimulus, a category-congruent Stroop stimulus, and a negative memory probe stimulus. Stimuli are not to scale. Subjects were instructed to memorize the letter stimuli, and subsequently classify written name stimuli (while ignoring face stimuli) as either a pop star or a politician. Finally, participants were to decide if the letter stimulus had been part or had not been part of the to-be-memorized sequence of letters. In both the memory task and the Stroop task, participants were asked to respond as fast and accurately as possible by pressing the correct response button (two-choice button response). (B) Schematic representation of memory trials in the 0-load, 2-load, and 4-load condition. (C) Schematic representation of the category-congruent and category-incongruent conditions for the two categories that were used in the task: pop stars and politicians. In the category-congruent condition, written name stimuli and face stimuli were from the same category; in the category-incongruent condition, written name stimuli and face stimuli were from opposite categories.
Task Description
The task was a combination of a Sternberg WM (item recognition) task and a name-face Stroop paradigm (see Figure 1). On every trial (see Figure 1A), after a fixation cross (300 ms), a memory set was presented (1500 ms), followed by an inter-stimulus interval (ISI; 850 ms) during which a fixation cross was presented. In the maintenance delay, a sequence of either two Stroop trials (in 33% of all cases) or three Stroop trials (in 67% of all cases) was presented. Each Stroop stimulus was presented for 1000 ms, and followed by an ISI (1500 ms) during which a fixation cross was presented. Finally, the memory probe was presented (1500 ms), followed by a fixation cross (200 ms). In the Sternberg WM task (see Figure 1B), the memory set consisted of a row of 2 (2-load condition) or 4 (4-load condition) randomly selected and randomly ordered consonant letters, or 4 star stimuli (*; 0-load condition). Participants were instructed only to perceive the star stimuli, and to memorize the letter stimuli during the maintenance interval. After the maintenance interval, the memory probe was presented, consisting of one letter that either had been part of the memory set (positive probe) or had not been part of the memory set (negative probe) in the 2-load and 4-load condition, or an arrow to the left (<<) or to the right (>>) in the 0-load condition. Participants were instructed to discriminate positive versus negative probes and arrows left versus arrows right by pressing the left-hand or the right-hand response button using their index fingers. Left and right button allocation for positive and negative probes (but not for left and right arrows) was balanced between subjects. In the name-face Stroop paradigm (see Figure 1C), written names of famous pop stars and well-known politicians were superimposed on faces from the same set of pop stars or politicians. The faces were equally likely to be category-congruent with the target name (e.g., a pop star’s face and another pop star’s name), or category-incongruent with the target name (a pop star’s face and a politician’s name or vice versa). No face stimulus was combined with its own name. The task was thus slightly different from the task used by De Fockert et al. (2001), in which the congruent condition consisted of faces paired with their own name (i.e., identity-congruency). The advantage of category-congruency (Egner & Hirsch, 2005; Pecchinenda & Heil, 2007) is the larger number of unique congruent stimuli relative to identity-congruency, which compensates for the larger number of trials and repetition of stimuli in an EEG study and, more importantly, results in a comparable number of unique stimuli in the congruent and incongruent condition, thereby controlling factors such as stimulus novelty between conditions that might otherwise confound the results. Participants were instructed to classify the names as either pop star or politician while ignoring distractor faces by pressing the left-hand or the right-hand response button using their index fingers. Left and right button allocation for the two response categories was balanced between subjects.
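The trial timeline given in the task description can be checked with simple arithmetic. The following Python sketch adds up the stated durations for a memory trial containing two or three Stroop trials. The constant and function names are illustrative and not taken from the study's software, and the block-level count assumes that exactly one third of the memory trials in a block contain two Stroop trials (the text reports 33% and 67%).

```python
# Durations in milliseconds, copied from the task description.
FIXATION_START = 300
MEMORY_SET = 1500
PRE_STROOP_ISI = 850
STROOP_STIMULUS = 1000
STROOP_ISI = 1500
MEMORY_PROBE = 1500
FIXATION_END = 200


def trial_duration_ms(n_stroop: int) -> int:
    """Total duration of one memory trial that contains n_stroop Stroop trials."""
    if n_stroop not in (2, 3):
        raise ValueError("the task used two or three Stroop trials per memory trial")
    stroop_part = n_stroop * (STROOP_STIMULUS + STROOP_ISI)
    return (FIXATION_START + MEMORY_SET + PRE_STROOP_ISI
            + stroop_part + MEMORY_PROBE + FIXATION_END)


def stroop_trials_per_block(n_memory_trials: int = 18) -> int:
    """Stroop trials per block, ASSUMING exactly 1/3 of memory trials hold two."""
    n_two, remainder = divmod(n_memory_trials, 3)
    if remainder:
        raise ValueError("n_memory_trials must be divisible by 3 under this assumption")
    n_three = n_memory_trials - n_two
    return 2 * n_two + 3 * n_three
```

Under these assumptions, a block of 18 memory trials yields 48 Stroop trials, which agrees with the block composition reported for the experimental session.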
Corrective feedback (short text message) was given on misses, false alarms, and on responses that were too fast (<120 ms) or too slow (>1750 ms for Stroop stimuli; >1500 ms for memory probes). The experimental session comprised 432 Stroop trials and 162 WM trials, presented in 9 separate blocks (3 blocks for each memory load) of 48 Stroop trials and 18 memory trials. An additional warm-up trial was presented at the start of each block and not included in the analyses. In each block, positive and negative probes were equiprobable and presented randomly, and category-congruent and category-incongruent stimuli were equiprobable and presented randomly. The blocks were presented in pseudo-random order following the restriction that each of the three load conditions was presented once before any of the load conditions were repeated (e.g., 024 042 420). Six different orders were used across subjects. Between blocks, participants could take a short break.
Procedure
The experiment was conducted in a dimly lit, sound-attenuated room, on a Samsung SyncMaster 940BF monitor that was placed at a viewing distance of 57 cm. ERTSVIPL V3.37b (Beringer, 1987) controlled the tasks. After the preparations for the electroencephalographic (EEG) recordings, participants performed a blink calibration task (Jongen, Smulders, & van Breukelen, 2006; Jongen, Smulders, & van der Heiden, 2007), in which spontaneous blinks were promoted by demanding constant fixation to detect slow color changes of a fixation cross. The blink correction factor was derived from this task and used for offline correction of trials with eyeblinks in the main task (see below). The main task session was then presented.² The experimental session was preceded by an extensive practice session³ that served to ensure face-familiarity and name-face association, and to practice the Stroop task, the working memory task, and the combined task. After removal of the EEG cap, three subtests of the WAIS-III were performed.
EEG Recording and Analyses
EEG activity was recorded continuously, via NeuroScan 4.3 (Compumedics, Hamburg, Germany), from 62 channels, using tin electrodes mounted on an elastic cap (Easycap) and positioned according to the 10–20 System. The left mastoid (A1) was used as the reference for all electrodes, and AFz functioned as the ground. Tin electrodes were also used to bipolarly record vertical and horizontal electrooculograms (EOGs). Electrode impedance was kept below 5 kΩ during recording, amplifier bandpass was 0.05–100 Hz, and the digitization rate was 500 Hz. ERP analysis was done in Neuroscan 4.3.1. EEG data were re-referenced off-line to the average of the right and left mastoids. Eyeblink activity was corrected with a regression procedure (Semlitsch, Anderer, Schuster, & Presslich, 1986) using the blink correction coefficients derived from the blink calibration task. Data were filtered with a low pass filter of 30 Hz (48 dB/oct.) and then separated into epochs of 1200 ms, starting 200 ms before Stroop stimulus onset. Incorrect Stroop trials, and Stroop trials with a voltage exceeding ±100 µV were excluded from the analyses. Furthermore, to reliably examine the effect of WM load on processes of Stroop interference, only Stroop trials within the WM delay of correct WM trials were included.
Averages were computed relative to the 200 ms baseline for each subject, for each of the twelve conditions (WM Load (3: 0, 2, 4) × Congruency (2: congruent, incongruent) × Stimulus type (2: politicians, pop stars)). Grand averages were then computed for each of the six (WM Load × Congruency) conditions, disregarding Stimulus type. After exclusion of trials with a voltage exceeding ±100 µV and error trials, the average number of trials (range, S.D.) that remained for analyses was 68.6 (61–72, 3.1) in the Congruent 0-load condition, 66.2 (58–72, 4.2) in the Incongruent 0-load condition, 64.5 (52–72, 4.9) in the Congruent 2-load condition, 62.6 (52–72, 4.7) in the Incongruent 2-load condition, 64.2 (48–72, 5.5) in the Congruent 4-load condition, and 61.9 (46–70, 6.1) in the Incongruent 4-load condition.
Distractor (face) processing at PO7/PO8: N170 and N250. Based on the literature discussed in the introduction and inspection of grand averages (see Figures 2 and 3), the mean amplitudes of the N170 (180–220 ms) and N250 (280–340 ms) were computed at electrodes PO7/PO8. Furthermore, for N170 peak latency, the amplitude minimum was determined in a 160–240 ms window after filtering the ERPs using an 8 Hz (12 dB/oct.) low pass filter.
² Participants also performed another combined working memory and interference control task (task order was balanced across participants); these data will be discussed elsewhere.
³ More details about the practice session can be obtained from the first author.
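The averaging and measurement steps described for the N170 and N250 can be illustrated with a small standard-library Python sketch. It is not the Neuroscan procedure used in the study. The function names are hypothetical, the ±100 µV criterion is applied after baseline correction (the text does not state the order), and the 8 Hz low-pass filter that preceded peak picking is omitted for brevity.

```python
import math
from statistics import fmean

FS = 500               # digitization rate in Hz (from the text)
EPOCH_START_MS = -200  # epochs start 200 ms before Stroop stimulus onset
N_SAMPLES = 600        # 1200 ms at 500 Hz


def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * FS / 1000)


def clean_average(epochs_uv, threshold_uv=100.0):
    """Baseline-correct each trial, drop trials beyond +/- threshold, then average.

    epochs_uv is a list of trials; each trial is a list of voltages in microvolts.
    ASSUMPTION: rejection is applied to the baseline-corrected trial.
    """
    n_base = ms_to_index(0)
    kept = []
    for trial in epochs_uv:
        base = fmean(trial[:n_base])
        corrected = [v - base for v in trial]
        if max(abs(v) for v in corrected) <= threshold_uv:
            kept.append(corrected)
    if not kept:
        raise ValueError("no trials survived artifact rejection")
    return [fmean(column) for column in zip(*kept)]


def mean_amplitude(erp_uv, start_ms, end_ms):
    """Mean voltage in an inclusive latency window, e.g. 180-220 ms for the N170."""
    return fmean(erp_uv[ms_to_index(start_ms):ms_to_index(end_ms) + 1])


def negative_peak_latency_ms(erp_uv, start_ms, end_ms):
    """Latency of the amplitude minimum in an inclusive window, e.g. 160-240 ms."""
    lo, hi = ms_to_index(start_ms), ms_to_index(end_ms)
    window = erp_uv[lo:hi + 1]
    peak = lo + window.index(min(window))
    return peak * 1000 / FS + EPOCH_START_MS


def synthetic_erp(peak_ms=200.0, width_ms=20.0, amplitude_uv=-5.0):
    """Hypothetical negative deflection, used only to exercise the functions."""
    times = [EPOCH_START_MS + i * 1000 / FS for i in range(N_SAMPLES)]
    return [amplitude_uv * math.exp(-((t - peak_ms) / width_ms) ** 2) for t in times]
```

Applied to real data, `clean_average` would be called once per participant, channel, and condition, and the two measurement functions would then be applied to the resulting average.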
Interference processing: broadly distributed positivity reduction (N450) and positivity enhancement (P600). Two interference effects were expected: a positivity reduction (N450) and a positivity enhancement (P600) for the incongruent condition relative to the congruent condition. Inspection of grand averages (see Figure 4) indeed revealed these two interference effects. The N450 was most pronounced between 480–600 ms and broadly distributed over midline and lateral fronto-central, centroparietal, parietal, and parieto-occipital sites (see voltage maps in Figure 4). Thus, for statistical analysis ERP mean amplitudes in the specified window (480–600 ms) in these scalp regions were selected using midline sites and two adjacent lateral sites (FCz/3/4; CPz/3/4; Pz/3/4; POz/3/4). As shown in Figure 3, the positivity showed a latency delay for incongruent trials in comparison to congruent trials. A similar P3 peak latency delay for incongruent relative to congruent stimuli has been shown in the color-word Stroop task (Lansbergen & Kenemans, 2008). It was interpreted as indicating longer stimulus identification and evaluation time in the incongruent relative to the congruent condition (Kutas, McCarthy, & Donchin, 1977). Positivity latency was therefore estimated at the same selection of channels as the N450 in the 400–1000 ms time window in single trials, after 3.4 Hz low pass filtering (Lansbergen & Kenemans, 2008; Smulders, Kenemans, & Kok, 1994). The P600 was most pronounced between 760–1000 ms, and broadly distributed over midline and lateral frontal, fronto-central, parietal, and occipital sites (see voltage maps in Figure 4). Thus, for statistical analysis ERP mean amplitudes in the specified window (760–1000 ms) in these scalp regions were selected using midline sites and two adjacent lateral sites (Fz/3/4, FCz/3/4, Pz/3/4, Oz/1/2).
Behavioral Statistical Analyses
Working memory. After exclusion of error WM trials, a WM trial-average (range, S.D.)
of 53.7 (53.0–54.0, 0.45) in the 0-load condition, 50.3 (43.0–53.0, 2.3) in the 2-load condition, and 50.0 (37.0–53.0, 3.2) in the 4-load condition remained for analyses. Reaction time data and the square roots of percentages of misses, false alarms, and hits were analyzed in an analysis of variance (ANOVA) with Load (3: 0-load, 2-load, 4-load) as within-subjects factor.
Stroop. Reaction time data and the square roots of percentages of misses, false alarms, and hits were analyzed using an overall Load (3: 0-load, 2-load, 4-load) × Congruency (2: congruent, incongruent) ANOVA.
ERP Statistical Analysis
Statistical ERP analysis was carried out by entering mean voltage values in the specified time windows (for every described component) and peak latency values (for the N170 and positivity delay (400–1000 ms)) into an ANOVA. In all analyses, within-subjects factors Load (3: 0-load, 2-load, 4-load) and Congruency (2: congruent, incongruent) were included. Additional within-subjects factors in the analyses of the positivity reduction (N450) and enhancement (P600) were Anterior-Posterior (N450: FC, CP, P, PO; P600: F, FC, P, O) and Laterality (3: left, midline, right). An additional within-subjects factor in the analyses of the N170 and the N250 was Hemisphere (2: left (PO7), right (PO8)). For all analyses, the significance level was set at .05, corrected for deviations from sphericity (Greenhouse-Geisser epsilon correction).
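As an illustration of the analysis steps named above, the following standard-library Python sketch shows a square-root transform of percentage scores, a one-way repeated-measures ANOVA, and the Greenhouse-Geisser epsilon computed from the double-centered covariance matrix of the factor levels. It is a simplified single-factor sketch with hypothetical function names, not the statistical software used in the study, and it stops short of computing p-values.

```python
import math
from statistics import fmean


def sqrt_transform(percentages):
    """Square-root transform percentage scores (one row per participant)."""
    return [[math.sqrt(p) for p in row] for row in percentages]


def rm_anova_one_way(data):
    """One-way repeated-measures ANOVA on data with n participants x k levels.

    Returns (F, df_effect, df_error) with uncorrected degrees of freedom.
    """
    n, k = len(data), len(data[0])
    grand = fmean(v for row in data for v in row)
    level_means = [fmean(column) for column in zip(*data)]
    subject_means = [fmean(row) for row in data]
    ss_level = n * sum((m - grand) ** 2 for m in level_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_level - ss_subject
    df_level, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_level / df_level) / (ss_error / df_error)
    return f_value, df_level, df_error


def greenhouse_geisser_epsilon(data):
    """Greenhouse-Geisser epsilon; lies between 1 / (k - 1) and 1."""
    n, k = len(data), len(data[0])
    means = [fmean(column) for column in zip(*data)]
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(k)] for i in range(k)]
    row_means = [fmean(r) for r in cov]
    grand = fmean(row_means)
    # Double-center the (symmetric) covariance matrix.
    centered = [[cov[i][j] - row_means[i] - row_means[j] + grand
                 for j in range(k)] for i in range(k)]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for r in centered for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)


def corrected_df(data):
    """Degrees of freedom multiplied by epsilon, as used for the corrected p-value."""
    eps = greenhouse_geisser_epsilon(data)
    _, df_level, df_error = rm_anova_one_way(data)
    return eps * df_level, eps * df_error
```

With 30 participants and three load levels, the uncorrected degrees of freedom from `rm_anova_one_way` are 2 and 58, which is the form in which the F tests for Load are reported.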
Figure 2. Grand-averaged ERPs elicited by name-face Stroop stimuli at PO7 and PO8, for the 0-load (black line), 2-load (gray line), and 4-load (dotted line) condition (pooled for congruency). The N170 (180–220 ms) and N250 (280–340 ms) are presented enlarged.
The corrected F- and probability values, and the uncorrected degrees of freedom are reported.
Results
Working Memory Task Performance
Average reaction times and the average percentage of false alarms, misses, and hits to memory probe stimuli as a function of WM load are summarized in Table 1.
Accuracy. Since the average percentage of misses was smaller than 1% (.54), these data were not further analyzed. False alarms increased with load (Load: F(2,58) = 134.5, p < .0005). Planned comparisons showed a significant increase in false alarms from 0-load to 2-load (p < .0005) and from 0-load to 4-load (p < .0005), but not from 2-load to 4-load (p = 1.0). Hits decreased when load increased (Load: F(2,58) = 33.1, p < .0005), and planned comparisons showed a significant decrease from 0-load to 2-load (p < .0005), and from 0-load to 4-load (p < .0005), but not from 2-load to 4-load (p = 1.0).
Reaction time. As expected, reaction times increased with load (Load: F(2,58) = 195.0, p < .0005). Planned comparisons
showed a significant increase in reaction time from 0-load to 2-load (p < .0005), from 0-load to 4-load (p < .0005), and from 2-load to 4-load (p < .0005). In sum, behavioral data from the WM task confirm that the manipulation of WM load was successful as reaction times and percentage of false alarms increased and the percentage of hits decreased when WM load increased.
Name-Face Stroop Task Performance and Effects of WM Load
Average reaction times and the average percentage of false alarms, misses, and hits to Stroop stimuli as a function of Congruency and WM load are summarized in Table 2.
Accuracy. Since the average percentage of misses was smaller than 1% (.36), these data were not further analyzed. False alarms were higher for incongruent trials than for congruent trials (Congruency: F(1,29) = 37.4, p < .0005), but there was no main effect of Load (F(2,58) = 1.9, p = .16), and no interaction of Load × Congruency (F(2,58) < 1, p = .56). Hits were lower for incongruent trials than for congruent trials (Congruency: F(1,29) = 19.0, p < .0005). Although there was no interaction of Load × Congruency (F(2,58) < 1.0, p = .39), a main effect of
Load (F(2,58) = 29.6, p < .0005) indicated an overall decrease in hits in the Stroop task when load increased. Planned comparisons showed a significant decrease in hits from 0-load to 2-load
(p < .0005) and from 0-load to 4-load (p < .0005), but not from 2-load to 4-load (p = .49).
Reaction time. As expected, reaction times were slower for incongruent trials than congruent trials (Congruency: F(1,29) = 147.0, p < .0005). Although there was no interaction of Load × Congruency (F(2,58) < 1, p = .46), a main effect of Load (F(2,58) = 13.1, p < .0005) indicated an overall increase in reaction time when load increased. Planned comparisons showed a significant increase in reaction time from 0-load to 2-load (p = .008) and from 0-load to 4-load (p = .001), and a trend from 2-load to 4-load (p = .06). In sum, the behavioral data provide evidence for interference in the name-face Stroop task as reaction times and the percentage of false alarms were higher and the percentage of hits was lower in incongruent trials than congruent trials. Although WM load led to an overall decrease in accuracy and an overall increase in reaction time, interference effects were not modulated by WM load.
Event-Related Potentials
N170 (180–220 ms). On N170 amplitude, there was no main effect of Congruency (F(1,29) < 1, p = .81), and no interaction of Load × Congruency (F(1,29) = 1.6, p = .21), but as shown in Figure 2, N170 amplitude decreased with WM load. N170 voltage maps for every load condition are shown in Figure 4. This was confirmed by a main effect of Load (F(2,58) = 4.7, p = .01), indicating a linear decrease in amplitude with Load, as confirmed by a significant linear (F(1,29) = 9.3, p = .005) but not quadratic contrast (F(1,29) < 1, p = .85). Planned comparisons showed no difference in N170 amplitude between the 2-load condition and the 0-load condition (p = .19), but there was a significant amplitude reduction in the 4-load condition relative to the 0-load condition (p = .005), and a trend reduction in the 4-load relative to the 2-load condition (p = .096). On N170 peak latency, there were no effects of Load (F(2,58) = 1.6, p = .22), Congruency (F(1,29) < 1, p = .82), or Load ×
Congruency (F(1,29) = 1.2, p = .29).
Figure 3. Grand average voltage maps of the N170 (180–220 ms), N250 (280–340 ms), N450 positivity reduction Stroop effect (480–600 ms), and P600 positivity enhancement Stroop effect (760–1000 ms), in the different load conditions. For every component, the first row represents the front distribution, and the second row represents the back distribution of the scalp. N170 and N250 effects were computed by averaging the congruent and incongruent condition in every load condition, and N450 and P600 Stroop effects were computed by subtracting the congruent from the incongruent condition in every load condition. These distributions illustrate that, whereas the N170 and N250 effects are mainly distributed over lateral occipital-temporal sites, both Stroop effects were broadly distributed over the scalp. Red regions indicate positive voltages and blue regions indicate negative voltages. The electrode positions are indicated by dots and the difference between contour lines corresponds to a voltage change of 0.50 µV for the N170 and N250, and a voltage change of 0.13 µV for the Stroop effects.
Table 1. Working Memory Performance: The Means (M) and Standard Deviations (SD) of Reaction Time (in Milliseconds), and Percentages Hits, False Alarms, and Misses for the Different Memory Load Conditions (0, 2, 4)
              Hits           False alarms   Misses         Reaction time
Memory load   M      SD      M      SD      M      SD      M       SD
0             99.5   0.83    0.2    0.6     0.2    0.6     507.4   55.2
2             93.1   4.3     6.4    4.1     0.6    1.1     689.1   103.1
4             92.7   5.9     6.5    4.8     0.8    1.8     744.2   109.2
N250 (280–340 ms). On N250 amplitude, there was no main effect of Congruency (F(1,29) = 1.0, p = .32), or an interaction of Load × Congruency (F(1,29) = 1.7, p = .19), but as shown in Figure 2, N250 amplitude decreased with WM load. N250 voltage maps for every load condition are shown in Figure 4. This was confirmed by a main effect of Load (F(2,58) = 6.1, p = .005), that was dependent on Hemisphere (Load × Hemisphere: F(2,58) = 4.1, p = .03), and significant only in the left hemisphere (PO7: F(2,58) = 9.3, p < .0005; PO8: F(2,58) = 1.7, p = .19). For the left-hemispheric Load effect, the linear contrast (F(1,29) = 7.8, p = .009) and the quadratic contrast (F(1,29) = 11.1, p = .002) were both significant. Planned comparisons showed an N250 reduction for the 2-load condition relative to the 0-load condition (p < .0005), and for the 4-load condition relative to the 0-load condition (p = .009), but there was no difference between the 4-load and the 2-load condition (p = .22).
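The linear and quadratic contrasts reported for the Load effects can be illustrated with a short Python sketch. For a within-subjects factor with three equally spaced levels, each contrast reduces to a one-sample t test on per-participant contrast scores, and F(1, n − 1) equals t squared. The weights below are the usual polynomial weights for three levels; the function name is hypothetical and the statistical package actually used in the study is not specified in the text.

```python
import math
from statistics import fmean, stdev

# Polynomial contrast weights for three equally spaced levels (0-, 2-, 4-load).
LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)


def contrast_f(data, weights):
    """F test of a within-subjects contrast.

    data holds one row per participant with one value per load level.
    Returns (F, 1, n - 1), where F is the squared one-sample t statistic
    of the per-participant contrast scores tested against zero.
    """
    scores = [sum(w * v for w, v in zip(weights, row)) for row in data]
    n = len(scores)
    t_value = fmean(scores) / (stdev(scores) / math.sqrt(n))
    return t_value ** 2, 1, n - 1
```

With 30 participants this yields the degrees of freedom (1, 29) seen in the reported contrast tests. Rescaling the weights leaves F unchanged, so normalized weights would give the same result.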
Positivity reduction (N450 effect: 480–600 ms). As shown in Figure 3 and 4, there was an N450 effect, a positivity amplitude reduction in incongruent trials relative to congruent trials around 500 ms (480–600 ms) that was distributed over frontocentral, centroparietal, parietal, and parieto-occipital sites. The effect is most clearly demonstrated by the difference waves of incongruent minus congruent trials, in Figure 3B, and the N450 voltage maps of these difference waves in Figure 4. Analyses confirmed the main effect of Congruency (F(1,29) = 34.6, p < .0005). The Congruency effect was not modulated by Load (Load × Congruency: F(2,58) < 1, p = .55), and there was no main effect of Load (F(2,58) < 1, p = .62). There was an interaction of Congruency × Anterior-Posterior (F(3,87) = 10.5, p = .001), and of Congruency × Anterior-Posterior × Laterality (F(6,174) = 2.7, p = .04). Inspection of means indicated that the amplitude reduction for the incongruent relative to the congruent condition was largest at midline centroparietal and parietal sites.
Table 2. Name-Face Stroop Performance: The Means (M) and Standard Deviations (SD) of Reaction Time (in Milliseconds), and Percentages Hits, False Alarms, and Misses for the Different Memory Load Conditions (0, 2, 4), and Congruency Conditions (Congruent, Incongruent)
                            Hits          False alarms  Misses        Reaction time
Memory load   Congruency    M      SD     M      SD     M      SD     M       SD
0-Load        Congruent     97.1   2.5    2.4    2.2    0.2    0.5    672.4   87.2
0-Load        Incongruent   93.1   4.3    5.9    3.9    0.3    0.6    708.7   82.1
2-Load        Congruent     91.3   5.5    2.0    1.8    0.2    0.5    690.9   83.7
2-Load        Incongruent   88.8   5.6    4.4    3.2    0.5    1.1    726.1   85.6
4-Load        Congruent     90.5   6.6    2.5    2.3    0.5    0.8    699.1   91.5
4-Load        Incongruent   87.6   6.6    5.5    3.8    0.5    0.9    742.2   100.7
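The size of the behavioral interference effect at each load level follows directly from the mean reaction times in Table 2. The Python sketch below computes the incongruent-minus-congruent difference per load condition. It only restates the tabled group means; the dictionary layout and function name are illustrative. These descriptive differences do not replace the inferential test, in which the Load × Congruency interaction on reaction time was not significant.

```python
# Mean reaction times in ms, copied from Table 2 (group means).
MEAN_RT_MS = {
    0: {"congruent": 672.4, "incongruent": 708.7},
    2: {"congruent": 690.9, "incongruent": 726.1},
    4: {"congruent": 699.1, "incongruent": 742.2},
}


def interference_ms(load):
    """Incongruent minus congruent mean RT for one load level, rounded to 0.1 ms."""
    cell = MEAN_RT_MS[load]
    return round(cell["incongruent"] - cell["congruent"], 1)


if __name__ == "__main__":
    for load in sorted(MEAN_RT_MS):
        print(f"{load}-load interference: {interference_ms(load)} ms")
```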
In sum, there was a load-independent positivity reduction (N450 Stroop effect) between 480–600 ms for incongruent trials relative to congruent trials that was broadly distributed and largest at (midline) centroparietal and parietal sites.
Positivity latency delay. As shown in Figure 3, the positivity showed a latency delay for incongruent trials (peak latency: 664 ms) in comparison to congruent trials (peak latency: 644 ms) (Congruency: F(1,29) = 51.7, p < .0005). There was an interaction of Congruency × Anterior-Posterior (F(3,87) = 21.8, p < .0005), and follow-up analyses showed that the latency delay was stronger at posterior sites (FC: F(1,29) = 9.6, p = .004; CP: F(1,29) = 39.8, p < .0005; P: F(1,29) = 65.1, p < .0005; PO: F(1,29) = 80.2, p < .0005). There were no effects of Load (F(2,58) = 3.1, p = .07), or Load × Congruency (F(2,58) = 1.6, p = .22) on latency.
Positivity enhancement (P600 effect: 760–1000 ms). Later in time, there was a P600 effect, a positivity amplitude enhancement in incongruent trials relative to congruent trials that started around 760 ms and was distributed over frontal, fronto-central, parietal, and occipital sites, as shown in Figures 3 and 4. This positive amplitude difference is more clearly demonstrated by the difference waves of incongruent minus congruent trials, in Figure 3B, and the P600 voltage maps of these difference waves in Figure 4. Analyses confirmed the main effect of Congruency (F(1,29) = 21.1, p < .0005). In addition, there was an interaction of Load × Congruency (F(2,58) = 8.6, p = .001), Congruency × Anterior-Posterior (F(3,87) = 14.7, p < .0005), and Load × Congruency × Anterior-Posterior (F(6,174) = 2.9, p = .05). The three-way Load × Congruency × Anterior-Posterior interaction was further explored by testing for Load × Congruency interactions at frontal, fronto-central, parietal, and occipital sites. There were Load × Congruency interactions at frontal (F(2,58) = 10.0, p < .0005), fronto-central (F(2,58) = 8.8, p = .001), and parietal sites (F(2,58) = 4.7, p = .02), but at occipital sites there was only a main effect of Congruency (F(1,29) = 32.3, p < .0005; Load × Congruency: F(2,58) = 1.2, p = .30). Follow-up analyses at frontal and fronto-central sites showed a Congruency effect in the 4-load condition (frontal: F(1,29) = 20.8, p < .0005; fronto-central: F(1,29) = 23.4, p < .0005), but not in the 0-load (frontal: F(1,29) = 1.1, p = .29; fronto-central: F(1,29) < 1, p = .56) or 2-load (frontal: F(1,29) = 1.5, p = .23; fronto-central: F(1,29) < 1, p = .88) condition.
At parietal sites, there was a Congruency effect in every load condition (0-load: F(1,29) = 7.1, p = .01; 2-load: F(1,29) = 10.8, p = .003; 4-load: F(1,29) = 26.6, p < .0005), and the Congruency effect linearly increased with Load as confirmed by a significant linear (F(1,29) = 7.6, p = .01) but not quadratic (F(1,29) = 1.8, p = .19) contrast. To verify that the frontal effect in the 4-load condition was not the result of volume conduction arising from enhanced activity of a common centroparietal source, Current Source Density (CSD) maps for the 4-load Congruency effect were computed in the 760–1000 ms time interval. These maps indicated different sources underlying the fronto-central and parietal effects. In sum, a positivity enhancement (P600 Stroop effect) between 760–1000 ms for incongruent trials relative to congruent trials at parietal sites increased with load, and at frontal and fronto-central sites was only present in the 4-load condition. Finally, at occipital sites the positivity enhancement was load-independent.
Figure 4. ERP responses elicited by name-face Stroop stimuli at FCz and Pz. Gray-colored bars indicate the N450 positivity reduction (480–600 ms) and P600 positivity enhancement (760–1000 ms). (A) Grand-averaged ERPs for congruent (c) and incongruent (ic) Stroop trials in the 0-load (black lines), 2-load (red lines), and 4-load (blue lines) conditions. The latency delay of the positivity for incongruent trials in comparison to congruent trials is clearly visible. (B) Stroop difference waves of incongruent minus congruent trials in the 0-load (black lines), 2-load (red lines), and 4-load (blue lines) conditions. The effects were not limited to the electrodes shown here; see the text for the exact selection of electrodes used in statistical tests.
Discussion
Whereas prior studies have shown behavioral evidence for enhanced distractor interference when subjects have reduced capacity of WM, the present study for the first time examined the brain mechanisms involved in such WM capacity and interference control interactions over time. To this aim, ERPs were measured in a combined WM and name-face Stroop task that has been shown to elicit reliable distractor interference effects (De Fockert et al., 2001; Egner & Hirsch, 2005; Pecchinenda & Heil, 2007). Below, first behavioral results are discussed, followed by a discussion of the ERP results.
Behavioral results. Stroop interference in the behavioral results was reflected by a reaction time delay and an increase in false alarms for incongruent trials relative to congruent trials. In addition, the WM manipulation was successful as response times to the memory probes increased and accuracy decreased when WM was loaded.
Contrary to our hypotheses and results in two prior studies using highly similar paradigms (De Fockert et al., 2001; Pecchinenda & Heil, 2007), the behavioral Stroop interference effect was not modulated by WM load. It is unlikely that this is caused by differences in processing demands of our Stroop or WM task. Stroop interference effects in the 0-load condition were similar to those reported in other studies using a comparable face-name Stroop task (De Fockert et al., 2001; Egner & Hirsch, 2005). Furthermore, the highest memory load of 4 letters compromised Stroop accuracy performance to the same extent in our study as in the study by De Fockert et al., in which 5 digits were held in memory, pointing to a similar perceived load amount. Finally, lack of power cannot explain the absence of an interaction as our study included the largest number of subjects as compared to other studies using the same paradigm. Instead, a closer comparison between studies revealed that, whereas our results showed significant interference effects of category-incongruent faces on reaction time and ERPs in the 0-load and 2-load conditions, no reaction time Stroop effect was found in the low memory condition of the study by Pecchinenda and Heil (2007). Our finding of interference effects even when there was no WM load in the category face-name Stroop task replicates findings reported by Egner and Hirsch (2005; 36 ms and 41 ms, respectively). Also, our mean Stroop reaction times were comparable to those reported by De Fockert et al. (750 ms) and Egner and Hirsch (800 ms), whereas those reported by Pecchinenda and Heil were remarkably fast (330–440 ms). In our study, as in the studies by De Fockert et al. and Egner and Hirsch, there were interference effects for reaction time and accuracy, arguing against a speed-accuracy trade-off.
Pecchinenda and Heil did not report Stroop accuracy results, so possibly their subjects traded speed for accuracy (explaining the fast response times) as a result of the subject-paced nature of their task. This might then explain the absence of interference effects in the low WM load condition in their task, causing the interaction effect between WM load and Stroop interference in their results. Another explanation for the absence of WM load effects on behavioral Stroop interference might be that the participants in our study prevented a further increase in behavioral interference with increasing WM load by enhancing top-down frontal cortical control. The ERP data that will now be discussed provide evidence for such an explanation. Possibly, participants on average had a higher WM capacity than participants in other studies as it has been shown that subjects with higher capacity show more frontal cortex recruitment in demanding WM tasks (Osaka et al., 2003).
ERP results: Effects of WM load on distractor encoding. Based on WM load theory of selective attention (De Fockert et al., 2001; Lavie, 1995), our predictions were that, with an increase in WM load, top-down inhibitory control on distractor face processing would be reduced. During early processing stages of encoding, when face identification had not yet taken place, this reduced control was expected to lead to enhanced processing of all distractor faces, independent of (category-) congruency with the target name. Accordingly, main effects of WM load were found on the amplitude of the early occipito-temporal N170 and N250 components that reflect early processes of face encoding and recognition, respectively, and have been localized to secondary visual areas (Bentin et al., 1996; Itier & Taylor, 2004; Latinus & Taylor, 2006; Pfütze et al., 2002; Schweinberger & Burton, 2003; Schweinberger et al., 2002b).
However, instead of an increase, both components showed an amplitude reduction when load increased, suggesting reduced encoding of distractor face stimuli in secondary visual areas with higher WM load. This result is in contrast with results from fMRI studies that showed increased activation in visual cortical areas associated with distractor processing with increases of WM load (De Fockert et al., 2001; Rissman, Gazzaley, & D’Esposito, 2009). Due to limited time resolution of fMRI, it is, however, difficult to determine whether this increased fMRI activation is related to encoding or later stages of conscious recognition or identification represented by ERP components occurring after about 400 ms. Our time-sensitive ERP results show that during early perceptual encoding stages distractor processing is reduced with increases of WM load. These results are supported by the ERP dual-task literature that consistently showed that an increase in difficulty of a primary task (i.e., the WM task) leads to reduced processing of secondary task stimuli (i.e., Stroop stimuli) due to less availability of resources (Jonkman et al., 2000; Kok, 2001; Singhal & Fowler, 2004; Wickens, 1984). However, in contrast to these dual-task studies, the present study specifically investigated WM load effects on selective attention by investigating effects on distractor interference processing in a Stroop paradigm in which target and distractor stimuli were presented simultaneously. It has to be noted, however, that since in our Stroop stimuli faces and names were superimposed and the N170 and N250 are elicited by face and name stimuli (e.g., Mercure, Dick, Halit, Kaufman, & Johnson, 2008; Pfütze et al., 2002; Schweinberger et al., 2002b), the early modulations of N170 and N250 amplitude in our results cannot unequivocally be related to the processing of distractor (face) stimuli.
Still, the hypothesized top-down effect of memory load on early Stroop stimulus encoding occurred, but in another direction than WM load theory would predict. A recent delayed recognition ERP study reported a similar amplitude reduction of the face-sensitive N170 and N250 responses to memory face probes when participants retained an increasing number of faces in WM (Morgan et al., 2008). It was suggested that N170 and N250 processing resources necessary for face processing of the memory probe item were reduced because the same resources were used by WM maintenance of face stimuli. In our study, ERPs were not measured to memory probes, but to secondary (Stroop)-task stimuli that were presented in the maintenance interval and that were not part of the memory set. The present results thus suggest shared resources for WM maintenance of the letter stimuli and early processing of the Stroop stimuli, as reflected by N170 and N250 amplitude reductions. Such shared resources may originate in secondary visual cortex areas as it has been shown that maintenance of a memory set, especially during distraction, requires prefrontal cortex (PFC)-controlled updating of stimulus representations in the visual cortex (Johnson, Mitchell, Raye, D’Esposito, & Johnson, 2007; Yi, Turk-Browne, Chun, & Johnson, 2008). Since in our study the to-be-maintained stimuli were letters and early processing of letters has been shown to take place also in fusiform areas (Wong, Gauthier, Woroch, DeBuse, & Curran, 2005), the refreshing of memory letter presentations during distraction, especially in the high 4-load condition, may require so much fusiform activation that early distractor processing in overlapping areas is compromised. The fact that the N250 WM load effect was left lateralized further adds to this conclusion. This reduced early perceptual encoding and recognition of Stroop stimuli
might be related to the increase in false alarms and reaction time when WM load increased.
ERP results: Effects of WM load on distractor interference processing. ERP findings confirmed our predictions that name-face Stroop interference occurs later in time, starting around 450 ms after face and name identification and recollection of semantic information regarding occupation (e.g., Bentin & Deouell, 2000; Eimer, 2000a, b; Paller et al., 2000; Pfütze et al., 2002; Schweinberger et al., 2002b). First, delayed peak latencies of a broadly distributed positivity when face-identity was category-incongruent with the to-be-categorized name indicate longer stimulus evaluation and identification time (Kutas et al., 1977) for incongruent than congruent stimuli. A similar positivity delay for incongruent trials in comparison with congruent trials has been shown in a color-word Stroop task (Lansbergen & Kenemans, 2008). Second, incongruent stimuli evoked an amplitude reduction of a broadly distributed positivity between 480–600 ms after stimulus onset. A similar amplitude reduction, the N450, has repeatedly been shown in color-word Stroop studies and has been related to conflict detection (e.g., Lansbergen et al., 2007; Liotti et al., 2000; Markela-Lerenc et al., 2004; Qiu et al., 2006; West, 2003; West et al., 2004). This is to our knowledge the first time it has been shown in a name-face Stroop task. The latency delay and N450 effects were of comparable strength in the three load conditions, suggesting that these processes of conflict detection proceed without WM involvement and that neural circuits involved in conflict detection and WM maintenance do not overlap or share resources. The N450 effect was followed by an interference effect between 760–1000 ms at frontal, fronto-central, parietal, and occipital sites, consisting of a positivity enhancement for incongruent trials relative to congruent trials (P600 effect).
A similar broadly distributed P600 effect has consistently been demonstrated in color-word Stroop ERP studies and has been related to processes of conflict resolution and the processing of response-relevant (color) information that is used to guide response selection in incongruent trials (e.g., Jongen & Jonkman, 2008; Lansbergen et al., 2007; Liotti et al., 2000; West, 2003; West & Alain, 2000). Our results show that this interference effect is modulated by WM load at frontal, fronto-central, and parietal sites, but not at occipital sites. More specifically, it increased linearly with load at parietal sites, and at frontal and fronto-central sites, it was restricted to the highest WM load condition. These results are in line with models of cognitive control that assign an important role to the PFC in maintenance of goals and the means to achieve them (Duncan, 2001; Fuster, 2001; Miller, 2000; Miller & Cohen, 2001). According to these models, top-down biasing signals are sent from the PFC to different structures throughout the brain, thereby guiding behavior by affecting, for example, sensory modalities, systems responsible for response selection or execution, and systems for memory retrieval. This guiding activity is assumed to be especially important when, in a task such as a Stroop task, multiple responses are possible for a given stimulus, and the task-appropriate response must compete with stronger, more automatic alternatives (e.g., MacDonald, Cohen, Stenger, & Carter, 2000). Important for the interpretation of the present results, evidence for such PFC-driven top-down control on posterior face-processing areas has been shown in an fMRI study using a similar face-name Stroop task (Egner & Hirsch, 2005). More specifically, in conditions in which participants exerted high cognitive
control, there was a behavioral decrease in interference that was accompanied by enhanced PFC activation and, more importantly, enhanced functional connectivity between PFC areas and posterior target processing areas. It was argued that conflict resolution thus was embodied by PFC-driven modulation of posterior areas, biasing processing of relevant information. Taking the above findings into account, it seems reasonable to assume that the present enhanced fronto-central activation in response to incongruent Stroop trials only in the 4-load condition indicates enhanced cognitive control that was necessary to reduce the extra interference resulting from resource depletion by the concurrent high WM load. This extra PFC control has presumably led to successful prevention of an increase in behavioral interference, explaining the absence of an interaction between WM load and interference in our reaction time results. The parietal interference effects that increased linearly with WM load are suggested to reflect enhanced processing of response-relevant information used to guide response selection in incongruent trials (Jongen & Jonkman, 2008; Lansbergen et al., 2007; Liotti et al., 2000; West, 2003; West & Alain, 2000), possibly prompted by PFC. To conclude, our ERP results extend the current literature on the interaction between WM and selective attention by showing that a concurrent high WM load of 4 letters causes the strongest bottlenecks in a late stage of interference processing, associated with conflict resolution or response selection. Such late effects of WM load were also reported in another ERP study, in which targets and distractors were presented sequentially (SanMiguel, Corral, & Escera, 2008).
Finally, it might be argued that the fact that the enhanced frontal activation to incongruent stimuli in the 4-load condition occurs at or after the average response times for incongruent trials (between 700 and 750 ms) complicates a functional relation with the process of conflict resolution. However, it has been noted in the mental chronometry ERP literature that ‘‘the respective components do not necessarily have to emerge in the ERP waveform at exactly the same times as the corresponding stages take place. A residual delay could intervene between execution of a stage and the occurrence of its associated ERP component. Other ancillary stages outside the mainstream of processing could also be immediate precursors of the components and lengthen their latencies’’ (Meyer, Osman, Irwin, & Yantis, 1988, p. 46). Therefore, with respect to the delay between average response times and the latency window of the positivity enhancement in our data, this does not exclude the functional relation of the positivity enhancement to conflict resolution. In sum, using ERP measures this study for the first time examined when, in time, name-face Stroop interference is modulated by WM demands. The first name-face interference effect occurred around 500 ms (N450 effect), during stages of stimulus evaluation and conflict detection, and was followed by a second interference effect between 760–1000 ms (P600 effect), related to conflict resolution. WM load only modulated the P600 effect: there was a linear increase of the P600 interference effect with WM load at parietal sites, and at fronto-central sites it was restricted to the highest WM-load condition. These effects are suggested to reflect enhanced PFC-driven top-down control of posterior sites in highly demanding situations when enhancement of target stimulus processing and suppression of distractor stimulus processing are necessary for conflict resolution. 
Successful conflict resolution by enhanced PFC recruitment probably explains the absence of modulations by WM load on behavioral interference.
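The linear increase of the parietal P600 effect with load, summarized above, was supported by a significant linear but nonsignificant quadratic contrast. A common way to obtain such single-degree-of-freedom tests in a within-subject design is to compute one contrast score per participant and test its mean against zero, so that F(1, n − 1) = t². The sketch below follows that approach. It is an assumption that the authors' software computed the contrasts this way, and the five-participant data set is invented purely for illustration.

```python
import math
import statistics

# Orthogonal polynomial weights for three equally spaced load levels (0, 2, 4).
LINEAR = (-1.0, 0.0, 1.0)
QUADRATIC = (1.0, -2.0, 1.0)


def contrast_f(effects_by_subject, weights):
    """Return (F, (df1, df2)) for a within-subject contrast.

    effects_by_subject holds one row per participant with the Stroop effect
    (incongruent minus congruent amplitude) at each load level.
    """
    scores = [sum(w * x for w, x in zip(weights, row)) for row in effects_by_subject]
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two participants")
    sd = statistics.stdev(scores)
    if sd == 0:
        raise ValueError("contrast scores have zero variance")
    # One-sample t test of the contrast scores against zero; F = t squared.
    t = statistics.mean(scores) / (sd / math.sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical Stroop effects (microvolts) for five participants at 0-, 2-,
# and 4-load; these numbers are NOT from the study.
effects = [
    (1.0, 2.0, 3.2),
    (0.5, 1.4, 2.6),
    (1.2, 2.1, 2.9),
    (0.8, 2.0, 3.1),
    (1.1, 1.9, 3.3),
]

f_linear, df = contrast_f(effects, LINEAR)
f_quadratic, _ = contrast_f(effects, QUADRATIC)
print(df, round(f_linear, 1), round(f_quadratic, 1))
```

With 30 participants the same computation yields the F(1,29) degrees of freedom reported for the contrasts in the Results. A large linear F combined with a small quadratic F is the pattern that would support a linear increase.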
REFERENCES
Allison, T., Ginter, H., McCarthy, G., Nobre, A. C., Puce, A., Luby, M., et al. (1994). Face recognition in human extrastriate cortex. Journal of Neurophysiology, 71, 821–825.
Baddeley, A. (1993). Working memory or working attention? In A. Baddeley & L. Weiskrantz (Eds.), Attention: Selection, awareness, and control (pp. 152–170). Oxford, England: Oxford University Press.
Begleiter, H., Porjesz, B., & Wang, W. (1995). Event-related brain potentials differentiate priming and recognition to familiar and unfamiliar faces. Electroencephalography and Clinical Neurophysiology, 94, 41–49.
Beringer, J. (1987). Experimental Run Time System (Version 3.32c). Frankfurt: Berisoft Cooperation.
Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8, 551–565.
Bentin, S., & Deouell, L. Y. (2000). Structural encoding and identification in face processing: ERP evidence for separate mechanisms. Cognitive Neuropsychology, 17, 35–54.
Boehm, S. G., & Sommer, W. (2005). Neural correlates of intentional and incidental recognition of famous faces. Cognitive Brain Research, 23, 153–163.
Bötzel, K., Schulze, S., & Stodieck, S. R. (1995). Scalp topography and analysis of intracranial sources of face-evoked potentials. Experimental Brain Research, 104, 135–143.
Bruce, V., & Young, A. (1986). Understanding face recognition. British Journal of Psychology, 77(Pt 3), 305–327.
Conway, A. R. A., Cowan, N., & Bunting, M. F. (2001). The cocktail party phenomenon revisited: The importance of working memory capacity. Psychonomic Bulletin & Review, 8, 331–335.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.
De Fockert, J. W., Rees, G., Frith, C., & Lavie, N. (2001). The role of working memory in visual selective attention. Science, 291, 1803–1806.
Duncan, J. (2001). An adaptive coding model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820–829.
Egner, T., & Hirsch, J. (2005). Cognitive control mechanisms resolve conflict through cortical amplification of task-relevant information. Nature Neuroscience, 8, 1784–1790.
Eimer, M. (2000a). Effects of face inversion on the structural encoding and recognition of faces: Evidence from event-related brain potentials. Cognitive Brain Research, 10, 145–158.
Eimer, M. (2000b). Event-related brain potentials distinguish processing stages involved in face perception and recognition. Clinical Neurophysiology, 111, 694–705.
Fuster, J. M. (2001). The prefrontal cortex - an update: Time is of the essence. Neuron, 30, 319–333.
Gazzaley, A., Cooney, J. W., McEvoy, K., Knight, R. T., & D’Esposito, M. (2005). Top-down enhancement and suppression of the magnitude and speed of neural activity. Journal of Cognitive Neuroscience, 17, 507–517.
Heitz, R. P., & Engle, R. W. (2007). Focusing the spotlight: Individual differences in visual attention control. Journal of Experimental Psychology: General, 136, 217–240.
Herzmann, G., Schweinberger, S. R., Sommer, W., & Jentzsch, I. (2004). What’s special about personally familiar faces? A multimodal approach. Psychophysiology, 41, 688–701.
Itier, R. J., & Taylor, M. J. (2004). Effects of repetition learning on upright, inverted and contrast-reversed face processing using ERPs. NeuroImage, 21, 1518–1532.
Johnson, M. R., Mitchell, K. J., Raye, C. L., D’Esposito, M., & Johnson, M. K. (2007). A brief thought can modulate activity in extrastriate visual areas: Top-down effects of refreshing just-seen visual stimuli. NeuroImage, 37, 290–299.
Jongen, E. M. M., & Jonkman, L. M. (2008). The developmental pattern of stimulus and response interference in a color-object Stroop task: An ERP study. BMC Neuroscience, 9, 82.
Jongen, E. M. M., Smulders, F. T. Y., & van Breukelen, G. J. P. (2006). Varieties of attention in neutral trials: Linking RT to ERPs and EEG frequencies. Psychophysiology, 43, 113–125.
Jongen, E. M. M., Smulders, F. T. Y., & van der Heiden, J. S. H. (2007). Lateralized ERP components related to spatial orienting: Discriminating the direction of attention from processing sensory aspects of the cue. Psychophysiology, 44, 968–986.
Jonkman, L. M., Kemner, C., Verbaten, M. N., Van Engeland, H., Camfferman, G., Buitelaar, J. K., & Koelega, H. S. (2000). Differences between children with attention-deficit hyperactivity disorder and normal control children and effects of methylphenidate. Psychophysiology, 37, 334–346.
Kane, M. J., Conway, A. R. A., Bleckley, M. K., & Engle, R. W. (2001). A controlled-attention view of working-memory capacity. Journal of Experimental Psychology: General, 130, 169–183.
Kane, M. J., & Engle, R. W. (2003). Working-memory capacity and the control of attention: The contributions of goal neglect, response competition, and task set to Stroop interference. Journal of Experimental Psychology: General, 132, 47–70.
Kim, S.-Y., Kim, M. S., & Chun, M. M. (2005). Concurrent working memory load can reduce distraction. Proceedings of the National Academy of Sciences, 102, 16524–16529.
Kok, A. (2001). On the utility of P3 amplitude as a measure of processing capacity. Psychophysiology, 38, 557–577.
Kutas, M., McCarthy, G., & Donchin, E. (1977). Augmenting mental chronometry: The P300 as a measure of stimulus evaluation time. Science, 197, 792–795.
LaBar, K. S., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (1999). Neuroanatomic overlap of working memory and spatial attention networks: A functional MRI comparison within subjects. NeuroImage, 10, 695–704.
Lansbergen, M. M., & Kenemans, J. L. (2008). Stroop interference and the timing of selective response activation. Clinical Neurophysiology, 119, 2247–2254.
Lansbergen, M., Van Hell, E., & Kenemans, J. L. (2007). Impulsivity and conflict in the Stroop task. Journal of Psychophysiology, 21, 33–50.
Latinus, M., & Taylor, M. J. (2006). Face processing stages: Impact of difficulty and the separation of effects. Brain Research, 1123, 179–187.
Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception and Performance, 21, 451–468.
Lavie, N. (2005). Distracted and confused?: Selective attention under load. Trends in Cognitive Sciences, 9, 75–82.
Lavie, N., & De Fockert, J. (2005). The role of working memory in attentional capture. Psychonomic Bulletin and Review, 12, 669–674.
Lavie, N., Hirst, A., De Fockert, J. W., & Viding, E. (2004). Load theory of selective attention and cognitive control. Journal of Experimental Psychology: General, 133, 339–354.
Lavie, N., Ro, T., & Russell, P. (2003). The role of perceptual load in processing distractor faces. Psychological Science, 14, 510–515.
Liotti, M., Woldorff, M. G., Perez, R. III, & Mayberg, H. S. (2000). An ERP study of the temporal course of the Stroop color-word interference effect. Neuropsychologia, 38, 701–711.
MacDonald, A. W. III, Cohen, J. D., Stenger, V. A., & Carter, C. S. (2000). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288, 1835–1838.
Markela-Lerenc, J., Ille, N., Kaiser, S., Fiedler, P., Mundt, C., & Weisbrod, M. (2004). Prefrontal-cingulate activation during executive control: Which comes first? Cognitive Brain Research, 18, 278–287.
Mayer, J. S., Bittner, R. A., Nikolic, D., Bledowski, C., Goebel, R., & Linden, D. E. J. (2007). Common neural substrates for visual working memory and attention. NeuroImage, 36, 441–453.
McNab, F., Leroux, G., Strand, F., Thorell, L., Bergman, S., & Klingberg, T. (2008). Common and unique components of inhibition and working memory: An fMRI, within-subjects investigation. Neuropsychologia, 46, 2668–2682.
Mercure, E., Dick, F., Halit, H., Kaufman, J., & Johnson, M. H. (2008). Differential lateralization for words and faces: Category or psychophysics? Journal of Cognitive Neuroscience, 20, 2070–2087.
Meyer, D. E., Osman, A. M., Irwin, D. E., & Yantis, S. (1988). Modern mental chronometry. Biological Psychology, 26, 3–67.
Miller, E. K. (2000). The prefrontal cortex and cognitive control. Nature Reviews Neuroscience, 1, 59–65.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Morgan, H. M., Klein, C., Boehm, S. G., Shapiro, K. L., & Linden, D. E. J. (2008). Working memory load for faces modulated P300, N170, and N250r. Journal of Cognitive Neuroscience, 20, 989–1002.
Osaka, M., Osaka, N., Kondo, H., Morishita, M., Fukuyama, H., Aso, T., & Shibasaki, H. (2003). The neural basis of individual differences in working memory capacity: An fMRI study. NeuroImage, 18, 789–797.
Paller, K. A., Bozic, V. S., Ranganath, C., Grabowecky, M., & Yamada, S. (1999). Brain waves following remembered faces index conscious recollection. Cognitive Brain Research, 7, 519–531.
Paller, K. A., Gonsalves, B., Grabowecky, M., Bozic, V. S., & Yamada, S. (2000). Electrophysiological correlates of recollecting faces of known and unknown individuals. NeuroImage, 11, 98–110.
Paller, K. A., Ranganath, C., Gonsalves, B., LaBar, K. S., Parrish, T. B., Gitelman, D. R., et al. (2003). Neural correlates of person recognition. Learning and Memory, 10, 253–260.
Park, S., Chun, M. M., & Kim, M. S. (2007). Concurrent working memory load can facilitate selective attention: Evidence for specialized load. Journal of Experimental Psychology: Human Perception and Performance, 33, 1062–1075.
Pecchinenda, A., & Heil, M. (2007). Role of working memory load on selective attention to affectively valent information. European Journal of Cognitive Psychology, 19, 898–909.
Pessoa, L., & Ungerleider, L. G. (2004). Top-down mechanisms for working memory and attentional processes. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 919–930). MIT Press.
Pfütze, E., Sommer, W., & Schweinberger, S. R. (2002). Age-related slowing in face and name recognition: Evidence from event-related brain potentials. Psychology and Aging, 17, 140–160.
Pollmann, S., & von Cramon, D. Y. (2000). Object working memory and visuospatial processing: Functional neuroanatomy analyzed by event-related fMRI. Experimental Brain Research, 133, 12–22.
Qiu, J., Luo, Y., Wang, Q., Zhang, F., & Zhang, Q. (2006). Brain mechanism of Stroop interference in Chinese characters. Brain Research, 1072, 186–193.
Redick, T. S., & Engle, R. W. (2006). Working memory capacity and Attention Network Test performance. Applied Cognitive Psychology, 20, 713–721.
Rissman, J., Gazzaley, A., & D’Esposito, M. (2009). The effects of nonvisual working memory load on top-down modulation of visual processing. Neuropsychologia, 47, 1637–1646.
Rossion, B., Campanella, S., Gomez, C. M., Delinte, A., Debatisse, D., Liard, L., et al. (1999). Task modulation of brain activity related to familiar and unfamiliar face processing: An ERP study. Clinical Neurophysiology, 110, 449–462.
SanMiguel, I., Corral, M., & Escera, C. (2008). When loading working memory reduces distraction: Behavioral and electrophysiological evidence from an auditory-visual distraction paradigm. Journal of Cognitive Neuroscience, 20, 1131–1145.
Schweinberger, S. R., & Burton, M. (2003). Covert recognition and the neural system for face processing. Cortex, 39, 9–30.
Schweinberger, S. R., Huddy, V., & Burton, M. (2004). N250r: A face-selective brain response to stimulus repetitions. NeuroReport, 15, 1501–1505.
Schweinberger, S. R., Pfütze, E. M., & Sommer, W. (1995). Repetition priming and associative priming of face recognition. Evidence from event-related potentials. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 722–736.
Schweinberger, S. R., Pickering, E. C., Jentzsch, I., Burton, M., & Kaufman, J. M. (2002a). Event-related brain potential evidence for a response of inferior temporal cortex to familiar face repetitions. Cognitive Brain Research, 14, 398–409.
Schweinberger, S. R., Pickering, E. C., Burton, M., & Kaufman, J. M. (2002b). Human brain potential correlates of repetition priming in face and name recognition. Neuropsychologia, 40, 2057–2073.
Semlitsch, H. V., Anderer, P., Schuster, P., & Presslich, O. (1986). A solution for reliable and valid reduction of ocular artifacts, applied to the P300 ERP. Psychophysiology, 23, 695–703.
Singhal, A., & Fowler, B. (2004). The differential effects of Sternberg short- and long-term memory scanning on the late Nd and P300 in a dual-task paradigm. Cognitive Brain Research, 21, 121–134.
Smulders, F. T. Y., Kenemans, J. L., & Kok, A. (1994). A comparison of different methods for estimating single-trial P300 latencies. Electroencephalography and Clinical Neurophysiology: Evoked Potentials, 92, 107–114.
Sreenivasan, K. K., & Jha, A. P. (2007). Selective attention supports working memory maintenance by modulating perceptual processing of distractors. Journal of Cognitive Neuroscience, 19, 32–41.
Tanaka, J. W., Curran, T., Porterfield, A. L., & Collins, D. (2006). Activation of preexisting and acquired face representations: The N250 event-related potential as an index of face familiarity. Journal of Cognitive Neuroscience, 18, 1488–1497.
West, R. (2003). Neural correlates of cognitive control and conflict detection in the Stroop and digit-localisation tasks. Neuropsychologia, 41, 1122–1135.
West, R., & Alain, C. (2000). Age-related decline in inhibitory control contributes to the increased Stroop effect observed in older adults. Psychophysiology, 37, 179–189.
West, R., Bowry, R., & McConville, C. (2004). Sensitivity of medial frontal cortex to response and nonresponse conflict. Psychophysiology, 41, 739–748.
Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & R. Davies (Eds.), Varieties of attention (pp. 63–101). New York: Academic Press.
Wong, A. C., Gauthier, I., Woroch, B., DeBuse, C., & Curran, T. (2005). An early electrophysiological response associated with expertise in letter perception. Cognitive, Affective, & Behavioral Neuroscience, 5, 306–318.
Woodman, G. F., Vogel, E. K., & Luck, S. J. (2001). Visual search remains efficient when visual working memory is full. Psychological Science, 12, 219–224.
Yi, D., Turk-Browne, N. B., Chun, M. M., & Johnson, M. K. (2008). When a thought equals a look: Refreshing enhances perceptual memory. Journal of Cognitive Neuroscience, 20, 1371–1380.
(Received January 27, 2009; Accepted January 6, 2010)
Psychophysiology, 48 (2011), 44–54. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research. DOI: 10.1111/j.1469-8986.2010.01039.x
Electrophysiological correlates of language switching in second language learners
MAARTJE VAN DER MEIJ,a,b FERNANDO CUETOS,b MANUEL CARREIRAS,c,d,e and HORACIO A. BARBERa,d
a Department of Cognitive Psychology, University of La Laguna, Tenerife, Spain
b Department of Psychology, University of Oviedo, Asturias, Spain
c Basque Center on Cognition Brain and Language, Guipúzcoa, Spain
d Instituto de Tecnologías Biomédicas, University of La Laguna, Tenerife, Spain
e IKERBASQUE, Basque Foundation for Science, Vizcaya, Spain
Abstract
This study analyzed the electrophysiological correlates of language switching in second language learners. Participants were native Spanish speakers classified in two groups according to English proficiency (high and low). Event-related potentials (ERPs) were recorded while they read English sentences, half of which contained an adjective in Spanish in the middle of the sentence. The ERP results show the time-course of language switch processing for both groups: an initial detection of the switch driven by language-specific orthography (left-occipital N250) followed by costs at the level of the lexico-semantic system (N400), and finally a late updating or reanalysis process (LPC). In the high proficiency group, effects in the N400 time window extended to left anterior electrodes and were followed by larger LPC amplitudes at posterior sites. These differences suggest that proficiency modulates the different processes triggered by language switches.
Descriptors: ERP, Bilingualism, Code-switching, Second language learning
We are grateful to Dr. Van Petten and two anonymous reviewers for all their helpful comments on a previous version of the manuscript. Maartje van der Meij and Fernando Cuetos were funded by the European Research and Training Network: Language and Brain (Marie-Curie Program). Manuel Carreiras was funded by the grants SEJ2006-09238 and CONSOLIDER-INGENIO 2010 (CSD2008-00048) of the Spanish Ministry of Science. Horacio A. Barber was funded by the "Ramón y Cajal" program and the grant SEJ2007-67364 of the Spanish Ministry of Science.
Address correspondence to: Maartje van der Meij, University of La Laguna, Campus de Guajara s/n, Facultad de Psicología, 38205 S/C de Tenerife, Spain. E-mail: [email protected]

The need to learn a new language after childhood has increased in recent years, due to the growing possibilities of travel and work in other countries. Therefore, studying how new languages are acquired and how they interact in the brain has become an important topic with social implications. In this investigation, we use online recordings to measure electrical brain activity when adult language learners read in their new language and, more specifically, we explore what happens when they encounter a switch between languages.

Second Language Acquisition
Most psycholinguistic models of bilingualism are models of second language processing without a developmental component (e.g., Dijkstra & Van Heuven, 1998, 2002). One exception is the Revised Hierarchical Model (RHM) of Kroll and Stewart (1994), which offers an explanation for the changes produced by second language acquisition. The RHM assumes an independent separate lexicon for each language and an integrated shared semantic/conceptual system. The model also proposes that the link between the first language (L1) and conceptual knowledge is very strong, whereas the link between the lexicon of the second language (L2) and the semantic/conceptual system changes during the process of second language acquisition. In an early stage of learning, there will be strong links from the L2 lexicon to L1 and weak links between the L2 lexicon and the semantic/conceptual system, with a tendency to access the meaning of words in L2 via the equivalent in their L1 (lexical mediation). With increasing competence in L2, the links from the L2 lexicon to conceptual knowledge become stronger, and learners will be capable of directly accessing the meaning of words in L2 and depend less on the link between the two lexicons (conceptual mediation). The predictions of the RHM have been tested using behavioral measures (reviewed in Kroll & Tokowicz, 2005), and in some electrophysiological studies (e.g., Midgley, Holcomb, & Grainger, 2009a; Rodríguez-Fornells, de Diego Balaguer, & Münte, 2006). Although more evidence is clearly needed (Brysbaert & Duyck, in press), the RHM could be an appropriate framework to interpret online measures of brain activity in second language learners.
Event-related potentials (ERPs) have been a very useful tool to track the changes that take place in the brain when people are learning a second language. One productive line of research has looked at changes in brain electrical activity during the learning of artificial languages (Bahlmann, Gunter, & Friederici, 2006; Friederici, Steinhauer, & Pfeifer, 2002). Other experiments have studied the impact of semantic or syntactic violations in second language learners during reading, either in longitudinal studies (Osterhout, McLaughlin, Pitkänen, Frenck-Mestre, & Molinaro, 2006), or by comparing different groups with different levels of proficiency in their L2 (see review in van Hell & Tokowicz, 2010). In the present experiment, we studied the learning development in two groups of learners with different L2 proficiency, by analyzing the ERP correlates of language switching.

ERP Associated with Language Switching
In balanced bilingual populations, the substitution of an element (e.g., a word, phrase, or sentence) from one language with another from a different language is a common phenomenon, and is usually known as code switching. Language substitutions also happen in verbal exchanges between language learners, but probably for different reasons. Learners can use the L1 when they do not know the equivalent form in their L2. Switching between languages can be understood more as a compensatory strategy in the early stages of learning a second language, whereas for more balanced bilinguals switching may be associated with high competence. For this reason, we prefer to use the more general term of language switching instead of code switching when we refer to language learners who have not achieved a high level of proficiency in their L2 and do not use their second language in everyday life.
Moreno, Federmeier, and Kutas (2002) carried out an ERP study with bilinguals who were proficient both in English and in Spanish.
They compared switches between languages to within-language, lexical switches in English sentences, which could end with the expected English word, its Spanish translation (code switch), or an English synonym (lexical switch). Lexical switches enhanced the N400 response (250–450 ms) maximum at right parietal sites, whereas code switches produced an increased negativity over left frontal sites (LAN), which was followed by a large posterior positivity (late positive complex or LPC) in the 450–850 ms time window. The N400 has been described as an index of the difficulty of meaning activation/integration processes in sentences; the more predictable a word, the smaller the N400 elicited (Kutas & Federmeier, 2000; Kutas & Hillyard, 1980; Molinaro, Conrad, Barber, & Carreiras, 2010). In contrast, LAN effects have been linked to working memory load and syntactic integration processing (Barber & Carreiras, 2005; Friederici, Pfeifer, & Hahne, 1993; Kluender & Kutas, 1993). The authors suggested that, for these bilinguals, the costs of switching between languages might be associated with decision-related processes more than lexico-semantic processing.
Proverbio, Leoni, and Zani (2004) performed a similar study with native Italian simultaneous interpreters who had to read sentences for comprehension in Italian and in English with an English or Italian ending. In contrast to the previous study, they reported an N400 effect in response to code switching from L1 to L2 but not from L2 to L1. The authors claim that this asymmetric effect reflects difficulty in the semantic integration when an L2 word is encountered, because their participants acquired L2 after the consolidation of the conceptual system. Therefore, code switching in balanced bilinguals has resulted in ERP amplitude differences in the N400 time window, but, depending on the specific topographical distribution of the effect, they have
been sometimes attributed to lexico-semantic processing and sometimes not. According to the RHM, at early stages of second language acquisition, the link between L1 and L2 at the lexical level is stronger than at later stages, in which the L2 has established links with the conceptual system. Attending to this premise, we predict N400 effects associated with language switching in second language learners, which would indicate increased costs at the lexical level.
Moreno et al. (2002) carried out a preliminary evaluation of the impact of proficiency on the ERP correlates of code switching. They reported a regression analysis on ERP measures and participant scores in the Boston Naming Test (Kaplan, Goodglass, & Weintraub, 1983) performed in English and Spanish. They found that increased Spanish vocabulary was predictive of both smaller mean amplitudes and earlier peak latencies of the LPC responses to code switches. However, it is important to note that the group of participants included both English and Spanish dominant bilinguals; therefore, these data refer to proficiency in the switched language with respect to the base language, but not to the balance between L1 and L2. The present study has been designed to further explore the impact of L2 proficiency on the electrophysiological correlates of language switching, comparing two homogeneous groups of Spanish speakers (L1) with different levels of proficiency in English (L2).

Early Orthographic Effects During Language Switching
Language switching in reading could also affect lower level processing related to the orthographic characteristics of each language, and that processing will also be examined in the present study. A relevant topic in second language processing is cross-language competition during word recognition.
In addition to the described effects related to the ongoing integration of the switched words in the sentence context, some models propose that language selection of the input benefits from other sources of information, including the different orthographies of the languages (Dijkstra & Van Heuven, 2002). Detection of a language switch could take place at an early orthographic stage of word recognition that precedes meaning activation. Surprisingly, behavioral evidence has failed to support the role of orthography in language selection (e.g., Thomas & Allport, 2000). However, the time resolution of the ERP technique allows us to better test this prediction, looking at early effects associated with language switching. As we describe below, there are several ERP effects in the published literature that could index early detection of a change in orthographic form when a language switch is encountered.
Many prior studies have reported early ERP effects before the range of the N400 component, usually related to orthographic and phonological processing (see review in Barber & Kutas, 2007). These include effects of letter transposition (Carreiras, Vergara, & Perea, 2009), effects of consonants and vowels when manipulated selectively (Carreiras, Duñabeitia, & Molinaro, 2009; Carreiras, Gillon-Dowens, Vergara, & Perea, 2009) and effects of phonological syllable frequency (Barber, Vergara, & Carreiras, 2004; Carreiras, Vergara, & Barber, 2005). Several studies have shown that the visual N1 component, peaking around 170 ms after word onset, is modulated at left occipital electrodes by orthographic variables (Maurer, Brandeis, & McCandliss, 2005). These so-called N170 effects associated with word reading have been linked to activity in the occipital temporal fusiform gyrus of the left hemisphere (Glezer, Jiang, & Riesenhuber, 2009; McCandliss, Cohen, & Dehaene, 2003; Price
and Devlin, 2003). At a slightly later latency, a negativity peaking at 250 ms with a central distribution (N250) has been observed in masked priming paradigms, and has also been claimed to reflect the mapping of sublexical information (e.g., ordered letter combinations) onto whole-word orthographic representations (Holcomb & Grainger, 2006). Interestingly, this N250 was found in response to language changes in two masked priming experiments when L2 words were preceded by unrelated L1 words (Chauncey, Grainger, & Holcomb, 2008), or by their L1 equivalent, although the time-course of this translation effect was somewhat later, peaking at 300 ms (Midgley, Holcomb, & Grainger, 2009b). In the ERP studies of code switching reviewed above, Moreno et al. (2002) did not report differences before the N400 time window in response to code switching, but Proverbio et al. (2004) described a modulation of the N1 component at left anterior electrodes. In the ERP studies of code switching in sentences reviewed above, neither Moreno et al. (2002) nor Proverbio et al. (2004) reported main effects of code switching before the N400 time window. However, Proverbio et al. (2004) described an interaction of semantic incongruence, code switching, and switching direction between 130 and 200 ms at some frontal electrodes, in which code switching from L2 to L1 produced more negative amplitudes, and only with semantically incongruent words. In our study, the activation of the orthographic and phonological L1 and L2 patterns should be less balanced in second language learners as compared to bilinguals fluent in both languages, and this circumstance should enhance orthographic effects in response to language switching either in the N170 or in the N250 components.

The Present Experiment
Proficiency is a critical variable in the study of second language processing and one that has not been consistently controlled or manipulated in previous ERP studies of bilingualism in general (Kotz, 2008), and language switching in particular (van Hell & Witteman, 2009). The main goal of the present study is to investigate how proficiency in a second language affects the ERP correlates of language switches when second language learners switch from English (L2) to Spanish (L1). Following the RHM, we hypothesize that, in contrast to the case of balanced bilinguals, switching across languages for second language learners can be understood more as lexical switching, because the connections between L1 and L2 words are still very strong, while connections between L2 lexical items and the conceptual system are still weak. Therefore, in contrast to Moreno et al. (2002), we predict an N400 effect in response to the code switches, and, in contrast with Proverbio et al. (2004), we expect to find this effect even when learners have to switch from their L2 to their L1. Moreover, if second language learners behave differently from balanced bilinguals in response to code switching, we can expect that the level of proficiency in L2 will modulate these differences. Therefore, we are also interested in the ERP changes that take place when competence in the second language increases.
We tested two different groups of learners with different levels of English (L2) proficiency living within a monolingual Spanish environment. They were all Spanish speakers attending English courses, but since in Spain books, films, and other productions are usually translated, dubbed, or voiced over in Spanish, they had little exposure to English via television or other means. Therefore, our participants do not use English in their daily life and are very far from the competence level of balanced bilinguals (those with similar skills in L1 and L2). The difference between
the two groups was limited to L2 proficiency, and there were no significant differences in environmental exposure to English or in language learning methods. All of the participants share a similar socio-linguistic background. Comparing language-switching processing in these two groups, which differ only in their amount of training, will give us some insights about the development of second language processing.

Method

Participants
To select high and low proficiency second language learners, we recruited 58 potential participants from three different language schools in Tenerife and administered an English aptitude test. The test, from the modern languages school of the University of La Laguna, Spain, contains 60 multiple-choice questions on vocabulary and grammar and yields a proficiency level of 1 to 4. Table 1 shows examples of the test questions. According to the standards of the Common European Framework of Reference for Languages published by the Council of Europe (2001), the international equivalent of these levels is as follows: Level 1 = A1 (Breakthrough); Level 2 = A2 (Waystage) and B1 (Threshold); Level 3 = B2 (Vantage); Level 4 = C1 (Effective Operational Proficiency) and C2 (Mastery). It is important to note, however, that the label "Mastery" is not intended to imply native-speaker or near native-speaker competence. Individuals scoring Level 2 and Level 4 were recruited to form the low and high proficiency groups, respectively. The 36 (22 male and 14 female) participants were aged 19 to 39 (mean age 27.4 years) and were all Spanish native speakers living in Spain. The participants were attending year-long English courses at the language schools when tested, and can therefore be considered active second language learners of English.
Although they reported a mean age of acquisition (AoA) of 8.5 years at school, note that this average AoA could be misleading because it refers to the first classroom instruction in the Spanish education system, which usually involves a very superficial contact with the language. Most important for this study was their level of processing a written text in English. Therefore, we focused on reading skills and interpretation of written English and, after the objective English test, we administered a self-rating of English ability (LEAP-Q by Blumenfeld & Kaushanskaya, 2007). Self-reports were on a scale of 1 to 10, where 1 was almost none and 10 like a native speaker. Table 2 shows the ratings of each group on these tests. After the experiment, the experimenter had a brief chat in English with the subjects asking them about their difficulties during the experiment and their experience with foreign languages. None of the participants reported frequent language switches in their everyday life; they only reported occasional language switches, especially in situations of explicit language learning. The participants also found it easier to read than to listen to or produce English (see Table 2). All participants had normal or corrected-to-normal eyesight and no neurological history, and were right-handed as assessed by a Spanish version of the Edinburgh inventory (Oldfield, 1971).

Table 1. Examples of the Questions Used to Assign Participants to the Two Groups
Selection Items of Proficiency Pre-test of English as L2 | Choices
Some people . . . Scotland speak a different language called Gaelic. | on - in - at
Would it . . . you if we came on Thursday? | agree - suit - like - fit
A building which was many . . . high was first called a skyscraper in the United States at the end of the 19th century. | stages - steps - storeys - levels
I find the times of English meals very strange - I'm not used . . . dinner at 6 pm. | to have - to having - having - have

Table 2. Characteristics of the Participants Assigned to the Two Groups, Regarding Their Level of English
                             Low proficiency   High proficiency   T-test (p)
Men:Women                    12:6              10:8               –
Age in years (SD)            27.3 (5.3)        27.5 (4.9)         ns
AoA in years (SD)            8.2 (2.7)         8.7 (3.4)          ns
English pre-test (1–4)       2                 4                  –
Self-rated speaking* (SD)    5.3 (1.6)         7.5 (1.3)          < .01
Self-rated hearing* (SD)     5.9 (1.6)         7.7 (1.5)          < .01
Self-rated reading* (SD)     6.8 (1.7)         8.6 (1.3)          < .01
Note: *Self-reports were on a scale of 1 to 10, where 1 was almost none and 10 was like a native speaker. AoA, age of acquisition; ns, not significant.

Stimuli
Stimuli were 160 English sentences of 9 to 12 words. All had a similar structure, namely, compound sentences that included a subordinate clause, starting with a noun phrase followed by either a participle phrase after the noun, or a clause introduced by a relative pronoun such as "that" or "which" (e.g., "The house that we rented was furnished and felt cozy"). Each sentence contained an adjective that could occur in English (no switch condition, 80 sentences) or in Spanish (switch condition, 80 sentences, e.g., "The house that we rented was amueblada and felt cozy"). The adjectives always referred to the first noun of the sentence and were semantically congruent with the rest of the sentence. Typical word order is not identical in Spanish and English, so that the word order used here was grammatically correct in both languages, although not always the most frequently used one. Adjectives with a similar orthographic or phonological form between languages (i.e., false friends, cognates, or homophones) were not included. Average frequency of the Spanish adjectives was 22 per million with an average word length of 7 letters and 2 orthographic neighbors, reported with BuscaPalabras (Davis & Perea, 2005). The English adjectives had a length of 4–10 letters, an average frequency of 64.95 (SD = 136.17) per million, and an average of 2 orthographic neighbors according to the Celex lexical database (Baayen, Piepenbrock, & Van Rijn, 1993).
However, these numbers must be interpreted with care since we cannot assume a high correlation with the frequency of use in the classroom environment. For this reason, all the sentences were checked by teachers of the language schools to make sure the participants were familiar with the vocabulary. The sentences were counterbalanced, so each adjective appeared in both conditions, thus creating two different lists.

Procedure
During online recording, each participant was seated in a soundproof, electrically shielded room at the University of La Laguna approximately 80 cm from a CRT computer monitor. Sentences were presented one word at a time in a grey-green lower case font
against a black background via Presentation software (Version 0.70, http://www.neurobs.com). Prior to each sentence, there was a centered "+" sign for 1000 ms and then a blank screen for 500 ms. Each word was visible for 300 ms with a blank screen of 200 ms between words. To create an onset asynchrony between sentences, a blank screen appeared after each sentence with a variable duration of 500–1000 ms. Participants were instructed to read for comprehension, to blink only when there were no words on the screen, to relax their muscles, and to move as little as possible. There were two breaks during the experiment. After the experiment in L2 reported here, the participants were engaged in an unrelated 10-min-long experiment in Spanish. The total length of the session was 2 h, including electrode setup. The session started with a short practice in the presence of the researcher. At the end of each sentence, the participant either pressed a button to continue or, for a third of the sentences, answered a yes/no comprehension question (e.g., "Did I rent a flat?" after "The house that we rented was furnished and felt cozy"). The questions were included to ensure that participants were reading the sentences for comprehension and were about the verb, the noun, or the adjective (one-quarter of the questions focused on the adjective). Half of the questions appeared after a sentence with a language switch and half after a sentence without a language switch. For the odd-numbered participants, the right hand was used to signal the "Yes" response and the left hand the "No" response, and for the even-numbered participants, the order was reversed.

EEG Recording and ERP Analyses
The electroencephalogram (EEG) was recorded with 27 Ag/AgCl electrodes embedded in an elastic cap (Easycap, www.easycap.de) referenced to the left mastoid.
Figure 1 shows a schematic representation of the electrode arrangement. Two pairs of electrodes above and below the right eye and on the outer canthi of
Figure 1. Schematic flat representation of the 27 electrode positions from which EEG activity was recorded. The electrodes analyzed in the ANOVAs are marked.
each eye registered vertical and horizontal eye movements (EOG). All electrical activity was recorded and amplified with a bandwidth of 0.01–100 Hz and a sampling rate of 500 Hz using battery-powered amplifiers (Brain Products, www.brainproducts.com). Impedance was equal to or less than 5 kΩ for all electrode sites except for the four eye channels, which were kept below 10 kΩ. EEG was stored and ERPs were later analyzed using BrainVision Analyzer 2.0 software (Brain Products). The offline filtering of the recordings consisted of a low cutoff filter of 0.1 Hz and a high cutoff of 30 Hz. Data were re-referenced to the algebraic mean of the right and the left mastoids. Blinks were corrected, following the procedure proposed by Gratton, Coles, and Donchin (1983), only in the recordings of the two participants who presented an excessive number of ocular artifacts. Artifacts were removed semi-automatically, with rejection values adjusted for each participant. This resulted in the exclusion of approximately 7% of the trials, which were evenly distributed across experimental conditions. The data were segmented relative to reference marker positions, from 100 ms before to 1000 ms after onset of the adjective. Baseline correction was performed using the average EEG activity in the 100 ms preceding word onset. Mean amplitudes were obtained for different time windows selected after visual inspection of the grand average waveforms: 200–300, 300–450, 450–650, and 650–850 ms. Since in the 200–300 ms time window there can be effects with different polarity that overlap with the onset of later effects, prefrontal (Fp1, Fp2, F3, and F4) and parieto-occipital (P7, P8, O1, and O2) electrodes were analyzed separately.
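The baseline correction and window averaging described above can be illustrated with a minimal sketch. It is not the authors' pipeline (the text states that BrainVision Analyzer was used); it only assumes a single-channel epoch sampled at 500 Hz from 100 ms before to 1000 ms after adjective onset, stored as a plain list of microvolt values. The function names and the half-open treatment of window boundaries are assumptions made for illustration.

```python
from statistics import mean

SAMPLING_RATE_HZ = 500   # sampling rate reported in the text
EPOCH_START_MS = -100    # epoch begins 100 ms before adjective onset
# Analysis windows reported in the text (ms after word onset)
WINDOWS_MS = [(200, 300), (300, 450), (450, 650), (650, 850)]


def ms_to_index(t_ms):
    """Convert a latency relative to word onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000)


def baseline_correct(epoch):
    """Subtract the mean of the 100 ms pre-onset interval from every sample."""
    baseline = mean(epoch[ms_to_index(-100):ms_to_index(0)])
    return [value - baseline for value in epoch]


def window_means(epoch):
    """Mean baseline-corrected amplitude per window.

    Assumption: windows are half-open, [start, end), so no sample is
    counted in two adjacent windows.
    """
    corrected = baseline_correct(epoch)
    return {
        (start, end): mean(corrected[ms_to_index(start):ms_to_index(end)])
        for start, end in WINDOWS_MS
    }


# Synthetic epoch: constant 2 µV offset, with a 3 µV negative deflection
# confined to the 300-450 ms window (samples 200 to 274).
epoch = [2.0] * 550
for i in range(ms_to_index(300), ms_to_index(450)):
    epoch[i] = -1.0

means = window_means(epoch)
```

In this synthetic case the constant offset is removed by the baseline, so only the 300–450 ms window carries a non-zero mean amplitude.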
Mean voltage amplitudes relative to the start of the critical adjective were subjected to an omnibus analysis of variance (ANOVA) with Proficiency (low, high) as a between-group factor, Switch (switch, no switch) as a within-subject factor, and two topographical factors: Hemisphere (left, right), and Anterior Posterior (AP) (more anterior, more posterior). For the analyses of the other time windows (300–450, 450–650, and 650–850 ms), we organized the data from 20 electrodes (F3, Fc1, C3, Cp1, P3, F7, Fc5, T7, Cp5, P7, F4, Fc2, C4, Cp2, P4, F8, Fc6, T8, Cp6, P8) into a grid-like scheme (see Figure 1) via
three topographic factors of Hemisphere, Distance to midline (DM) (one position from midline, two positions from midline), and an AP factor with five levels (frontal, frontal-central, central, central-parietal, parietal). Repeated measures ANOVAs included a Bonferroni correction to control for type I error in multiple comparisons, and all p values, mean squared errors (MSE), and partial eta squared values (η²p) are reported with the Greenhouse-Geisser correction. When the sphericity assumption was violated, we report the Greenhouse-Geisser epsilon (ε) used to correct the degrees of freedom. To show the relation of a switch to English and its distribution by group, we include a polynomial contrast for the only factor with five levels, AP. The reported main effects or interactions are limited to those related to the experimental condition Switch. For all statistical analyses, we used the program R (http://www.r-project.org).

Results
ERPs time-locked to the onset presentation of the critical word are shown in Figure 2, after averaging the data of all the participants for the two experimental conditions (language switching versus non-language switching), plotted in four representative electrodes. Figures 3 and 4 show the same grand averages in a larger set of electrodes and separately for the low and high English proficiency groups. At posterior sites, the P1 and N1 components, which have been associated with the processing of visual stimuli, are clearly visible. Consistent with previous reports on the perception of linguistic stimuli, N1 amplitude is asymmetric across lateral sites, larger over the left than the right hemisphere. Relative to no-language switches, language switches elicited an early negativity between 200 ms and 300 ms after word onset with a left occipito-temporal distribution (labeled as LO-N250 hereafter).
In addition, starting also at 200 ms and lasting until the end of the analyzed segment, a prefrontal positivity (Fp1 and Fp2) distinguished language switches from no-language switches. After 300 ms post-target word presentation, a centro-parietal negativity, with a duration of 150 ms and peaking around 400 ms, shows
Figure 2. Grand average waveforms from data of all participants for the two experimental conditions (switching versus no switching), plotted in four representative electrodes, in which the main differences can be appreciated: Frontal Positivity, Left-Occipital N250, N400, and Late Positive Complex (LPC).
Figure 3. Grand average waveforms of the two conditions (switching versus no switching) for the Low proficiency group.
more negative values for the language switch than the no-language switch condition. This negativity is followed by a positivity for language switches relative to no-language switches starting at 450 ms after critical word onset, with a duration of around 400 ms. Figure 5 shows the topographical distribution of these effects over the scalp, after subtracting ERP activity elicited by the correct sentences from that of the language-switching sentences. As mentioned above, differences between 200 and 300 ms after word onset are focal and localized in the left occipital-temporal electrodes. A second negativity between 300 and 450 ms shows a broader distribution maximum at parietal sites of the right hemisphere for the low proficiency group (upper panel), whereas for
the high proficiency group this effect is also maximum at the right-parietal sites but additionally visible at left-anterior sites (lower panel). Thus, the effect shows the typical N400 distribution in both groups, but additionally the distribution of the high proficiency group N400 extends to left frontal areas. The late positivity (between 450 and 850 ms), which is identified as an LPC, is divided into two different time windows. The first time window (between 450 and 650 ms) shows the early LPC with a broad anterior-posterior distribution, but maximum at the frontal sites for the low proficiency group, and at the posterior sites for the high proficiency group. The LPC continued in the second time window but now localized only over the posterior area in
Figure 4. Grand average waveforms of the two conditions (switching versus no switching) for the High proficiency group.
Figure 5. Topographical distribution of the language-switching effects by group (high and low proficiency) in the four analyzed temporal windows: 200–300, 300–450, 450–650, and 650–850 ms. Voltage maps were obtained for the averaged values of difference waves (switching minus no switching).
both groups. The difference waveforms in Figure 6 show that the changes in scalp distribution are due to the main differences between the effects in both groups: a) at frontal sites in the N400 time window (high proficiency group producing a larger negativity than the low proficiency group), and b) in the posterior sites in the first phase of the LPC (high proficiency group producing a larger positivity than the low proficiency group). Below, these observations about the various latency windows are tested to confirm their statistical significance.

Time Window Between 200 and 300 ms
The ANOVA (Proficiency × Switch × Hemisphere × AP) on the mean amplitudes of four occipito-parietal electrodes (O1, O2, P7, and P8) resulted in a main effect of Switch
(F(1,34) = 23.23; p < .001; MSE = 37.22; ηp² = 0.41) and an interaction between Switch and Hemisphere (F(1,34) = 11.51; p < .01; MSE = 5.37; ηp² = 0.25). This pattern of interaction confirms the specific topographic distribution of the effect, which is maximum at the left occipito-parietal electrodes. The frontal positivity for language switches, visible in the prefrontal electrodes, was analyzed with a similar ANOVA (Proficiency × Switch × Hemisphere × AP) including four frontal and prefrontal electrodes (Fp1, Fp2, F3, and F4). This ANOVA resulted in a two-way interaction between Switch and AP (F(1,34) = 12.71; p < .01; MSE = 6.01; ηp² = 0.27). The absence of interactions with the factor Proficiency in this time window shows that code switching affected ERPs independently of the level of L2 proficiency.
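The partial eta-squared values reported next to each F test can be recovered from the F statistic and its degrees of freedom, which offers a quick consistency check on the reported effect sizes. The sketch below assumes the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error); the function name is illustrative and is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    effect_term = f_value * df_effect
    return effect_term / (effect_term + df_error)


# F values and effect sizes reported for the 200-300 ms window, all with F(1,34).
reported = {
    "Switch (occipito-parietal)": (23.23, 0.41),
    "Switch x Hemisphere (occipito-parietal)": (11.51, 0.25),
    "Switch x AP (frontal/prefrontal)": (12.71, 0.27),
}

for label, (f_value, reported_eta) in reported.items():
    recovered = partial_eta_squared(f_value, 1, 34)
    print(f"{label}: recovered {recovered:.2f}, reported {reported_eta:.2f}")
```

Under this assumption the three recovered values agree with the reported ones to two decimals.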
Figure 6. Difference waveforms of the High proficiency group versus the Low proficiency group, obtained by subtracting the no switch condition from the switch condition.
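The difference waves and window-based mean amplitudes used throughout this section can be sketched as follows. This is a minimal illustration on synthetic data: the sampling rate, epoch start, array layout (channels by samples), and function names are assumptions made for the example and do not describe the authors' actual processing pipeline.

```python
import numpy as np

SFREQ_HZ = 250.0   # assumed sampling rate (hypothetical)
TMIN_MS = -100.0   # assumed epoch start relative to word onset (hypothetical)
WINDOWS_MS = [(200, 300), (300, 450), (450, 650), (650, 850)]


def epoch_times_ms(n_samples, sfreq=SFREQ_HZ, tmin_ms=TMIN_MS):
    """Time of each sample in milliseconds relative to word onset."""
    return np.arange(n_samples) * 1000.0 / sfreq + tmin_ms


def difference_wave(switch_erp, no_switch_erp):
    """Switch minus no-switch ERP, computed sample by sample."""
    return np.asarray(switch_erp) - np.asarray(no_switch_erp)


def mean_amplitude(erp, start_ms, end_ms):
    """Mean amplitude per channel within [start_ms, end_ms)."""
    erp = np.asarray(erp)
    times = epoch_times_ms(erp.shape[-1])
    mask = (times >= start_ms) & (times < end_ms)
    if not mask.any():
        raise ValueError("time window lies outside the epoch")
    return erp[..., mask].mean(axis=-1)


# Synthetic averaged ERPs: 4 channels, 250 samples (-100 to 896 ms).
rng = np.random.default_rng(0)
no_switch = rng.normal(0.0, 0.1, size=(4, 250))
switch = no_switch.copy()
times = epoch_times_ms(switch.shape[-1])
# Add an artificial 2-unit positivity to the switch condition at 450-650 ms.
switch[:, (times >= 450) & (times < 650)] += 2.0

diff = difference_wave(switch, no_switch)
for start, end in WINDOWS_MS:
    print(f"{start}-{end} ms:", np.round(mean_amplitude(diff, start, end), 2))
```

The per-window means produced this way are the kind of values that would then enter an ANOVA with the topographical factors.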
Time Window Between 300 and 450 ms
The ANOVA (Proficiency × Switch × Hemisphere × AP × DM) showed a main effect of Switch (F(1,34) = 5.50; p < .05; MSE = 62.03; ηp² = 0.14), and interaction effects of Switch with Hemisphere (F(1,34) = 6.34; p < .05; MSE = 15.79; ηp² = 0.16) and Switch with AP (F(4,136) = 9.81; p < .001; ε = 0.35; MSE = 23.20; ηp² = 0.22). There are also three-way interactions of Switch with Hemisphere with AP (F(8,272) = 16.90; p < .001; ε = 0.48; MSE = 6.32; ηp² = 0.33), and Proficiency with Switch with AP (F(8,272) = 5.60; p < .05; ε = 0.35; MSE = 13.24; ηp² = 0.14). A test for linear trends (using polynomial contrasts) reveals a significant linear trend for the interaction between Proficiency, Switch, and AP (F(1,34) = 6.38; p < .05; MSE = 17.57; ηp² = 0.16). This last interaction reflects the fact that the Switch effect was nearly the same at anterior as at posterior electrodes for the more proficient group, whereas the low proficient group showed a similar negativity only at posterior electrodes. See Table 3 for mean amplitudes of the Switch effect in the two groups, separately for six of the electrodes included in the analyses. In summary, both groups show differences between conditions in the N400 time window, but with the typical right posterior distribution for the low proficiency group, and a more widespread distribution for the high proficiency group. Note that the frontal positivity starting after 200 ms post-target word presentation also remains visible in this time window in frontal electrodes for the low proficiency group only. This positivity probably overlaps with the frontal negativity in the case of the high proficiency group.
Time Window Between 450–650 ms
The analyses of the amplitude means in this time window by ANOVA (Proficiency × Switch × Hemisphere × AP ×
DM) showed a main effect of Switch (F(1,34) = 35.37; p < .001; MSE = 793.02; ηp² = 0.51), interaction effects of Switch with AP (F(4,136) = 4.27; p < .05; ε = 0.35; MSE = 20.01; ηp² = 0.11), and Switch with DM (F(1,34) = 27.29; p < .001; MSE = 66.87; ηp² = 0.45). There are also three-way interactions of Switch with Hemisphere and AP (F(4,136) = 3.65; p < .05; ε = 0.61; MSE = 1.05; ηp² = 0.02), and Switch with AP and DM (F(4,136) = 4.68; p < .01; ε = 0.66; MSE = 1.45; ηp² = 0.12), as well as the four-way interaction of Switch with Hemisphere with AP with DM (F(4,136) = 7.11; p < .001; ε = 0.75; MSE = 3.55; ηp² = 0.17). These interactions of the factor Switch with the topographical factors are consistent with the larger amplitudes of the LPC at posterior electrodes, over the midline, and, in this time window, slightly lateralized to the left. Importantly, there was also a three-way interaction of Proficiency with Switch with
Table 3. Mean Differences (Switch Versus No Switch) for the Two Groups in Six Electrodes (FC1, FC2, C3, C4, CP1, CP2) in the Time Window of the N400 and the LPC
Time window   Proficiency   FC1     FC2     C3      C4      CP1     CP2
300–450 ms    Low            0.23    0.01   −0.41   −0.84   −0.56   −0.91
300–450 ms    High          −0.69   −0.74   −0.70   −0.84   −0.68   −0.89
450–650 ms    Low            2.26    1.84    1.65    1.10    1.92    1.46
450–650 ms    High           2.00    1.91    2.02    2.04    2.76    2.66
AP (F(4,136) = 5.18; p < .05; ε = 0.13; MSE = 24.25; ηp² = 0.26). Polynomial contrasts showed a significant linear relation between Proficiency, AP, and Switch (F(1,34) = 6.10; p < .05; MSE = 32.85; ηp² = 0.15). This three-way interaction is explained by between-group differences in the size of the switch effect at posterior sites. Although both groups show the LPC at both anterior and posterior sites, the high proficient group shows a larger LPC only at the posterior areas. See Table 3 for mean differences of both groups separately.
Time Window Between 650–850 ms
The ANOVA (Proficiency × Switch × Hemisphere × AP × DM) in this time window showed a main effect for Switch (F(1,34) = 38.37; p < .001; MSE = 550.36; ηp² = 0.53), and the interaction effects of Switch with Hemisphere (F(1,34) = 9.77; p < .01; MSE = 20.58; ηp² = 0.22), Switch with AP (F(4,136) = 12.22; p < .001; ε = 0.34; MSE = 61.88; ηp² = 0.26), and Switch with DM (F(1,34) = 24.14; p < .001; MSE = 55.31; ηp² = 0.42). Additionally, two triple interactions were revealed: Switch with Hemisphere with DM (F(1,34) = 9.38; p < .01; MSE = 3.22; ηp² = 0.22) and Switch with AP with DM (F(4,136) = 3.32; p < .05; ε = 0.60; MSE = 1.31; ηp² = 0.09). Linear contrasts reveal that code switching elicited a large positivity in this time window in comparison with no switches. These effects support the parietal distribution of the effect, which was larger around the midline and slightly lateralized to the right hemisphere. There is no interaction with the factor Proficiency, confirming the same magnitude and topographical distribution of this effect for both groups.
Discussion
This study investigated the processing of mixed language sentences in order to study ERP correlates of language switching in second language learners reading for comprehension. ERPs were obtained from Spanish speakers reading English sentences, half of which contained an adjective in Spanish in the middle of the sentences.
We also explored the influence of second language learner proficiency on processing language switching, by including two groups: high and low level of English (L2). The language switch manipulation resulted in a pattern of sequential effects in the different time windows analyzed. In the time window of 200 to 300 ms after target word onset, switching to Spanish elicited an early negativity with a left occipital distribution in comparison with the English adjectives. Also starting at 200 ms, a long-lasting prefrontal positivity distinguished switches from the single-language sentences. In the 300–450 ms time window, language switching yielded an N400 effect with a broad distribution but maximum at right centro-parietal scalp. Finally, between 450 and 850 ms, the language-switching condition showed a large positive waveform widely distributed and visible at almost all scalp sites. There were also group differences in the scalp distribution of some of the language-switching effects that depended on L2 proficiency. The participants in the high level group generated a left anterior negativity in addition to the N400 effect, and larger LPC amplitudes than the low proficient group only in the posterior areas. The first early negativity in response to language switching is visible between 200 and 300 ms, and one remarkable characteristic of this effect is that it is visible only at the left occipitotemporal electrodes. This focal scalp distribution is similar to that of the N170 response to words described in previous studies, which usually overlaps with and contributes to the visual N1
component (Maurer et al., 2005). Proverbio et al. (2004) reported ERP differences in the time range of the N1 component when code switching involved semantically incongruent words, but at frontal rather than posterior electrodes and only when switching from L1 to L2. In our data, the N1 component peaks at 170 ms and shows the classic leftward asymmetry attributed to orthographic processing (Nobre & McCarthy, 1994; Schendan, Giorgio, & Kutas, 1998). However, as language-switching effects do not start until 30 ms later and peak at 250 ms (see Figure 2), it is reasonable to speculate that this effect could reflect the activity of the same generator as the N170, and the delay in its onset would reflect subsequent processing after the initial low-level orthographic processing, i.e., the detection of, or switching to, different patterns of orthographic regularities associated with a particular language (Dijkstra & Van Heuven, 1998, 2002). In this regard, it should be noted that N170 effects are typically obtained in word list experiments, whereas in the present study words were embedded in sentence contexts. N250 effects with a central scalp distribution have also been associated with orthographic processing (Holcomb & Grainger, 2006), but it is difficult to establish a functional relationship with the earlier N170 because this component has only been obtained with masked priming paradigms. In summary, our LO-N250 shows a similar latency to the central N250 obtained in masked priming paradigms, and shares its left occipital scalp distribution with the N170 effects reported with single word reading. Both N170 and N250 effects have been attributed to initial orthographic processing. Therefore, the left-occipital N250 described in the present study could reflect the detection or activation of a different set of orthographic/phonological rules, when changing from one language to another.
This N250 takes place before the activation/integration of word meaning, which is consistent with the fact that the left-occipital N250 was independent of the later N400 effect and the level of proficiency of the participants. Simultaneous with the onset of the left-occipital N250 (around 200 ms), a positivity in response to language switching starts at the prefrontal electrodes (see Figure 2). The reverse polarity and common onset of the anterior and posterior differences could suggest that both effects are the reflection of a single dipole. However, while the posterior effect lasts for 200 ms and its distribution is asymmetric across hemispheres, the frontal one is not lateralized and persists for several hundreds of milliseconds. This discrepancy could be due to an overlap with other effects, or alternatively it could mean that they originate from different neural generators, and are therefore associated with different cognitive operations. For example, frontal positivities have been reported in other language-switching studies, and have been linked to the executive control system (Rodríguez-Fornells et al., 2006). However, a more plausible explanation is that this positivity is the early onset of the LPC, with its maximum amplitude peaking in the late time windows. The characteristics of the second negativity in response to language switching fit with a modulation of the classical N400 component. Semantically unexpected or difficult-to-integrate words in a given semantic context modulate this negativity between roughly 200 and 500 ms after target word onset with a right parietal distribution (Kutas & Federmeier, 2000). In the present experiment, meanings associated with the switched words should be as easy to integrate in the context as meanings of the nonswitched words, because they are mostly the same. The N400 component is also sensitive to bottom-up processes of word recognition, and correlates with the costs of meaning activation
(Barber & Kutas, 2007). For example, it is known that, without other constraints, lexical frequency inversely correlates with the N400 amplitude; the higher the frequency, the smaller the N400 (Barber et al., 2004; Van Petten & Kutas, 1990). Importantly, our results cannot be explained considering the frequency of use of the target words because participants switched from their L2 (less frequent words) to their L1 (with a higher subjective frequency). Proverbio et al. (2004) proposed that age of acquisition of L2 words, and not proficiency, was the key factor to explain their N400 effect when switching from L1 to L2, but again this explanation cannot be applied to our switching effects in the opposite direction. The most plausible account of the current result is that the N400 reflects the activation costs of the specific lexical forms in the less active language. This idea is consistent with models of bilingual processing that include separate lexicons with different levels of activation depending on the language in use, at least at some stages of second language learning. Converging evidence supporting the existence of separate lexicons with different access mechanisms in second language learners comes from a recent ERP study that reported larger N400 amplitudes for L1 words than L2 words in a block design without language switching (Midgley et al., 2009a). According to the RHM, in the first steps of second language acquisition, the lexicon of L2 does not have a direct link to the concept level and is therefore heavily dependent on the L1 lexicon. The link from L2 words to L1 words would be strong, facilitating the switching between languages at the lexical level, which is consistent with the N400 effect found in our study.
Negativities in this same time window but with left frontal distributions (LAN) have been linked to working memory load and syntactic parsing operations (Barber & Carreiras, 2005; Friederici et al., 1993; Kluender & Kutas, 1993). Moreno et al. (2002) reported a left anterior negativity but not an N400 effect in response to code switching with balanced bilinguals. They interpreted this negativity as unrelated to semantic processing, but merely reflecting a working memory load due to the integration of a Spanish word into an English context. The left frontal negativity found only for the high proficiency group in our study is consistent with the report of such negativity in balanced bilinguals. In other words, language-switching processing in our group of high proficient learners seems to be closer to that of the balanced bilinguals. This negativity, if related with the syntactic LAN effects, would reveal a higher influence of the L2 grammar in the integration of the switched words, and could reflect the difficulty of integrating the different grammatical rules of both languages. Language switches also elicited a late positivity (LPC) peaking around 600 ms post word onset. This late positivity was also found in the study of Moreno et al. (2002) with a posterior distribution and was sensitive to the vocabulary level in the switched language independently of the dominant language of the participants. In our data, the LPC is observed both at anterior and at posterior sites, and proficiency in L2 modulated it at posterior sites. The LPC is usually interpreted as a late appearance of the P300 component, which increases in response to unexpected events or features that are relevant for categorization of the stimuli (Donchin, 1981; Polich, 2007; Verleger, 1988). 
This ERP component has been proposed to reflect brain activity related to the updating of mental representations, but it is composed of subcomponents that can be elicited separately by specific stimulus and task conditions. The P3a and “novelty P300” usually show a central/frontal distribution and have been linked to
attentional processes triggered by the detection of unexpected stimuli. The parietal P3b is more sensitive to the relevance of the stimulus for the task, and has been related to context updating operations and subsequent memory storage (Polich, 2007). This model is consistent with the fact that in our data the frontal positivity starts at frontal sites as soon as the first orthographic mismatch is detected, and becomes maximum by the time that all lexical, semantic, and syntactic information is available. Additionally, while both groups showed similar effects at frontal areas, the magnitude of the posterior effect was larger when proficiency in L2 increased. This interaction between anterior and posterior positivities and the level of L2 proficiency suggests that our participants perceived language switches as rare or unexpected events in a similar way independently of their level of proficiency, but updating and integration processes differed depending on the level of competence in the second language. A different but not totally incompatible view might present this late positivity as being a syntactic P600, because the relation of the P600 and the LPC is still a matter of debate (Coulson, King, & Kutas, 1998; Osterhout & Hagoort, 1999). The LPC found in our study resembles the P600 that previous literature has described in response to syntactic violations or non-preferred syntactic structures (Barber & Carreiras, 2005; Barber, Salillas, & Carreiras, 2004; Carreiras, Salillas, & Barber, 2004). Two different time-phases of the P600 have also been proposed with different sensitivity to experimental manipulations: a P600a with a broad anterior-posterior distribution over the midline, followed by the P600b with a right posterior distribution (Barber & Carreiras, 2005). In a similar way, P600 effects have also been reported with frontal or posterior distributions depending on
experimental manipulations, but the exact cognitive meaning of these changes in the topographical distribution of the effect is still unclear (Filik, Sanford, & Leuthold, 2008; Kaan & Swaab, 2003). The enhancement of the late positivity at posterior sites in the high proficiency group, preceded by a left anterior negativity in the same group, can be interpreted as a LAN-P600 pattern. In a recent study, highly proficient late bilinguals showed similar LAN-P600 effects in response to syntactic agreement violations as native speakers, but the size of the P600 effect was larger for those agreement rules shared by L1 and L2 than those which were exclusive to L2 (Gillon-Dowens, Vergara, Barber, & Carreiras, 2009). Although the sentences in the present study did not contain syntactic violations per se, language switching could induce an interaction between incompatible grammatical rules in the two languages (e.g., gender agreement rules), leading to similar integration and reanalysis processes as those resulting from syntactic operations. Therefore, the effect of L2 proficiency found in the present study could indicate a greater implication of the L2 grammar in the processing of language switching in this group of learners as compared with those with lower levels of L2 competence. To sum up, in contrast to data that suggest that language-switching costs take place only at a decision-related stage (e.g., Moreno et al., 2002; Thomas & Allport, 2000), the present results show that, at least at some stages of second language learning, language switching during sentence reading affects several levels of processing, including early orthographic/phonological processing and lexico-semantic processing. In addition, the present results show that integration processes change with the amount of training and the level of competence in the second language, probably because L2 grammar begins to play a role.
REFERENCES
Baayen, H., Piepenbrock, R., & Van Rijn, H. (1993). The celex lexical database (CD-rom). University of Pennsylvania, PA: Linguistic Data Consortium.
Bahlmann, J., Gunter, T. C., & Friederici, A. D. (2006). Hierarchical and linear sequence processing: An electrophysiological exploration of two different grammar types. Journal of Cognitive Neuroscience, 18, 1829–1842.
Barber, H., & Carreiras, M. (2005). Grammatical gender and number agreement in Spanish: An ERP comparison. Journal of Cognitive Neuroscience, 17, 137–153.
Barber, H., & Kutas, M. (2007). Interplay between computational models and cognitive electrophysiology in visual word recognition. Brain Research Reviews, 53, 98–123.
Barber, H., Vergara, M., & Carreiras, M. (2004). Syllable-frequency effects in visual word recognition: Evidence from ERPs. NeuroReport, 15, 545–548.
Barber, H., Salillas, E., & Carreiras, M. (2004). Gender or genders agreement? In Ch. Clifton & M. Carreiras (Eds.), On-line study of sentence comprehension; Eyetracking, ERP and beyond (pp. 309–328). London, UK: Psychology Press.
Blumenfeld, M. V., & Kaushanskaya, M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech Language and Hearing Research, 50, 940–967.
Brysbaert, M., & Duyck, W. (in press). Is it time to leave behind the Revised Hierarchical Model of bilingual language processing after 15 years of service? Bilingualism: Language and Cognition.
Carreiras, M., Duñabeitia, J. A., & Molinaro, N. (2009). Consonants and vowels contribute differently to visual word recognition: ERPs of relative position priming. Cerebral Cortex, 19, 2659–2670.
Carreiras, M., Gillon-Dowens, M., Vergara, M., & Perea, M. (2009). Are vowels and consonants processed differently? ERP evidence with a delayed letter paradigm. Journal of Cognitive Neuroscience, 21, 275–288.
Carreiras, M., Salillas, E., & Barber, H. (2004). Event related potentials elicited during parsing of ambiguous relative clauses in Spanish. Cognitive Brain Research, 20, 98–105.
Carreiras, M., Vergara, M., & Barber, H. (2005). Early ERP effects of syllabic processing during visual word recognition. Journal of Cognitive Neuroscience, 17, 1803–1817.
Carreiras, M., Vergara, M., & Perea, M. (2009). ERP correlates of transposed-letter priming effects: The role of vowels vs. consonants. Psychophysiology, 46, 34–42.
Chauncey, K., Grainger, J., & Holcomb, P. J. (2008). Code-switching effects in bilingual word recognition: A masked priming study with event-related potentials. Brain and Language, 105, 161–174.
Coulson, S., King, J., & Kutas, M. (1998). Expect the unexpected: Event-related brain response to morphosyntactic violations. Language and Cognitive Processes, 13, 21–58.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press, Cambridge (8th printing, 2006).
Davis, C., & Perea, M. (2005). BuscaPalabras: A program for deriving orthographic and phonological neighborhood statistics and other psycholinguistic indices in Spanish. Behavior Research Methods, 37, 665–671.
Dijkstra, A. F. J., & Van Heuven, W. J. B. (1998). The BIA-model and bilingual word recognition. In J. Grainger & A. M. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 189–225). Mahwah, NJ: Erlbaum.
Dijkstra, T., & Van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.
Donchin, E. (1981). Surprise! . . . Surprise? Psychophysiology, 18, 493–513.
Filik, R., Sanford, A. J., & Leuthold, H. (2008). Processing pronouns without antecedents: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 20, 1315–1326.
Friederici, A. D., Pfeifer, E., & Hahne, A. (1993). Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1, 183–192.
Friederici, A. D., Steinhauer, K., & Pfeifer, E. (2002). Brain signatures of artificial language processing: Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences, U.S.A., 99, 529–534.
Gillon-Dowens, M., Vergara, M., Barber, H., & Carreiras, M. (2009). Morpho-syntactic processing in late L2 learners. Journal of Cognitive Neuroscience, 22, 1870–1887.
Glezer, L. S., Jiang, X., & Riesenhuber, M. (2009). Evidence for highly selective neuronal tuning to whole words in the “visual word form area.” Neuron, 62, 161–162.
Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484.
Holcomb, P. J., & Grainger, J. (2006). On the time course of visual word recognition: An event-related potential investigation using masked repetition priming. Journal of Cognitive Neuroscience, 18, 1631–1643.
Kaan, E., & Swaab, T. Y. (2003). Electrophysiological evidence for serial sentence processing: A comparison between non-preferred and ungrammatical continuations. Cognitive Brain Research, 17, 621–635.
Kaplan, E., Goodglass, H., & Weintraub, S. (1983). Boston Naming Test. Philadelphia: Lea and Febiger.
Kluender, R., & Kutas, M. (1993). Bridging the gap: Evidence from ERPs on the processing of unbounded dependencies. Journal of Cognitive Neuroscience, 5, 196–214.
Kotz, S. A. (2008). A critical review of ERP and fMRI evidence on L2 syntactic processing. Brain and Language, 109, 68–74.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Kroll, J. F., & Tokowicz, N. (2005). Models of bilingual representation and processing: Looking back and to the future. In J. F. Kroll & A. M. de Groot (Eds.), Handbook of bilingualism: Psycholinguistic approaches (pp. 531–553). New York: Oxford University Press.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4, 463–470.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
Maurer, U., Brandeis, D., & McCandliss, B. D. (2005). Fast, visual specialization for reading in English revealed by the topography of the N170 ERP response. Behavioral and Brain Functions, 1, 13.
McCandliss, B. D., Cohen, L., & Dehaene, S. (2003). The visual word form area: Expertise for reading in the fusiform gyrus. Trends in Cognitive Sciences, 7, 293–299.
Midgley, K. J., Holcomb, P. J., & Grainger, J. (2009a). Language effects in second language learners and proficient bilinguals investigated with event-related potentials. Journal of Neurolinguistics, 22, 281–300.
Midgley, K. J., Holcomb, P. J., & Grainger, J. (2009b). Masked repetition and translation priming in second language learners: A window on the time-course of form and meaning activation using ERPs. Psychophysiology, 46, 551–565.
Molinaro, N., Conrad, M., Barber, H. A., & Carreiras, M. (2010). On the functional nature of the N400: Contrasting effects related to visual word recognition and contextual semantic integration. Cognitive Neuroscience, 1, 1–7.
Moreno, E., Federmeier, K., & Kutas, M. (2002). Switching languages, switching palabras (words): An electrophysiological study of code switching. Brain and Language, 80, 188–207.
Nobre, A. C., & McCarthy, G. (1994). Language-related ERPs: Scalp distributions and modulations by word type and semantic priming. Journal of Cognitive Neuroscience, 6, 233–255.
Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97–113.
Osterhout, L., & Hagoort, P. (1999). A superficial resemblance does not necessarily mean you are part of the family: Counterarguments to Coulson, King and Kutas (1998) in the P600/SPS–P300 debate. Language and Cognitive Processes, 14, 1–14.
Osterhout, L., McLaughlin, J., Pitkänen, I., Frenck-Mestre, C., & Molinaro, N. (2006). Novice learners, longitudinal designs, and event-related potentials: A paradigm for exploring the neurocognition of second-language processing. Language Learning, 56, 199–230.
Polich, J. (2007). Updating P300: An integrative theory of P3a and P3b. Clinical Neurophysiology, 118, 2128–2148.
Price, C. J., & Devlin, J. T. (2003). The myth of the visual word form area. NeuroImage, 19, 473–481.
Proverbio, A. M., Leoni, G., & Zani, A. (2004). Language switching mechanisms in simultaneous interpreters: An ERP study. Neuropsychologia, 42, 1636–1656.
Rodríguez-Fornells, A., de Diego Balaguer, R., & Münte, T. F. (2006). Executive functions in bilingual language processing. In M. Gullberg & P. Indefrey (Eds.), The Cognitive Neuroscience of Second Language Acquisition (pp. 133–190). Malden, MA: Blackwell.
Schendan, H. E., Giorgio, G., & Kutas, M. (1998). Neurophysiological evidence for visual perceptual categorization of words and faces within 150 ms. Psychophysiology, 35, 240–251.
Thomas, M. S. C., & Allport, A. (2000). Language switching costs in bilingual visual word recognition. Journal of Memory and Language, 43, 44–66.
van Hell, J. G., & Witteman, M. J. (2009). The neurocognition of switching between languages: A review of electrophysiological studies. In L. Isurin, D. Winford, & K. de Bot (Eds.), Multidisciplinary approaches to code switching (pp. 53–84). Amsterdam: John Benjamins.
van Hell, J. G., & Tokowicz, N. (2010). Event-related brain potentials and second language learning: Syntactic processing in late L2 learners at different L2 proficiency levels. Second Language Research, 26, 43–74.
Van Petten, C., & Kutas, M. (1990). Interactions between sentence context and word frequency in event-related brain potentials. Memory and Cognition, 18, 380–393.
Verleger, R. (1988). Event-related potentials and cognition: A critique of the context updating hypothesis and an alternative interpretation of P3. Behavioural Brain Sciences, 11, 343–356.
(Received July 12, 2009; Accepted January 14, 2010)
Psychophysiology, 48 (2011), 55–63. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01042.x
When a child errs: The ERN and the Pe complex in children
YAEL ARBEL (a) and EMANUEL DONCHIN (b)
(a) Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida, USA
(b) Department of Psychology, University of South Florida, Tampa, Florida, USA
Abstract We report an analysis of the componential structure of the event-related potentials (ERPs) elicited when 8–10-year-old children err. We demonstrated previously that the positive deflection that follows the error-related negativity (ERN) in young adults is a combination of two ERP components, a fronto-central positive component and a P300. As these findings affect the interpretation of error-related ERP data, it is essential to determine if the componential structure of the ERPs elicited by children’s errors is similar to that found in young adults. The results of the current study confirm that, as is the case in adults, both an ERN and a fronto-central positivity are elicited by errors committed by children. In contrast to what has been previously found in adults, errors committed by children elicited a central positivity in addition to a parietal negativity that was elicited by correct responses. Descriptors: EEG/ERP, Children/infants, Error processing, Principal component analysis, ERN, Pe
Address correspondence to: Yael Arbel, Department of Communication Sciences and Disorders, University of South Florida, PCD 4008, 4202 E. Fowler Ave., Tampa, FL 33620, USA. E-mail: yarbel@mail.usf.edu or [email protected]
A growing number of researchers study the event-related potentials (ERPs) associated with the commission of errors by children (e.g., Davies, Segalowitz, & Gavin, 2004; Hogan, Vargha-Khadem, Kirkham, & Baldeweg, 2005; Kim, Iwaki, Imashioya, Uno, & Fujita, 2007; Santesso, Segalowitz, & Schmidt, 2006; Segalowitz & Davies, 2004; Wiersema, van der Meere, & Roeyers, 2007). In these reports, much attention is given to the error-related negativity (ERN), a well-studied component whose attributes have been examined under different conditions and whose ties to error processing have been well established (e.g., Coles, Scheffers, & Holroyd, 2001; Falkenstein, Hohnsbein, Hoormann, & Blanke, 1990; Gehring, Goss, Coles, Meyer, & Donchin, 1993; Holroyd & Coles, 2002; Holroyd, Dien, & Coles, 1998; van Veen & Carter, 2002). These studies often report the elicitation of an “error positivity” (Pe), a component that is assumed to be related to error processing (e.g., Boksem, Tops, Wester, Lorist, & Meijman, 2006; Falkenstein, Hoormann, Christ, & Hohnsbein, 2000; Falkenstein, Willemssen, Hohnsbein, & Hielscher, 2005; Ullsperger & von Cramon, 2006), although its specific ties to error processing are yet to be firmly established. Although the functional significance of the ERN is a matter of some dispute, there is much agreement that it is a product of an error-processing system associated with the executive control processes and mediated by frontal areas of the brain. Support for this assumption is provided by converging evidence from studies using source localization techniques, which identify the anterior cingulate cortex (ACC) as the source of the ERN (e.g., Dehaene, Posner, & Tucker, 1994; Holroyd et al., 1998; Ladouceur, Dahl, & Carter, 2007; van Veen & Carter, 2002; Vlamings, Jonkman, Hoeksma, van Engeland, & Kemner, 2008). Involvement of the ACC in the processing of errors has also been supported by functional neuroimaging studies (e.g., Carter et al., 1998; Critchley, Tang, Glaser, Butterworth, & Dolan, 2005; Kiehl, Liddle, & Hopfinger, 2000; Mars et al., 2005; Mathalon, Whitfield, & Ford, 2003; Menon, Adleman, White, Glover, & Reiss, 2001). The growing interest in the error-related ERP components elicited in children is motivated largely by evidence that the ability to complete tasks requiring executive control improves with age (e.g., Bunch, Andrews, & Halford, 2007; Huizinga, Dolan, & van der Molen, 2006; Luna, Garver, Urban, Lazar, & Sweeney, 2004; Simonds, Kieras, Rueda, & Rothbart, 2007; Zelazo, Muller, Frye, & Marcovitch, 2003), coupled with evidence that the ACC, which is associated with executive control and performance monitoring, undergoes considerable maturational changes from childhood into early adulthood (e.g., Adleman et al., 2002; Casey et al., 1997; Cunningham, Bhattacharyya, & Benes, 2002; Rubia, Smith, Taylor, & Brammer, 2007). The developmental changes found in the amplitude of the ERN (e.g., Davies et al., 2004; Ladouceur et al., 2007; Santesso et al., 2006; Segalowitz & Davies, 2004; Wiersema et al., 2007) are consistent with the current theory of the developmental nature of executive control functions and the evidence of a late maturation of the ACC. Segalowitz and Davies, who pioneered the study of ERN in children, reported that, whereas the ERN amplitude increases with age, the magnitude of the Pe elicited in children is not different from that elicited in adults. This pattern was confirmed by other investigators (e.g., Hogan et al., 2005; Santesso et al., 2006; Wiersema et al., 2007). The common interpretation of
these results is that, whereas the ERN represents a developing function, the Pe reflects a different cognitive function that matures earlier. Arbel and Donchin’s (2009) findings that the so-called Pe is actually a combination of two components, a P300 and a fronto-central positivity, call for a reexamination of the positive-going ERP components elicited by children’s errors. Separating the two positive-going components becomes crucial in the study of error processing in children, as the ERN and Pe are not only a means of studying maturation of error processing but also a vehicle for understanding various disorders. The error-related ERP components have been studied in children with various disorders, particularly those involving impaired executive functions, such as attention-deficit/hyperactivity disorder (ADHD) (e.g., Albrecht et al., 2008; Burgio-Murphy et al., 2007; Jonkman, van Melis, Kemner, & Markus, 2007; Ladouceur, Dahl, Birmaher, Axelson, & Ryan, 2006; Liotti, Pliszka, Perez, Kothmann, & Woldorff, 2005; Wiersema, van der Meere, & Roeyers, 2005; Zhang, Wang, Cai, & Yan, 2009) and autism (e.g., Groen et al., 2008; Vlamings et al., 2008). Table 1 summarizes some of the recent reports in which the ERN and the Pe were examined in children and adolescents in comparison to adults and to children with developmental disorders. In these reports, the ERN is presented as reflecting early error processing, whereas the so-called Pe is interpreted as a manifestation of “later error monitoring processes associated with subjective/conscious evaluation of errors” (Wiersema et al., 2005, p. 372). It is evident from the studies reported above that, although it is often mentioned that the functional significance of the so-called Pe is still to be elucidated, the Pe is treated as a distinct component that is uniquely related to error processing.
The amplitude of this component at different ages and in different disorders is being used to develop, support, or reject theories about the error processing abilities of the investigated individuals. Because no componential analysis was performed in any of the above-mentioned studies, no separation between possibly overlapping components was achieved. In light of Arbel and Donchin’s (2009) finding that the alleged Pe component is composed of two different components when elicited by errors committed by young adults, the purpose of the present report is to elucidate the componential structure of the ERP components that follow error commission in children. To achieve this goal, principal component analysis (PCA) was applied to a data set obtained from children aged 8–10 performing a flanker task to determine the temporal and spatial characteristics of the positive deflection(s) following the ERN. This study is the first to utilize PCA to analyze the componential structure of error-related ERPs in children. It is important to note that the reported study was not designed to examine developmental processes, although its results may serve as a basis for such evaluation.
Methods

Participants

Seventeen children (7 girls; 10 boys) participated in this experiment. The participants, ranging in age from 8 to 10 years (M = 8 years 10 months, SD = 9 months), received payment for their participation. All tests were performed in the Cognitive Psychophysiology Laboratory in the Psychology Department at the University of South Florida. Written consent to participate was given by participants and their parent(s). Children were right-handed, free of medications, and with no history of neurological deficits according to parental report. Parents completed the National Initiative for Children’s Healthcare Quality (NICHQ) Vanderbilt Assessment Scale, Parent Informant, to determine ADD/ADHD symptoms. None of the children in the sample scored within the clinical range on this assessment scale.

Task and Procedure

Overt responses, as well as electrophysiological measures, were recorded while participants performed the Eriksen flanker task (Eriksen & Eriksen, 1974). Stimulus presentation and response recording were controlled using E-Prime software (Version 1.2, http://www.pstnet.com/products/e-prime/). Participants sat in front of a 17-in. computer monitor and were required to respond by pressing a left or right button on a serial response box according to the identity (H or S) of the letter at the center of a five-letter array. In two arrays, all five letters were identical (SSSSS, HHHHH). In the other two arrays, the central letter was different from the other four (SSHSS, HHSHH). Arrays in which all five letters are the same are referred to as compatible, whereas the two arrays in which the central letter is different from the other four are considered incompatible. The four different letter arrays were presented in a random sequence and with equal probabilities. A fixation appeared on the screen for 500 ms, after which a five-letter array (stimulus) was presented for 200 ms. Response time was limited to 1000 ms. The response was followed by a blank screen for 800 ms. Participants were presented with 500 trials (10 blocks of 50 trials). They were instructed to respond as fast as they could and to try to avoid making errors. Participants were given a short break after the completion of each block. Electroencephalogram (EEG) epochs were time-locked to the participants’ responses. Epochs started 200 ms before the response (button press) and ended 600 ms following the response.
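The stimulus set described above can be sketched in a few lines of Python. This is only an illustration of the design, not the E-Prime script used in the study: the names `describe` and `make_session` are hypothetical, and drawing each trial independently is an assumption, since the text states only that the four arrays appeared in random order with equal probabilities.

```python
import random

# The four letter arrays described in the procedure.
ARRAYS = ("SSSSS", "HHHHH", "SSHSS", "HHSHH")


def describe(array):
    """Return the target letter and compatibility of one five-letter array."""
    target, flanker = array[2], array[0]
    return {"array": array, "target": target, "compatible": target == flanker}


def make_session(n_blocks=10, trials_per_block=50, seed=0):
    """Draw each trial independently with equal probability (an assumption)."""
    rng = random.Random(seed)
    return [
        [describe(rng.choice(ARRAYS)) for _ in range(trials_per_block)]
        for _ in range(n_blocks)
    ]


session = make_session()
n_trials = sum(len(block) for block in session)
n_incompatible = sum(not t["compatible"] for block in session for t in block)
print(n_trials, "trials,", n_incompatible, "incompatible")
```

With independent draws the number of incompatible trials varies around half of the 500 trials; a balanced design would instead shuffle a fixed list of arrays within each block.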
Table 1. Summary of Results of a Sample of Studies in Which ERN and Pe Were Measured in Children with Typical and Atypical Development

| Authors | Participants | Task | ERN amplitude | Pe amplitude | Spatial distribution of the Pe (a) |
|---|---|---|---|---|---|
| Hogan et al. (2005) | Adolescents, aged 12.2–18.8 (n = 12), vs. adults (n = 11), aged 18.8–22.1 | Two-choice response task & 4-choice response (4-CR) task | Smaller amplitude in adolescents (in the 4-CR task) | No difference between age groups | FCz (peak to peak) |
| Wiersema et al. (2007) | Typically developing children, aged 7–8 (n = 13), adolescents, aged 13–14 (n = 14), and young adults (n = 17) | Go/no-go | Amplitude increased with age | No change in amplitude as a function of age | Cz (maximal positive peak in the 200–400 ms postresponse range) |
| Eppinger et al. (2009) | Typically developing children, aged 10–12 (n = 17) vs. young adults (n = 18), aged 19–24 | Two-choice response time mapping task | No amplitude differences between children and adults | Larger amplitude in children than in adults | Cz (mean amplitude measured between 300 and 400 ms postresponse) |
| Santesso et al. (2006) | Typically developing 10-year-old children (n = 39) vs. young adults (n = 28) | Visual flanker task | Smaller amplitude in children | No difference in amplitude between adults and children | Cz (the second positive peak in the time window of 200–500 ms) |
| Davies et al. (2004) | Typically developing children and young adults (n = 171) aged 7–25 years | Visual flanker | Amplitude increased with age | No changes in amplitude as a function of age | Cz (peak to peak) |
| Ladouceur et al. (2007) | Early adolescents (n = 15) vs. late adolescents (n = 15) and adults (n = 16) | A variation of the visual flanker task (arrowheads) | Amplitude was enhanced in the late adolescents | No age differences in Pe amplitude | Cz, Pz (peak to peak) |
| Zhang et al. (2009) | Children with ADHD (n = 16), aged 7–11 years, vs. typically developing children with no ADHD (n = 16) & young adults, aged 21–37 (n = 15) | Go/no-go | No differences in amplitude | Smaller amplitude in children with ADHD | Cz, Pz (the largest positive wave before 400 ms) |
| Albrecht et al. (2008) | Children (aged 8–15) with ADHD (n = 22) vs. nonaffected siblings (n = 68) and age-matched peers (n = 18) | A variation of the visual flanker task (arrowheads) | Smaller amplitude in children with ADHD | No difference between ADHD and controls | Cz, Pz (mean amplitude 200–500 ms) |
| Burgio-Murphy et al. (2007) | Children (aged 7–13.5 years; n = 319) varying with respect to DSM-IV subtypes of ADHD and comorbid conditions: oppositional defiant disorder (ODD), reading disorders, and math disorders | 2 two-choice reaction time tasks | Smaller amplitude in children with ADHD and related disorders | No difference in Pe amplitude between groups | Maximal at Pz (mean voltage, relative to the pre-EMG baseline, for the interval 305–500 ms after EMG onset) |
| Vlamings et al. (2008) | Children with autism spectrum disorder (ASD; n = 17) vs. typically developing peers (n = 10) | Auditory decision task | Smaller amplitude in children with ASD | Smaller amplitude in children with ASD | Cz, Pz (mean area measured in the range of 250–500 ms postresponse) |
| Santesso et al. (2006) | Children (n = 37) divided into two groups based on rate of maternal-reported obsessive compulsive (OC) behaviors | Visual flanker task | Larger amplitude in children with higher rate of OC behaviors | Larger amplitude in children with higher rate of OC behaviors | Cz (second positive peak in the time window of 200–500 ms after an incorrect response) |
| Ladouceur et al. (2006) | Children diagnosed with an anxiety disorder (n = 9) vs. age-matched peers (n = 10) | A variation of the visual flanker task (arrowheads) | Larger amplitude in children with an anxiety disorder | No difference in Pe amplitude between groups | Pz (maximum positive peak amplitude in the window 100–500 ms following response onset) |

(a) Method of measurement is presented in parentheses.

EEG Recording Parameters

The Electrical Geodesics, Inc. (EGI) System 200 was used to acquire and analyze dense-array EEG data. The EEG was recorded using 128-channel Ag/AgCl electrode nets from EGI. EEG was continually recorded at a 250-Hz sampling rate using a 0.1–100-Hz bandpass. The electrode impedances were kept below 50 kΩ. The continuous EEG data were filtered using an offline 40-Hz low-pass filter. The filtered data were then segmented into epochs, each starting 200 ms before the onset of a response and ending 600 ms after the response onset. An algorithm developed by Gratton, Coles, and Donchin (1983) for off-line removal of ocular artifacts was used to correct for eye movements and blinks. Averages of the ocular-corrected epochs were calculated for each type of response (correct, incorrect) after subtraction of the first 100-ms preresponse baseline. The averaged EEG epochs were re-referenced to linked mastoids.

Data Analysis

Accuracy and reaction time data were collected from each participant. To analyze the EEG data, a spatiotemporal principal component analysis as described by Spencer, Dien, and Donchin (2001) was utilized. This analysis reduces the dimensionality of a large data set and disentangles overlapping ERP components. It allowed us to avoid making presumptions about the spatial and temporal distribution of the error-related components in children and to separate the possibly overlapping components, particularly the positive-going component that follows the ERN. The data set for the PCA comprised the ERPs,
for all participants, electrodes, and conditions, over epochs extending from 200 ms before a response onset to 600 ms following the response. With a sampling rate of 250 Hz, each epoch consisted of 200 time points. First a “spatial” PCA was performed, then a “temporal” PCA (Spencer et al., 2001). The spatial PCA was performed by computing the covariance between electrode sites across the time points of the averages for each of the response types and participants, yielding a set of spatial factors. In the next step of the analysis, the “factor scores” for each of the original participants, conditions, and electrodes were used to create “virtual ERPs” (Spencer et al., 2001), which were submitted to a temporal PCA, analyzing the covariance among time points for each of the spatial factors, response types, and participants. The resulting temporal factor scores for each spatial factor were used to measure the activity in the ERP with the morphology and scalp distributions of interest. For both spatial and temporal PCAs, the factors that were required to account for 95% of the variance in the input data set were retained for Varimax rotation. MATLAB version 7.4 (R2007a) was utilized to run the spatiotemporal PCA.

Results

Overt Responses

On average, children committed errors on 22.6% of the trials and made immediate corrections following 36% of their errors. Repeated-measures analyses of participants’ error rates and reaction times on incompatible versus compatible trials were computed to validate the flanker effect. The analysis of error rates revealed a main effect of compatibility, F(1,16) = 42.68, p < .0001, Cohen’s f = 1.58, validating that participants made more errors on incompatible versus compatible trials. Participants responded more slowly on incompatible (M = 580.96 ms, SD = 120.49) versus compatible (M = 542.09 ms, SD = 97.56) trials.
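Effect sizes in the Results are given as Cohen’s f, but the conversion from F is not stated. As a rough consistency check, the sketch below assumes f = sqrt(F / N) with N = 17 participants. This is an inference from the reported numbers, not a formula given by the authors, and the function name `cohens_f_from_F` is hypothetical. Under that assumption, the F values reported in the Results reproduce the published f values to within about 0.01.

```python
import math

N_PARTICIPANTS = 17  # sample size reported in the Participants section


def cohens_f_from_F(F, n=N_PARTICIPANTS):
    """Assumed conversion f = sqrt(F / N); not stated in the paper."""
    return math.sqrt(F / n)


# (label, reported F(1,16), reported Cohen's f), copied from the Results.
REPORTED = [
    ("error rate, compatibility", 42.68, 1.58),
    ("reaction time, compatibility", 10.75, 0.79),
    ("reaction time, response type", 62.17, 1.91),
    ("SF1-TF6 (ERN)", 24.07, 1.18),
    ("SF1-TF1 (fronto-central positivity)", 10.73, 0.79),
    ("SF2-TF4 (centro-parietal negativity)", 5.56, 0.57),
    ("SF3-TF4 (central positivity)", 4.61, 0.52),
]

for label, F, f_reported in REPORTED:
    f_check = cohens_f_from_F(F)
    print(f"{label}: reported f = {f_reported:.2f}, sqrt(F/N) = {f_check:.3f}")
```

For comparison, the conversion through partial eta squared, f = sqrt(F × df_effect / df_error), would give about 1.63 for the first effect rather than the reported 1.58.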
Repeated-measures analysis of reaction times revealed a main effect of compatibility, F(1,16) = 10.75, p = .004, Cohen’s f = 0.79. The data show that errors were made with shorter latencies (M = 501.97 ms, SD = 91.77) than correct responses (M = 594.78 ms, SD = 117.24). Repeated-measures analysis confirmed a main effect of response type, F(1,16) = 62.17, p < .0001, Cohen’s f = 1.91.

Electrophysiological Data

Visual inspection of the ERPs elicited by correct and incorrect responses (Figure 1) reveals fronto-central negativity with maximal amplitude at FCz and a positive deflection that becomes broader at Pz, both associated with incorrect responses. A parietal negative deflection, which is associated with correct responses, is also apparent.

Figure 1. Grand average ERPs elicited by correct (solid line) and incorrect responses (dashed line). Presented are ERPs recorded at fronto-central (FCz), central (Cz), and posterior (Pz) recording sites.

A more detailed examination of the data is provided by the spatiotemporal PCA. The analysis began with a spatial PCA to reduce the spatial dimensionality of the data set. Twenty-five spatial factors (SF) were retained for Varimax rotation. Each of the original ERPs was “filtered” by each of the rotated spatial-factor patterns, allowing the conversion of each ERP into as many distinct ERPs (also known as “virtual ERPs”) as there are spatial factors. The virtual ERPs were averaged across participants for each response type (correct and incorrect). From examination of the virtual ERPs of the 25 spatial factors, it appears that SF1 (fronto-central), SF2 (centro-parietal), and SF3 (central) display a clear difference between the ERPs elicited by correct responses and those elicited by errors. SF1 seems to capture
the ERN and a subsequent positive deflection (Figure 2). SF2 appears to capture a negative deflection that is elicited by correct responses (Figure 3), and SF3 (central) shows a positive deflection that is associated with error commission and a late negativity associated with correct responses (Figure 4).

Figure 2. Left: Factor loading map for SF1. Right: Virtual ERPs obtained from the factor scores of SF1. SF: spatial factor.

Figure 3. Left: Factor loading map for SF2. Right: Virtual ERPs obtained from the factor scores of SF2. SF: spatial factor.

Figure 4. Left: Factor loading map for SF3. Right: Virtual ERPs obtained from the factor scores of SF3. SF: spatial factor.

To disentangle temporally overlapping ERP activity, temporal PCA was conducted on the entire set of virtual ERPs across conditions, participants, and the 25 spatial factors, yielding eight temporal factors (TF; Figure 5), three of which seem to represent the time points in which the two experimental conditions differ in the virtual ERPs of SF1, SF2, and SF3. TF1 is maximal at around 450 ms following the response. TF4 loads highly in the 250–350-ms range of the epoch. TF6 (peaks at 40 ms) has the latency of an ERN component. Three ERP components associated with incorrect responses were identified: An ERN component was identified at SF1-TF6 (fronto-central, peaks at 40 ms), a positive-going component that
follows the ERN was observed at SF1-TF1 (fronto-central, with a latency of 450 ms), and a central positivity with a latency of about 300 ms was detected at SF3-TF4. An ERP component associated with correct responses was identified at SF2-TF4 (centro-parietal with a latency of 300 ms).

Figure 5. Factor loadings for each of the temporal factors (TF1–TF8) plotted as a function of the time point in the epoch.

To examine the extent to which an ERN was elicited, a repeated measures analysis of variance (ANOVA) with response type (correct, error) as a within-subject variable was performed
on the factor scores of SF1-TF6 (Figure 6). There was a significant response-type main effect, F(1,16) = 24.07, p = .0002, Cohen’s f = 1.18, substantiating that amplitude was more negative for error trials than for correct trials in the fronto-central recording sites around 40 ms after response.

To examine the fronto-central positive component, a repeated measures ANOVA with response type (correct, error) as a within-subject variable was performed on the factor scores of SF1-TF1 (Figure 7). There was a significant response-type main effect, F(1,16) = 10.73, p = .0048, Cohen’s f = 0.79, confirming that amplitude in the fronto-central recording sites at a latency of about 450 ms postresponse was more positive for error trials than for correct trials. The two fronto-central components, namely, the ERN and the fronto-central positivity, were previously reported to be elicited by errors committed by adults (e.g., Arbel & Donchin, 2009). However, the latency of the fronto-central positivity observed in children in the current report (around 450 ms postresponse) seems longer than that measured in adults (about 300 ms postresponse).

To examine the centro-parietal component, a repeated measures ANOVA with response type (correct, error) as a within-subject variable was performed on the factor scores of SF2-TF4 (Figure 8). Our data suggest that the amplitude of this ERP component was more negative for correct trials than for incorrect trials. These observations are supported by a significant response-type main effect, F(1,16) = 5.56, p = .036, Cohen’s f = 0.57. These results imply that the parietal component was driven more by correct responses than by errors. It is worth mentioning that this negativity associated with correct responses has been previously reported by Vidal, Hasbroucq, Grapperon, and Bonnet (2000) and labeled the N300.
In some reports in which this negative component is not reported or discussed, the published grand-averaged ERP waveforms elicited by correct responses show a clear negative-going deflection at Cz and Pz (e.g., Davies et al., 2004; Hogan et al., 2005; Ladouceur et al., 2007; Wiersema et al., 2007).

To examine the central positive component, a repeated measures ANOVA with response type (correct, error) as a within-subject variable was performed on the factor scores of SF3-TF4 (Figure 9). There was a significant response-type main effect, F(1,16) = 4.61, p = .049, Cohen’s f = 0.52, confirming that amplitude was more positive for error trials than for correct trials. It is unclear whether this component is a P300, as it is more central than the typical P300. In addition, the latency of this component appears shorter than what is expected from a P300. However, as the P300 is elicited by events, one needs to define the onset of the event to be able to estimate latency. It is possible that, had the EEG been time-locked to EMG onset instead of the button press, the latency of this component would have resembled that of a typical P300. It appears that this central positive-going ERP component is consistent with a novelty P3 (Courchesne, Hillyard, & Galambos, 1975), also known as the P3a. This component is reported to show maximal amplitude at Cz in children within the age range of those whose data were collected for the reported study. For example, Brinkman and Stauder (2008) reported a change in the topography of the late novelty P3 with age. Their data suggested that the amplitude of the novelty P3 was maximal at Fz and Cz in 5–7-year-old children, at Cz at the ages of 8–9 years, Cz and Pz at the ages of 10–12 years, and Fz in adults. There are similar reports of a central novelty P3 in children within the age range of the sample used in the current report (e.g., Gumenyuk et al., 2001; Gumenyuk, Korzyukov, Alho, Escera, & Näätänen, 2004; Määttä et al., 2005; Wetzel & Schröger, 2007).

Figure 6. The ERN component. From left to right: Virtual ERPs obtained from the factor scores of SF1 (fronto-central), temporal factor loadings of temporal factor 6 (maximal at 40 ms), factor scores of TF6 at SF1. SF: spatial factor; TF: temporal factor.

Figure 7. The fronto-central positive component. From left to right: Virtual ERPs obtained from the factor scores of SF1 (fronto-central), temporal factor loadings of temporal factor 1 (maximal at 450–500 ms), factor scores of TF1 at SF1. SF: spatial factor; TF: temporal factor.
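The two-step spatiotemporal PCA used throughout the Results (spatial PCA, virtual ERPs, then temporal PCA, with Varimax rotation of the factors needed to reach 95% of the variance) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the MATLAB code used in the study: the function names are hypothetical, factor scores are obtained by least squares, the demo uses 8 electrodes instead of 128, and random numbers stand in for the averaged ERPs.

```python
import numpy as np


def varimax(loadings, n_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(n_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() - criterion < tol:
            break
        criterion = s.sum()
    return loadings @ rotation


def pca_varimax(data, var_explained=0.95):
    """Covariance PCA on (observations x variables) data, then Varimax.

    Returns rotated loadings (variables x factors) and least-squares
    factor scores (observations x factors).
    """
    centered = data - data.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(evals)[::-1]
    evals, evecs = np.clip(evals[order], 0.0, None), evecs[:, order]
    cumulative = np.cumsum(evals) / evals.sum()
    n_keep = min(int(np.searchsorted(cumulative, var_explained)) + 1, len(evals))
    loadings = varimax(evecs[:, :n_keep] * np.sqrt(evals[:n_keep]))
    scores = centered @ np.linalg.pinv(loadings).T
    return loadings, scores


def spatiotemporal_pca(erps, var_explained=0.95):
    """erps has shape (participants, conditions, electrodes, time points)."""
    n_sub, n_cond, n_elec, n_time = erps.shape
    # Spatial step: electrodes are the variables; each time point of each
    # participant-by-condition average is one observation.
    spatial_in = erps.transpose(0, 1, 3, 2).reshape(-1, n_elec)
    sp_load, sp_scores = pca_varimax(spatial_in, var_explained)
    n_sf = sp_load.shape[1]
    # "Virtual ERPs": one time series per participant, condition, spatial factor.
    virtual = sp_scores.reshape(n_sub, n_cond, n_time, n_sf).transpose(0, 1, 3, 2)
    # Temporal step: time points are the variables.
    tm_load, tm_scores = pca_varimax(virtual.reshape(-1, n_time), var_explained)
    scores = tm_scores.reshape(n_sub, n_cond, n_sf, tm_load.shape[1])
    return sp_load, tm_load, scores


rng = np.random.default_rng(0)
n_sub, n_cond, n_elec = 17, 2, 8   # 8 electrodes keep the demo small
n_time = 800 * 250 // 1000         # 800-ms epoch sampled at 250 Hz
erps = rng.standard_normal((n_sub, n_cond, n_elec, n_time))
sp_load, tm_load, scores = spatiotemporal_pca(erps)
print(sp_load.shape, tm_load.shape, scores.shape)
```

The last array holds one score per participant, response type, spatial factor, and temporal factor; scores of this kind (for example, at SF1-TF6) are what the repeated measures ANOVAs in the Results compare between correct and error trials.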
Discussion

The present report provides a componential analysis of the ERPs elicited when children aged 8–10 commit errors in a speeded reaction time task. Our data suggest that errors committed by children elicit an ERN that is similar in morphology as well as spatial and temporal distribution to the ERN reported to be elicited by adults. The PCA applied to our data allowed the examination of other ERP components elicited by correct and incorrect responses. One such presumed component is the so-called Pe, which is typically measured at centro-parietal recording sites (Cz, Pz). The results of the present report suggest that the positive deflection that is elicited when children commit errors has a fronto-central scalp distribution and a latency of 350–450 ms following an incorrect response. The spatial distribution of this positive-going component is similar to that found by Arbel and Donchin (2009) in adults. As in adults, this positive component follows the ERN in time. However, the latency of this component found for children in the present report appears longer than that reported in adults. There is evidence from the P300 literature to suggest that latency may reflect the time it takes to assess/process information (e.g., Donchin, Gratton, Dupree, & Coles, 1988; Duncan-Johnson & Donchin, 1981; McCarthy & Donchin,
1981). The relatively long latency of the fronto-central positive component in children may suggest that this component is a manifestation of an evaluation process that may be related to the evaluation of the committed error. The extent to which this ERP component is uniquely related to the processing of errors is still to be elucidated.

Another positive-going component with a central scalp distribution and a latency of 250–300 ms after incorrect responses was observed. This component is different from the centro-parietal positive component described in adults both in spatial distribution and in latency. As no manipulations of the experimental conditions have been made in the present report, it is unclear whether this component is behaving as a P300 (i.e., whether it is sensitive to the same experimental conditions as the P300). It is possible that this central positive-going ERP component is a late novelty P3. The latency and scalp distribution of this component as it was elicited in the present report are consistent with those of the late novelty P3 reported to be elicited by children within the same age range of the reported sample. There are several possible accounts for the functional significance of the novelty P3. One suggestion is that the novelty P3 reflects involuntary switching of attention to deviant events (e.g., Escera, Alho, Schröger, & Winkler, 2000; Näätänen, 1990). Another suggestion is that the novelty P3 may reflect an inhibitory process triggered by a deviant event at the level of the executive control system (Goldstein, Spencer, & Donchin, 2002). If the elicited component is indeed a novelty P3, it is possible that each error committed by children is processed as a novel, deviant event.

Figure 8. The centro-parietal component. From left to right: Virtual ERPs obtained from the factor scores of SF2 (centro-parietal), temporal factor loadings of temporal factor 4 (maximal at 300 ms), factor scores of TF4 at SF2. SF: spatial factor; TF: temporal factor.

Figure 9. The central positive component. From left to right: Virtual ERPs obtained from the factor scores of SF3 (central), temporal factor loadings of temporal factor 4 (maximal at 300 ms), factor scores of TF4 at SF3. SF: spatial factor; TF: temporal factor.
Our data suggest that a negative-going component with a centro-parietal scalp distribution is elicited by correct responses at about 350 ms postresponse. A similar negative deflection is also reported to be elicited by correct responses made by adults (e.g., Arbel & Donchin, 2009; Vidal et al., 2000). In these reports, a centro-parietal positivity (P300) that is absent in our data is reported to be elicited by incorrect responses.

The results of the present report suggest that errors committed by children elicit some ERP components that are similar to those reported in the adult studies, namely, the ERN and the fronto-central positivity. Similar to what is reported in adults, correct responses elicited a parietal negative-going component. A P300 was not elicited by incorrect responses committed by children. However, a central positive-going component, which was associated with incorrect responses, was found. These results strengthen previous findings that the ERN and the positivity that follows it share the same spatial distribution (fronto-central), and that the positive deflection that follows the ERN is a combination of two separate ERP components. These findings challenge previous reports in which the positive component was treated as a single parietal component and question the reliability of the reported stability of this component from childhood to early adulthood. Reexamination of the fronto-central positive component among children of different ages could shed light on the functional significance of this component and on the developmental nature of the underlying process that is responsible for its elicitation.
REFERENCES
Adleman, N. E., Menon, V., Blasey, C. M., White, C. D., Warsofsky, I. S., Glover, G. H., et al. (2002). A developmental fMRI study of the Stroop color-word task. NeuroImage, 16, 61–75.
Albrecht, B., Brandeis, D., Uebel, H., Heinrich, H., Mueller, U. C., Hasselhorn, M., et al. (2008). Action monitoring in boys with Attention-Deficit/Hyperactivity Disorder, their nonaffected siblings, and normal control subjects: Evidence for an endophenotype. Biological Psychiatry, 64, 615–625.
Arbel, Y., & Donchin, E. (2009). Parsing the componential structure of post-error ERPs: A principal component analysis of ERPs following errors. Psychophysiology, 46, 1288–1298.
Boksem, M. A. S., Tops, M., Wester, A. E., Lorist, M. M., & Meijman, T. F. (2006). Error related ERP components and individual differences in punishment and reward sensitivity. Brain Research, 1101, 92–101.
Brinkman, M. J., & Stauder, J. E. (2008). The development of passive auditory novelty processing. International Journal of Psychophysiology, 70, 33–39.
Bunch, K. M., Andrews, G., & Halford, G. S. (2007). Complexity effects on the children’s gambling task. Cognitive Development, 22, 376–383.
Burgio-Murphy, A., Klorman, R., Shaywitz, S. E., Shaywitz, J. M., Marchione, K. E., Holahan, J., et al. (2007). Error-related event-related potentials in children with attention-deficit hyperactivity disorder, oppositional defiant disorder, reading disorder, and math disorder. Biological Psychology, 75, 75–86.
Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., & Cohen, J. D. (1998). Anterior cingulate cortex, error detection, and the online monitoring of performance. Science, 280, 747–749.
Casey, B. J., Trainor, R. J., Orendi, J. L., Schubert, A. B., Nystrom, L. E., Giedd, J. N., et al. (1997). A developmental functional MRI study of prefrontal activation during performance of a go–no-go task. Journal of Cognitive Neuroscience, 9, 835–847.
Coles, M. G. H., Scheffers, M. K., & Holroyd, C. B. (2001). Why is there an ERN/Ne on correct trials? Response representations, stimulus-related components, and the theory of error-processing. Biological Psychology, 56, 173–189.
Courchesne, E., Hillyard, S. A., & Galambos, R. (1975). Stimulus novelty, task relevance, and the visual evoked potential in man. Electroencephalography and Clinical Neurophysiology, 39, 131–143.
Critchley, H. D., Tang, J., Glaser, D., Butterworth, B., & Dolan, R. J. (2005). Anterior cingulate activity during error and autonomic response. NeuroImage, 27, 885–895.
Cunningham, M. G., Bhattacharyya, S., & Benes, F. M. (2002). Amygdalo-cortical sprouting continues into early adulthood: Implications for the development of normal and abnormal function during adolescence. Journal of Comparative Neurology, 453, 116–130.
Davies, P. L., Segalowitz, S. J., & Gavin, W. J. (2004). Development of response-monitoring ERPs in 7- to 25-year-olds. Developmental Neuropsychology, 25, 355–376.
Dehaene, S., Posner, M. I., & Tucker, D. M. (1994). Localization of a neural system for error detection and compensation. Psychological Science, 5, 303–305.
Donchin, E., Gratton, G., Dupree, D., & Coles, M. G. H. (1988). After a rash action: Latency and amplitude of the P300 following fast guesses. In G. C. Galbraith, M. L. Kietzman, & E. Donchin (Eds.), Neurophysiology and psychophysiology: Experimental and clinical applications (pp. 173–188). Hillsdale, NJ: Erlbaum.
Duncan-Johnson, C., & Donchin, E. (1981). The relation of P300 latency to reaction time as function of expectancy. In H. H. Kornhuber & L. Deecke (Eds.), Motivation, motor and sensory processes of the brain: Electrical potentials, behavior and clinical use. Progress in Brain Research (pp. 717–722). Amsterdam: Elsevier-North Holland.
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters on the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16, 143–149.
Escera, C., Alho, K., Schröger, E., & Winkler, I. (2000). Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiology and Neuro-Otology, 5, 151–166.
Falkenstein, M., Hohnsbein, J., Hoormann, J., & Blanke, L. (1990). Effects of errors in choice reaction tasks on the ERP under focused and divided attention. In C. H. M. Brunia, A. W. K. Gaillard, & A. Kok (Eds.), Psychophysiological brain research (pp. 192–195). Tilburg, the Netherlands: Tilburg University Press. Falkenstein, M., Hoormann, J., Christ, S., & Hohnsbein, J. (2000). ERP components on reaction errors and their functional significance: A tutorial. Biological Psychology, 51, 87–107. Falkenstein, M., Willemssen, R., Hohnsbein, J., & Hielscher, H. (2005). Error processing in Parkinson’s disease: The error positivity (Pe). Journal of Psychophysiology, 19, 305–310. Gehring, W. J., Goss, B., Coles, M. G. H., Meyer, D. E., & Donchin, E. (1993). A neural system for error detection and compensation. Psychological Science, 4, 385–390. Goldstein, A., Spencer, K. M., & Donchin, E. (2002). The influence of stimulus deviance and novelty on the P300 and novelty P3. Psychophysiology, 39, 781–790. Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484. Groen, Y., Wijers, A. A., Mulder, L. J. M., Waggeveld, B., Minderaa, R. B., & Althaus, M. (2008). Error and feedback processing in children with ADHD and children with autistic spectrum disorder: An EEG event-related potential study. Clinical Neurophysiology, 119, 2476–2493. Gumenyuk, V., Korzyukov, O., Alho, K., Escera, C., & Näätänen, R. (2004). Effects of auditory distraction on electrophysiological brain activity and performance in children aged 8–13 years. Psychophysiology, 41, 30–36. Gumenyuk, V., Korzyukov, O., Alho, K., Escera, C., Schröger, E., Ilmoniemi, R. J., et al. (2001). Brain activity index of distractibility in normal school-age children. Neuroscience Letters, 314, 147–150. Hogan, A. M., Vargha-Khadem, F., Kirkham, F. J., & Baldeweg, T. (2005).
Maturation of action monitoring from adolescence to adulthood: An ERP study. Developmental Science, 8, 525–534. Holroyd, C. B., & Coles, M. G. (2002). The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679–709. Holroyd, C. B., Dien, J., & Coles, M. G. H. (1998). Error-related scalp potentials elicited by hand and foot movements: Evidence for an output independent error processing system in humans. Neuroscience Letters, 242, 65–68.
Huizinga, M., Dolan, C. V., & van der Molen, M. W. (2006). Age-related change in executive function: Developmental trends and a latent variable analysis. Neuropsychologia, 44, 2017–2036. Jonkman, L. M., van Melis, J. J. M., Kemner, C., & Markus, C. R. (2007). Methylphenidate improves deficient error evaluation in children with ADHD: An event-related brain potential study. Biological Psychology, 76, 217–229. Kiehl, K. A., Liddle, P. F., & Hopfinger, J. B. (2000). Error processing and the rostral anterior cingulate: An event-related fMRI study. Psychophysiology, 37, 216–223. Kim, E. Y., Iwaki, N., Imashioya, H., Uno, H., & Fujita, T. (2007). Error-related negativity in a visual go/no-go task: Children vs. adults. Developmental Neuropsychology, 31, 181–191. Ladouceur, C. D., Dahl, R. E., Birmaher, B., Axelson, D. A., & Ryan, N. D. (2006). Increased error-related negativity (ERN) in childhood anxiety disorders: ERP and source localization. Journal of Child Psychology and Psychiatry, 47, 1073–1082. Ladouceur, C. D., Dahl, R. E., & Carter, C. S. (2007). Development of action monitoring through adolescence into adulthood: ERP and source localization. Developmental Science, 10, 874–891. Liotti, M., Pliszka, S., Perez, R., Kothmann, D., & Woldorff, M. (2005). Abnormal brain activity related to performance monitoring and error detection in children with ADHD. Cortex, 42(3), 1–12. Luna, B., Garver, K. E., Urban, T. A., Lazar, N. A., & Sweeney, J. A. (2004). Maturation of cognitive processes from late childhood to adulthood. Child Development, 75, 1357–1372. Määttä, S., Saavalainen, P., Könönen, M., Pääkkönen, A., Muraja-Murro, A., & Partanen, J. (2005). Processing of highly novel auditory events in children and adults: An event-related potential study. NeuroReport, 16, 1443–1446. Mars, R. B., Coles, M. G. H., Grol, M. J., Holroyd, C. B., Nieuwenhuis, S., Hulstijn, W., et al. (2005). Neural dynamics of error processing in medial frontal cortex.
NeuroImage, 28, 1007–1013. Mathalon, D. H., Whitfield, S. L., & Ford, J. M. (2003). Anatomy of an error: ERP and fMRI. Biological Psychology, 64, 119–141. McCarthy, G., & Donchin, E. (1981). A metric for thought: A comparison of P300 latency and reaction time. Science, 211, 77–80. Menon, V., Adleman, N. E., White, C. D., Glover, G. H., & Reiss, A. L. (2001). Error-related brain activation during a go/nogo response inhibition task. Human Brain Mapping, 12, 131–143. Näätänen, R. (1990). The role of attention in auditory information processing as revealed by event-related potentials and other measures of cognitive function. Behavioral and Brain Sciences, 13, 201–288. Rubia, K., Smith, A. B., Taylor, E., & Brammer, M. (2007). Linear age-correlated functional development of right inferior fronto-striato-cerebellar networks during response inhibition and anterior cingulate during error-related processes. Human Brain Mapping, 28, 1163–1177. Santesso, D. L., Segalowitz, S. J., & Schmidt, L. A. (2006). Error-related electrocortical responses in 10-year-old children and young adults. Developmental Science, 9, 473–481. Segalowitz, S. J., & Davies, P. L. (2004). Charting the maturation of the frontal lobe: An electrophysiological strategy. Brain and Cognition, 55, 116–133. Simonds, J., Kieras, J. E., Rueda, M. R., & Rothbart, M. K. (2007). Effortful control, executive attention, and emotional regulation in 7–10-year-old children. Cognitive Development, 22, 474–488. Spencer, K., Dien, J., & Donchin, E. (2001). Spatiotemporal analysis of the late ERP responses to deviant stimuli. Psychophysiology, 38, 343–358. Ullsperger, M., & von Cramon, D. Y. (2006). The role of intact frontostriatal circuits in error processing. Journal of Cognitive Neuroscience, 18, 651–664. van Veen, V., & Carter, C. S. (2002). The timing of action-monitoring processes in the anterior cingulate cortex. Journal of Cognitive Neuroscience, 14, 593–602. Vidal, F., Hasbroucq, T., Grapperon, J., & Bonnet, M. (2000). Is the ‘error negativity’ specific of errors? Biological Psychology, 51, 109–128. Vlamings, P. H. J. M., Jonkman, L. M., Hoeksma, M. R., van Engeland, H., & Kemner, C. (2008). Reduced error monitoring in children with autism spectrum disorder: An ERP study. European Journal of Neuroscience, 28, 399–406.
Wetzel, N., & Schröger, E. (2007). Modulation of involuntary attention by the duration of novel and pitch deviant sounds in children and adolescents. Biological Psychology, 75, 24–31. Wiersema, J. R., van der Meere, J. J., & Roeyers, H. (2005). ERP correlates of impaired error monitoring in children with ADHD. Journal of Neural Transmission, 112, 1417–1430. Wiersema, J. R., van der Meere, J. J., & Roeyers, H. (2007). Developmental changes in error monitoring: An event-related potential study. Neuropsychologia, 45, 1649–1657. Zelazo, P. D., Müller, U., Frye, D., & Marcovitch, S. (2003). The development of executive function in early childhood. Monographs of the Society for Research in Child Development, 68(3), Serial No. 274. Zhang, J. S., Wang, Y., Cai, R. G., & Yan, C. H. (2009). The brain regulation mechanism of error monitoring in impulsive children with ADHD: An analysis of error related potentials. Neuroscience Letters, 460, 11–15.
(Received November 25, 2009; Accepted January 19, 2010)
Psychophysiology, 48 (2011), 64–73. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01047.x
Operationalizing proneness to externalizing psychopathology as a multivariate psychophysiological phenotype
LINDSAY D. NELSON, CHRISTOPHER J. PATRICK, and EDWARD M. BERNAT Department of Psychology, Florida State University, Tallahassee, Florida, USA
Abstract
The externalizing dimension is viewed as a broad dispositional factor underlying risk for numerous disinhibitory disorders. Prior work has documented deficits in event-related brain potential (ERP) responses in individuals prone to externalizing problems. Here, we constructed a direct physiological index of externalizing vulnerability from three ERP indicators and evaluated its validity in relation to criterion measures in two distinct domains: psychometric and physiological. The index was derived from three ERP measures that covaried in their relations with externalizing proneness: the error-related negativity and two variants of the P3. Scores on this ERP composite predicted psychometric criterion variables and accounted for externalizing-related variance in P3 response from a separate task. These findings illustrate how a diagnostic construct can be operationalized as a composite (multivariate) psychophysiological variable (phenotype).
Descriptors: Externalizing, Disinhibition, Feedback-related negativity, P300, Event-related potential
Experts in the mental health field have called for systematic efforts to integrate neurobiological concepts and findings into systems for diagnosing mental disorders (Hyman, 2007), toward the aim of enhancing the effectiveness of assessment, prevention, and treatment of such disorders (Iacono, 1998; Insel & Scolnick, 2006). One effort in this direction entails developing reliable neurobiological indicators (biomarkers) of psychopathology constructs. Most research of this kind has focused on identifying individual indicators of specific disorders. However, little work has been done to evaluate patterns of relations among varying physiological indicators of differing disorders. Should separate physiological (e.g., event-related potential) indicators demonstrate convergence indicative of a common neural substrate, their joint consideration may be important for identifying individuals at risk prior to the emergence of active pathology and for elucidating the neurobehavioral mechanisms underlying such disorders. With this prospect in mind, the current study examined convergence among multiple psychophysiologic indicators of general proneness to externalizing disorders, a spectrum of psychopathology marked by deficient impulse control (Krueger et al., 2002). Specifically, a common factor was extracted reflecting the shared variance among differing brain response indicators of externalizing proneness, and the validity of this physiologically based composite for predicting external criterion measures of interest was evaluated. A secondary aim was to illustrate a general research strategy for developing stable neurobiological indices of individual difference constructs relevant to psychopathology.
The Externalizing Construct
The construct of externalizing has been proposed as a common dispositional factor underlying the spectrum of disorders marked by deficient impulse control (also known as “disinhibition”; Gorenstein & Newman, 1980; Sher & Trull, 1994), including child and adult antisocial deviance and substance-related disorders. Evidence for the existence of this broad factor emerged out of structural analyses of diagnostic data in adult epidemiologic samples. For example, Krueger (1999) reported that the covariance among various Diagnostic and Statistical Manual-defined disorders could be accounted for by two broad factors: internalizing, encompassing mood and anxiety disorders, and externalizing, encompassing antisocial personality disorder and alcohol and drug dependence. These broad factors can be viewed as reflecting general dispositional vulnerabilities to disorders of each type (Krueger et al., 2002; Mineka, Watson, & Clark, 1998). Consistent with this perspective, available data indicate that scores on the general externalizing factor are highly (>80%) heritable (Kendler, Prescott, Myers, & Neale, 2003; Krueger et al., 2002; Young, Stallings, Corley, Krauter, & Hewitt, 2000).
This manuscript is based on work completed by the first author in partial fulfillment of the requirements for the degree of Master of Arts at the University of Minnesota, under the supervision of the second author. The work was supported by grants MH65137, MH17069, MH072850, MH080239, MH089727, and AA12164 from the National Institutes of Health. Address correspondence to: Lindsay D. Nelson or Christopher J. Patrick, Department of Psychology, Florida State University, 1107 West Call Street, Tallahassee, FL 32306-4301, USA. E-mail: nelson@psy.fsu.edu or [email protected]
Personality traits in the domains of impulsivity, aggression, and sensation seeking have also been identified as indicators of the broad externalizing factor (Krueger, Markon, Patrick, Benning, & Kramer, 2007; Krueger, McGue, & Iacono, 2001). The implication is that the externalizing construct encompasses normal-range personality traits as well as pathological behavioral tendencies along a common vulnerability continuum. This conceptualization inspired the development of the Externalizing Spectrum Inventory (ESI; Krueger et al., 2007), a 415-item self-report questionnaire that indexes externalizing vulnerability comprehensively in terms of scores on 23 unidimensional subscales. The ESI was developed using factor analysis and item-response theory techniques to optimize the psychometric properties and structural coherence of its subscales. The subscales of the ESI index a range of distinctive but interrelated trait-dispositional and behavioral constructs in domains of impulsiveness, sensation seeking, irresponsibility, blame externalization, dishonesty, aggression, and substance abuse.
Psychophysiological Indicators of Externalizing Proneness
As noted, scores on the broad externalizing factor appear highly heritable (more heritable, in fact, than the individual disorders with which it is associated; Krueger et al., 2002), making it a compelling target for studies aimed at identifying neurobiological mechanisms of impulse control problems. The most extensively documented neurobiological indicator of externalizing proneness is the P300/P3, a positive-going event-related potential (ERP), maximal at parietal scalp sites, that occurs following the presentation of attended stimuli. Reductions in P3 amplitude have been documented in relation to disorders including alcohol dependence, drug dependence, conduct disorder, adult antisocial personality, and attention-deficit hyperactivity disorder (e.g., Bauer & Hesselbrock, 1999; Biggins, MacKay, Clark, & Fein, 1997; Costa et al., 2000; Kim, Kim, & Kwon, 2001; Porjesz, Begleiter, & Garozzo, 1980), and recent studies have linked the P3 to the broad externalizing factor that these disorders share (Patrick et al., 2006; Venables et al., 2005). Subsequent work demonstrating that the relationship between the externalizing dimension and diminished P3 is primarily attributable to genetic influence (Hicks et al., 2007) lends support to the idea that P3 is a biomarker of externalizing proneness. Although multiple variants of the P3 exist, the most extensively studied has been the P3 response to target stimuli in frequent–infrequent (oddball) tasks, commonly termed the P300 or P3b. Another is the novelty P3 (P3a), a P3 response to unexpected novel events that exhibits a somewhat earlier latency and a more anterior scalp distribution. Other variants of the P3 occur in tasks in which the familiarity or meaningfulness of stimuli is varied. We use the term “P3” in the current paper to refer to this broad family of ERP components, which includes the P3a and P3b.
Available data indicate that differing variants of the P3 overlap in terms of their underlying neural generators, with structures including the inferior parietal lobe, temporoparietal junction, anterior cingulate cortex, and prefrontal cortex (PFC) playing some role in each (see Linden, 2005). However, the relative contribution of particular brain regions to the P3 can differ as a function of stimulus and task parameters. For example, the topography of the novelty P3 (P3a) tends to be more frontocentral than that of the oddball-target P3 (P3b) and is thought to engage frontal brain regions such as lateral PFC more so than the P3b.
In addition to variants of the P3, externalizing and other constructs involving disinhibition have been linked to the response-locked ERN, a negative-going brain potential, maximal at frontocentral electrode sites, that follows performance errors in speeded response tasks (Falkenstein, Hohnsbein, & Hoormann, 1991; Gehring, Goss, Coles, Meyer, & Donchin, 1993). In terms of underlying neural sources, substantial evidence points to the anterior cingulate cortex (ACC; Dehaene, Posner, & Tucker, 1994), as well as the supplementary motor area, as primary sources of the ERN, with other structures, including the PFC (Gehring & Knight, 2000), playing a supporting role. Reduced ERN amplitude has been documented for individuals scoring low on socialization (reflecting rebelliousness, impulsivity, and aggression; Dikman & Allen, 2000) and conscientiousness (a Big Five personality dimension reflecting tendencies toward responsibility, reliability, and dutifulness; Pailing & Segalowitz, 2004), as well as for individuals scoring highly on disinhibitory traits such as impulsiveness (Pailing, Segalowitz, Dywan, & Davies, 2002; Potts, George, Martin, & Barratt, 2006) and psychoticism (Santesso, Segalowitz, & Schmidt, 2005). Hall, Bernat, and Patrick (2007) extended this prior work by testing the hypothesis that the ERN would be related to the general externalizing factor that, as mentioned previously, reflects proneness to problems of impulse control and affiliated traits (e.g., impulsivity, aggression, and irresponsibility). Consistent with prediction, Hall et al. found that individuals high in externalizing proneness (as measured by an abbreviated version of the ESI) showed reduced amplitude of the ERN over frontocentral scalp locations where the ERN tends to be maximal.
Important questions that have yet to be addressed are whether these differing ERP measures (P3, ERN) represent overlapping or unrelated indicators of externalizing proneness and whether they index some neural process in common that accounts for their individual relations with the externalizing construct. Despite differing scalp topographies, some indirect evidence exists to link P3 and ERN responses as indicators of externalizing tendencies. As noted earlier, frontal brain regions, including ACC and PFC, are known to be involved in the generation of each (Dehaene et al., 1994; Dien, Spencer, & Donchin, 2003; Miltner, Braun, & Coles, 1997; Nieuwenhuis, Aston-Jones, & Cohen, 2005), and frontal brain dysfunction has also been implicated in differing forms of disinhibitory psychopathology (Morgan & Lilienfeld, 2000; Peterson & Pihl, 1990). Based on these lines of evidence, we hypothesized, as described below, that some overlap would be evident in the bivariate relations of P3 and ERN responses with the externalizing construct.
Present Study Aims and Hypotheses
A primary aim of the current study was to evaluate relationships among differing psychophysiological indicators of externalizing proneness using data from a preexisting sample (for prior reports of findings from this sample, see Bernat, Nelson, Steele, Gehring, & Patrick, 2010; Hall et al., 2007; Venables et al., 2005). As detailed above, externalizing proneness has been related to various ERP components in past work. However, it has not been clear whether these observed relations reflect deviations in distinctive cognitive processes associated with each component (e.g., in the case of P3, deficits in context updating; in the case of ERN, deficits in performance monitoring) or whether amplitude reductions in these differing components reflect some more basic process (or set of processes) that spans tasks.
We addressed this question by directly examining relations between P3 and ERN responses in the current sample and evaluating the extent to which these brain response components overlap in their relations with externalizing proneness. Specifically, we evaluated whether differing ERP indicators would evidence a sufficient degree of convergence to permit a common factor to be derived reflecting their covariance. In addition, we evaluated whether the observed covariance among indicators reflected externalizing proneness or not by examining the association of the common ERP factor with an omnibus index of externalizing (i.e., the ESI) that had evidenced associations with each individual ERP indicator. We also evaluated the validity of this shared ERP-based factor in relation to separate criterion measures of externalizing proneness from two distinct measurement domains: psychometric (self-report) assessment and physiological (ERP) measurement. We included physiological criterion variables along with more traditional diagnostic variables because we were interested in comparing predictive relations for criteria in the same domain versus a different measurement domain. Data were available for three tasks: a flanker discrimination task, a gambling feedback task, and a visual oddball task. The following three measures from the flanker and gambling tasks were utilized as primary indicators in analyses aimed at delineating a common neurophysiological factor: P3 response to target stimuli in the flanker task, P3 response to feedback stimuli in the gambling task, and ERN response following performance errors in the flanker task. P3 responses to stimuli (target, novel) in the oddball task were reserved as criterion measures in follow-up validation analyses.
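As an illustration only, the idea of a composite score that captures the covariance among three indicators can be sketched with simulated data. The sketch below assumes the first principal component of the indicator correlation matrix as the composite; this is a stand-in chosen for brevity, since the passage above does not specify the extraction method. The function name, the simulated loadings, and the random seed are hypothetical.

```python
import numpy as np

def first_component_scores(X):
    """First principal component of the z-scored columns of X.

    X: participants x indicators array.
    Returns (loadings, scores); scores have mean 0 and unit sample variance.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)            # indicator correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    v = eigvecs[:, -1]                          # direction of largest shared variance
    if v.sum() < 0:                             # fix the arbitrary sign of the eigenvector
        v = -v
    loadings = v * np.sqrt(eigvals[-1])         # correlations of indicators with the component
    scores = Z @ v / np.sqrt(eigvals[-1])       # standardized composite score per participant
    return loadings, scores

# Simulated stand-ins for three ERP indicators that share one latent source.
rng = np.random.default_rng(0)
n = 88                                          # sample size reported in the Method section
latent = rng.standard_normal(n)
X = np.column_stack([0.6 * latent + 0.8 * rng.standard_normal(n) for _ in range(3)])

loadings, scores = first_component_scores(X)
```

The resulting `scores` vector could then be correlated with any criterion variable; with real data, a formal factor-analytic model may be preferable to this principal-component shortcut.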
Oddball P3 responses were utilized as criterion variables because extensive research documents diminished oddball task P3 as an indicator of externalizing proneness and because these responses were measured in a separate task from the primary ERP indicators. Our primary study hypothesis, based on extensive prior research examining P3 response to oddball task stimuli, was that P3 responses to flanker and feedback stimuli would, along with ERN response as previously reported by Hall et al. (2007), evidence significant negative relations with externalizing proneness as indexed by the ESI. Findings in line with this prediction would indicate that the P3–externalizing relationship generalizes across differing stimuli and experimental conditions. Our additional hypotheses, pertaining to coherence among ERP indices of externalizing proneness, were predicated on this primary hypothesis and thus were somewhat more tentative. First, in view of data indicating a role for anterior brain structures in the generation of both ERN and P3, we postulated some degree of overlap in the psychophysiological process(es) tapped by each individual ERP indicator. Specifically, we hypothesized that scores on the three primary ERP indicators (gambling feedback–P3, flanker target–P3, flanker response–ERN) would correlate with one another, as a function of overlap in associated processes. We posited further that variance in common among these differing electrocortical indicators would reflect, at least in part, psychophysiological processes related to externalizing proneness.
Based on this presumption, we hypothesized that scores on a common ERP factor, reflecting the overlap among primary P3 and ERN indicators, would significantly predict scores on the ESI as well as scores on separate criterion measures of externalizing proneness representing psychometric and physiological assessment domains, namely, scores on self-report measures of disinhibitory problems/traits and reductions in amplitude of P3 responses measured within an oddball task. Regarding relations of the ERP-based factor with criterion measures, we expected that correlations for oddball P3 measures would exceed correlations for self-report measures as a function of same versus differing assessment domains (cf. Campbell & Fiske, 1959).
Method
Participants
Participants were undergraduates preselected from a larger sample of students (N = 1,637) based on their scores on the ESI. Individuals were selected to represent the full range of scores, with participants scoring in the upper and lower quartiles of the distribution of ESI scores oversampled to ensure strong representation of high- and low-scoring individuals. Data for the three study tasks (gambling, flanker, oddball; Bernat et al., 2010; Hall et al., 2007; Venables et al., 2005) were available for the 92 participants included in the Hall et al. ERN study. Two of these participants were dropped from the analyses due to excessive ERP signal artifact in the oddball task, and two others were dropped due to excessive artifact in the gambling task, yielding a final N of 88 (55 women; mean age = 20.47 years, SD = 2.57).
Questionnaire Measures
Externalizing Spectrum Inventory (ESI). Participants completed an abbreviated (100-item) version of the ESI (ESI-100; Krueger et al., 2007), a self-report measure developed to assess a range of behavioral and personality characteristics associated with externalizing spectrum psychopathology. Higher ESI-100 scores indicate greater externalizing tendencies. Internal consistency reliability (Cronbach’s α) for the ESI-100 in the current sample was .95. Participants also completed other self-report questionnaires that served as separate criterion measures of externalizing tendencies; descriptions of these measures, with α coefficients for the current sample noted in parentheses after scale abbreviations, are as follows.
Alcohol Dependence Scale (ADS; α = .88).
The ADS (Skinner & Allen, 1982) is a 29-item measure with questions related to alcohol use, abuse, and dependence. The ADS yields a total score such that higher scores indicate more extreme alcohol-related problems.
Short Drug Abuse Screening Test (SDAST; α = .77). The SDAST (Skinner, 1982) is a 20-item questionnaire that indexes problems involving drug use, including drug abuse and dependence. High SDAST total scores indicate more severe drug-related problems.
Behavior Report on Rule-Breaking (BHR; α = .92). The BHR is a questionnaire of adolescent and adult antisocial behaviors composed of items from several other published measures (Clark & Tifft, 1966; Hindelang, Hirschi, & Weis, 1981; Nye & Short, 1957). The measure includes 33 items about unlawful or inappropriate behavior, and each item requests a rating for both adolescence (before age 18) and adulthood (age 18 and up) behavior.
Socialization Scale (So; α = .84). The So Scale (Gough, 1960) is a 52-item self-report measure that indexes socialization, a construct with similarities to the externalizing construct. Low scores indicated higher levels of rebelliousness, aggression, and impulsivity.
Procedure
Experimental stimuli were presented centrally on a 21-in. Dell high-definition CRT color monitor, using E-Prime version 1.1 software (Psychology Software Tools, Inc.). Behavioral responses were made using the PST Serial Response Box from the same company. During a single physiological recording session, participants completed the following three tasks sequentially.
Flanker discrimination task. This task, consisting of six 100-trial blocks, was a variant of the Eriksen flanker task (Eriksen & Eriksen, 1974). As described by Hall et al. (2007), participants viewed target letter arrays (HHHHH, SSSSS, HHSHH, and SSHSS; 86% of trials) and pressed a button (left or right) to indicate the central letter (“H” or “S”) in the array. The task also included nontarget stimuli (XXXXX, SSXSS, HHXHH; 14% of trials) to which no response was made. Each stimulus was presented for 150 ms, followed by a 1000-ms response window and a 1500–2500-ms (M = 2000 ms) fixation point prior to the onset of the next trial. To enhance task difficulty and increase performance errors, hand–letter assignment was reversed prior to the start of each new block of trials.
Gambling feedback task. This task, consisting of twelve 32-trial blocks, was modified from the procedure of Gehring and Willoughby (2002). On each trial, participants selected between two numeric options (5–5, 25–25, 5–25, 25–5) and then received feedback indicating whether their choice resulted in a gain or a loss of money. Outcomes were signaled by changes in the color of boxes enclosing the two numeric options: The box around the chosen option turned red or green to indicate either a win or loss, and the box enclosing the unchosen option turned red or green to indicate what the outcome would have been had the participant made the other choice.
Color–outcome mapping was counterbalanced across participants. The choice stimulus remained on the screen until a selection was made, after which a blank screen appeared for 100 ms. The feedback stimulus appeared for 1000 ms, followed by a blank screen for 1500 ms preceding the onset of the next trial. Oddball task. This task, consisting of 240 trials, was a threestimulus variant of the ‘‘rotated-heads’’ visual oddball task (Begleiter, Porjesz, Bihari, & Kissin, 1984). Task stimuli included nontarget ovals (70% of trials), target ‘‘heads’’ (15% of trials), containing a nose and one ear, and (3) novel nontarget stimuli (15% of trials) consisting of pleasant, neutral, and unpleasant pictures from the International Affective Picture System (Center for Study of Emotion and Attention, 1999). Participants responded to target heads with a right or left button press to indicate the side on which the ear appeared. Stimuli appeared for 100 ms each and were separated by intertrial intervals (with central fixation) of 4000 to 5000 ms. Psychophysiological Data Acquisition and Reduction Electroencephalogram (EEG) activity was recorded using a 64channel Neuroscan Synamps2 amplifier system. EEG electrodes (sintered Ag-AgCl) were positioned in accordance with the International 10–20 system (Jasper, 1958) using a Quick-Cap electrode array. Impedances at all sites were below 10 kO. Ocular activity was recorded from above and below the left eye. EEG signals were referenced online to electrode site CPz and digitized
67 at 1000 Hz and then re-referenced off-line to linked mastoids and resampled to 128 Hz. The response-locked ERN was epoched from 1000 ms before to 1000 ms after response onset; all stimulus-locked P3 measures were epoched from 1000 ms before to 2000 ms after stimulus onset. Trial-level EEG data were corrected for ocular and movement artifacts using an algorithm developed by Semlitsch, Anderer, Schuster, and Presslich (1986), as implemented in the Neuroscan EDIT software (version 4.3). For the response-locked ERN, a 1-Hz high-pass filter was also applied to reduce the effect of slow-wave motor potentials that can contaminate response-locked signals. The stimulus- and response-locked ERPs from the flanker task (P3 and ERN, respectively) were averaged across all target stimulus trials on which a response occurred. The feedback-locked P3 (gambling task) was averaged across stimulus trials involving gain and loss outcomes. Because the ERP components were measured from varying tasks with differing procedural parameters, measurement windows for each ERP variable were defined according to taskspecific waveforms (Picton et al., 2000), resulting in variations in the time windows employed across tasks. The flanker ERN was defined as the maximum negative-voltage peak, relative to a ! 250 to ! 50-ms preresponse baseline, occurring within a window beginning with the onset of an incorrect button-press response and terminating at 125 ms postresponse. (To facilitate comparisons with the P3 response variables, raw ERN scores were inverted such that higher positive values reflected larger ERN amplitudes.) P3 components were computed as maximum voltage peaks relative to a prestimulus baseline within designated time windows as follows: flanker P3, peak 320.31 to 500 ms poststimulus relative to ! 148.44 to ! 7.81 ms prestimulus baseline; feedback P3, 296.88 to 500 ms relative to ! 101.56 to ! 7.81 ms baseline;1 and oddball target and novelty P3, 250 to 562.5 ms relative to ! 
148.44 to −7.81 ms baseline. (Note: Window onset and offset times contain decimals as they represent bins of 128 Hz resampled data.) For each ERP measure, data from the frontocentral (FCz) electrode location were used in the analyses reported here in order to facilitate comparisons and because associations with externalizing scores tended to be maximal at this scalp location. In this regard, Hall et al. (2007) reported that the ERN/externalizing association was distributed frontocentrally on the scalp and focused their analyses of the ERN on electrode site FCz, where the magnitude of the correlation with externalizing scores was r = .29. Mirroring Hall et al., we operationalized the ERN in terms of (inverted) amplitude for error trials at FCz. For the P3 measures, we evaluated associations for each with externalizing at representative frontal, central, and parietal electrode sites. Consistent with the idea that externalizing tendencies entail deficits in anterior brain function (and consistent with the topography of ERN effects), we found that externalizing-related amplitude reductions for each P3 measure were more pronounced at frontocentral as compared to parietal sites. For example, the correlation between ESI-100 externalizing scores and flanker-stimulus P3 was −.37 at FCz but only −.25 at Pz; similarly, the association of feedback-stimulus P3 with externalizing was −.24 at FCz compared with −.17 at Pz. For P3 responses to target and novel stimuli in the oddball task, rs at electrode site FCz were −.31 and −.32, respectively, compared with −.11 and −.13 at Pz. Notably, these patterns contrasted with the topography of P3 response across participants in the sample as a whole, where amplitudes tended to be maximal at parietal locations, as is typical of the P3 (e.g., Katayama & Polich, 1999).

¹ In other work using this task and data set (Bernat et al., 2010), the feedback P3 was operationalized as a time–frequency component to separate it from a somewhat overlapping, negative-polarity component of higher frequency. In the current study, to simplify presentation and facilitate comparisons with other ERP variables, we scored the feedback P3 using the more common time–domain peak approach. Peak scores correlated very highly with scores based on the time–frequency approach, r = .92.

[Figure 1 image: waveform panels at electrode site FCz for Response ERN, Flanker P3, and Feedback P3 (x-axis: Time (ms); y-axis: Amplitude (µV); traces: Low Externalizing, High Externalizing), with topographic maps labeled Peak Amplitude and High v. Low EXT Difference.]
Figure 1. Average ERN (on error trials), flanker P3, and feedback P3 response waveforms for subgroups low and high in externalizing tendencies (top and bottom 25% of scorers on the ESI-100) at electrode site FCz. Color topographic maps below the waveform plots depict (1) the overall peak amplitude of each ERP measure (upper row of topographic plots) and (2) the relative magnitude and directionality of group differences (low minus high externalizing) across scalp sites for each ERP response measure (bottom row).
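The decimal window boundaries reported for the P3 measures follow from the 128-Hz resampling, at which each sample spans 7.8125 ms. The sketch below illustrates that arithmetic together with a baseline-corrected peak score. It is a minimal illustration, not the scoring code used in the study: the function names, the sample-offset mapping of the flanker P3 window, and the synthetic epoch are all assumptions introduced here.

```python
SAMPLING_RATE_HZ = 128  # resampled rate stated in the text


def sample_to_ms(sample_offset, rate_hz=SAMPLING_RATE_HZ):
    """Convert a sample offset (relative to event onset) to milliseconds."""
    return sample_offset * 1000.0 / rate_hz


def peak_amplitude(epoch, onset_index, baseline, window, polarity=1):
    """Return the baseline-corrected peak within a scoring window.

    epoch       -- list of voltages (microvolts), one per sample
    onset_index -- index of the sample at event onset (time 0)
    baseline    -- (first, last) sample offsets of the baseline, inclusive
    window      -- (first, last) sample offsets of the window, inclusive
    polarity    -- +1 for a positive peak (P3), -1 for a negative peak (ERN)
    """
    b_first, b_last = baseline
    base = epoch[onset_index + b_first:onset_index + b_last + 1]
    base_mean = sum(base) / len(base)
    w_first, w_last = window
    segment = epoch[onset_index + w_first:onset_index + w_last + 1]
    corrected = [v - base_mean for v in segment]
    return max(corrected) if polarity > 0 else min(corrected)


# Flanker P3 window and baseline as sample offsets (assumed mapping).
FLANKER_P3_WINDOW = (41, 64)     # 320.31 to 500 ms
FLANKER_P3_BASELINE = (-19, -1)  # -148.44 to -7.81 ms

# Synthetic epoch (hypothetical data): flat 2 microvolt signal with a single
# 12 microvolt sample at offset 48, i.e., 375 ms after onset.
onset = 20
toy_epoch = [2.0] * 85
toy_epoch[onset + 48] = 12.0
flanker_p3 = peak_amplitude(toy_epoch, onset,
                            FLANKER_P3_BASELINE, FLANKER_P3_WINDOW)
print(sample_to_ms(41), sample_to_ms(-19), flanker_p3)
```

Under this mapping, the feedback P3 window (296.88 to 500 ms) corresponds to sample offsets 38 to 64, and the oddball P3 window (250 to 562.5 ms) to offsets 32 to 72.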
Data Analyses
The analyses are described in several parts below. First, we present the three ERP-based indicators of externalizing and show correlational and exploratory factor analyses demonstrating their coherence. Second, we describe follow-up analyses demonstrating that the coherence among these ERP variables reflects a common externalizing-related process rather than some other common brain process unrelated to externalizing. Finally, we present correlations between a composite variable, derived from the three ERP measures using principal axis factor analysis,² and an array of criterion measures (representing self-report diagnostic and physiological response domains) to examine the validity of this ERP composite in relation to other known indicators of externalizing proneness. In addition, for the physiological (oddball task P3) criterion measures, hierarchical regression analyses are presented to directly evaluate the extent to which scores on the ERP-based composite account for externalizing-related variance in these measures. Supplemental analyses were performed to test for possible moderating effects of age and gender. Neither variable showed any evidence of a moderating effect on the relationship between externalizing proneness (indexed by scores on the ESI-100) and any of the physiological (ERP) measures included in the analyses. Thus, results are presented without inclusion of these demographic variables in the analysis.

² Principal axis factor analysis was used rather than principal components analysis because the focus of our interest was on evaluating the coherence among indicators and extracting a composite reflecting this coherence (i.e., we explicitly wanted to capture the shared variance attributable to a common factor underlying the differing indicators and exclude variance unique to each indicator).

Results
Constructing a Multivariate ERP-Based Index of Externalizing Proneness
Bivariate relations of individual ERP variables with externalizing tendencies. To illustrate the primary ERP response variables on which our analyses focused (flanker response ERN, flanker stimulus P3, feedback stimulus P3), Figure 1 presents waveforms for these three variables at electrode site FCz for participants high (top quartile) versus low (bottom quartile) on the ESI-100. As indicated in Table 1, the correlation between continuous ESI-100 externalizing scores and (inverted) ERN amplitude in the sample as a whole (N = 88) was −.29. Mirroring findings from prior studies using conventional oddball task P3s, the flanker and feedback P3 responses also evidenced significant negative associations with externalizing scores, rs = −.37 and −.24.
Correlations among ERP indicators and derivation of a multivariate ERP composite. As shown in Table 1, the three primary ERP variables correlated significantly with one another (rs = .24 to .27). To evaluate the possibility that these measures index some process or processes in common, a principal axis exploratory factor analysis of these measures was performed. This analysis yielded evidence of a single dominant factor accounting for covariance among the three ERP indicators (Figure 2, left plot). This one-factor solution was evident both by visual inspection of the scree plot and by parallel analysis, a technique for determining the number of factors to retain by comparing the eigenvalues of the sample data with those of randomly generated data (Horn, 1965).³ Each ERP indicator loaded appreciably and to a comparable degree on the shared factor (range = 0.48 to 0.55), indicating that the three ERP variables index something in common and that each contributes similarly to the shared factor. Although the factor analysis indicated that these differing ERP variables index something in common, further analysis was required to determine whether this covariance reflected an externalizing-related brain process, as opposed to overlap in brain activity unrelated to externalizing proneness. To this end, a second factor analysis was performed in which scores on the self-report ESI-100 measure were included together with scores on the three ERP indicators. The rationale was that if the ERP variables covaried due to externalizing-related variance, then a factor analysis of these variables along with scores on the ESI-100 should yield a solution in which all four variables load appreciably on a common factor. This is indeed what was found (see Figure 2, right plot). Thus, despite the fact that one of the variables included in this analysis was from a different measurement domain (self-report) than the others (physiological), all variables appear to index something in common that relates to externalizing proneness.

³ Here, eigenvalues were computed from the simulated data and compared to those of the empirical data. In the current study, eigenvalues for 100 random data samples were computed and averaged. Similar results were found using the 95th percentile of the random data eigenvalues.

Table 1. Correlations among Physiological and Questionnaire Indicators of Externalizing Proneness

                          Response ERNᵃ   Flanker P3   Feedback P3
Flanker P3                    .27*
Feedback P3                   .24*           .26*
ESI-100 questionnaire        −.29**         −.37**       −.24*

ᵃ ERN scores are inverted such that higher scores reflect larger ERN amplitudes. *p < .05; **p < .01.

Predictive Validity of the Multivariate ERP Composite
To evaluate the predictive validity of the ERP factor in relation to its individual brain response indicators, scores on the common factor derived from the three ERP measures (Figure 2, left plot) were computed using the regression method and examined as predictors of criterion measures known to be related to externalizing proneness.
Criterion measures of substance problems, antisocial behavior, and disinhibitory tendencies. The upper part of Table 2 presents correlations for the ERP-based factor and its individual indicators with scores on available self-report measures of alcohol dependence, drug abuse, antisocial behavior, and disinhibitory tendencies. Scores on the ERP-based factor were correlated with each of the self-report criterion variables in the predicted direction, with five of the seven correlations achieving significance. Also notable is the finding that correlations of the ERP factor with criterion measures tended to be higher than correlations for the individual ERP indicators.
Prediction of oddball task P3 amplitude. We also evaluated the ability of ERP factor scores to predict separate brain-based indices of externalizing proneness, namely, P3 responses to target and novel stimuli from the oddball task measured at electrode site FCz. Oddball–target and oddball–novelty P3 responses were utilized as criterion measures because they came from a task separate from the response variables that contributed to the ERP-based composite; further, oddball task P3 has a well-established status as an indicator of externalizing proneness. Across participants in the current sample, oddball–target P3 and oddball–novelty P3 responses were highly correlated with one another (r = .76) and showed correlations of −.31 and −.32, respectively, with ESI-100 scores. As shown in Table 2 (lower part), scores on the common factor reflecting the overlap among primary ERP indicators (from flanker and gambling tasks) strongly predicted P3 responses to both target and novel stimuli in the oddball task. Data in the lower part of the table also show that correlations with both oddball P3 responses tended to be stronger for the ERP composite variable than for the individual ERP indicators that went into the composite. To quantitatively evaluate the extent to which the composite outperformed individual ERP indicators in predicting ESI-100 externalizing scores, hierarchical regression analyses were performed in which scores on the ERP factor were entered as a predictor in Step 2, following entry of one or the other oddball P3 variable in Step 1. For both oddball–target and oddball–novelty P3, the addition of the ERP-based factor as a predictor in the second step (following entry of oddball P3 in the first step) of the model (a) reduced the predictive (beta) coefficient for the oddball P3 variable to nonsignificance (betas in Steps 1 and 2 were, respectively, oddball–target P3, βs = −.31 and −.03, ps = .004 and .847; oddball–novelty P3, βs = −.32 and −.05, ps = .002 and .717) and (b) produced a significant increase in R² for the overall model (for oddball–target P3, multiple R increased from .31 in Step 1, F(1,86) = 8.86, p = .004, to .43 in Step 2, F(2,85) = 9.70, p < .001, R² change F(1,85) = 9.65, p = .003; for oddball–novelty P3, multiple R increased from .32 in Step 1, F(1,86) = 10.01, p = .002, to .43 in Step 2, F(2,85) = 9.76, p < .001, R² change F(1,85) = 8.63, p = .004). Thus, the ERP-based composite variable significantly outperformed individual comparison ERP variables in predicting externalizing proneness.
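The hierarchical regression statistics reported for the oddball P3 analyses can be cross-checked with standard identities relating F to R². The sketch below is an illustrative check rather than part of the original analysis, and its function names are introduced here. It recovers R² from each reported F and then recomputes the F for the change in R². On this arithmetic, the step values of .31 (or .32) and .43 correspond to multiple R, with R² of roughly .09 (or .10) and .19.

```python
import math


def r_squared_from_f(f_value, df_model, df_error):
    """Recover R^2 from an overall-model F statistic."""
    return (f_value * df_model) / (f_value * df_model + df_error)


def f_for_r_squared_change(r2_step1, r2_step2, df_added, df_error_step2):
    """F test for the increase in R^2 when predictors are added."""
    numerator = (r2_step2 - r2_step1) / df_added
    denominator = (1.0 - r2_step2) / df_error_step2
    return numerator / denominator


# Reported overall F values (Step 1: df = 1, 86; Step 2: df = 2, 85).
reported = {
    "oddball-target P3": (8.86, 9.70),
    "oddball-novelty P3": (10.01, 9.76),
}

results = {}
for label, (f_step1, f_step2) in reported.items():
    r2_1 = r_squared_from_f(f_step1, 1, 86)
    r2_2 = r_squared_from_f(f_step2, 2, 85)
    f_change = f_for_r_squared_change(r2_1, r2_2, 1, 85)
    results[label] = (math.sqrt(r2_1), math.sqrt(r2_2), f_change)
    print(label, round(math.sqrt(r2_1), 2), round(math.sqrt(r2_2), 2),
          round(f_change, 2))
```

The recomputed change statistics agree with the reported values of 9.65 and 8.63 to within rounding of the published F values.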
Discussion
Prior work has documented relations between differing indices of physiological response and externalizing proneness, a broad dispositional factor encompassing tendencies toward impulsivity, antisocial behavior, and alcohol and drug problems that has been conceptualized as reflecting a general vulnerability to problems of impulse control. In particular, amplitude reductions in oddball task P3 and response–ERN have been found in relation to externalizing tendencies. The current study provided an initial demonstration of associations with externalizing proneness for two other variants of the P3. One of these consisted of P3 response to target flanker stimuli in a procedure in which the typical phenomenon of interest is the ERN response that follows performance errors. In this procedure, the target stimulus was an array of letters, and the task involved discriminating the central letter from flanking letters to determine whether to make a left or right button response. The second variant consisted of P3 response to gain and loss feedback in a simulated gambling task in which individuals selected one of two monetary options and then processed feedback as to whether their choice resulted in a gain or a loss of money. Deficits in the amplitude of P3 response to task-relevant stimuli have been interpreted as reflecting impairment of some kind in postperceptual processing of stimulus input across differing tasks. That these non-oddball variants of the P3 demonstrated associations with externalizing tendencies implies that the P3–externalizing relationship may generalize across a wide range of stimuli and task conditions.

[Figure 2 image: two scree plots (x-axis: Factors; y-axis: Eigenvalue; lines: Empirical Data, Parallel Analysis). Left panel, ERP Indicators Only, factor loadings: Response ERN .50, Flanker P3 .55, Feedback P3 .48. Right panel, ERP Indicators Plus ESI-100 Variable, factor loadings: Response ERN .49, Flanker P3 .60, Feedback P3 .44, ESI-100 (self-report) −.60.]
Figure 2. Scree plots and variable loadings for two factor analyses. On the left side is an analysis incorporating the three primary ERP indicators of externalizing vulnerability (response ERN, flanker P3, and feedback P3). ERN scores are inverted such that higher values reflect larger ERN amplitudes. On the right is an analysis incorporating the three ERP indicators along with the self-report externalizing (ESI-100) variable. In each plot, actual eigenvalues (solid line) are accompanied by eigenvalues estimated from a parallel analysis (dashed line) based on 100 random samples.
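The parallel analysis used to decide how many factors to retain (Horn, 1965) can be sketched as follows. This is a simplified illustration under stated assumptions, not a reproduction of the reported analysis: it uses the unreduced correlation matrix built from the Table 1 intercorrelations of the three ERP indicators, draws random normal data for N = 88, averages eigenvalues over 100 random samples, and relies on a closed-form eigenvalue solution that applies only to 3 × 3 correlation matrices. All function names are introduced here.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def eigenvalues_3x3_corr(r12, r13, r23):
    """Eigenvalues (descending) of a 3 x 3 correlation matrix.

    Trigonometric closed form for symmetric 3 x 3 matrices, specialized to
    a unit diagonal (trace = 3).
    """
    p1 = r12 ** 2 + r13 ** 2 + r23 ** 2
    if p1 == 0.0:
        return [1.0, 1.0, 1.0]
    p = math.sqrt(p1 / 3.0)
    r = max(-1.0, min(1.0, (r12 * r13 * r23) / p ** 3))
    phi = math.acos(r) / 3.0
    largest = 1.0 + 2.0 * p * math.cos(phi)
    smallest = 1.0 + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [largest, 3.0 - largest - smallest, smallest]


def mean_random_eigenvalues(n_obs, n_samples=100, seed=0):
    """Average eigenvalues over random uncorrelated data sets."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(n_samples):
        cols = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(3)]
        eig = eigenvalues_3x3_corr(pearson(cols[0], cols[1]),
                                   pearson(cols[0], cols[2]),
                                   pearson(cols[1], cols[2]))
        totals = [t + e for t, e in zip(totals, eig)]
    return [t / n_samples for t in totals]


def n_factors_to_retain(empirical, random_mean):
    """Count leading empirical eigenvalues that exceed the random ones."""
    count = 0
    for emp, rnd in zip(empirical, random_mean):
        if emp <= rnd:
            break
        count += 1
    return count


# Table 1 intercorrelations: ERN-flanker P3, ERN-feedback P3, flanker-feedback P3.
empirical = eigenvalues_3x3_corr(0.27, 0.24, 0.26)
random_mean = mean_random_eigenvalues(n_obs=88)
retained = n_factors_to_retain(empirical, random_mean)
print([round(e, 2) for e in empirical], retained)
```

Only the first empirical eigenvalue exceeds its random-data counterpart in this sketch, which is the pattern a one-factor solution would produce.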
Table 2. Individual ERP Indicators and Multivariate ERP Composite: Correlations with Externalizing-Related Criterion Variables

Criterion Variable          N    Response ERN   Flanker P3   Feedback P3   Composite
Psychometric
  ESI-100                   88     −.29**         −.37***      −.24*         −.43***
  ADS                       87     −.31**         −.29**       −.24*         −.40***
  SDAST                     86     −.03           −.07         −.14          −.11
  BHR
    Total                   87     −.28**         −.29**       −.24*         −.38***
    Adult                   87     −.25*          −.23*        −.24*         −.33**
    Adolescent              87     −.26*          −.31**       −.19          −.36**
  Socialization scale       87      .16            .06          .09           .15
Physiological
  Oddball–target P3         88      .42***         .58***       .43***        .68***
  Oddball–novelty P3        88      .48***         .53***       .46***        .69***

Note: ERN scores are inverted such that higher scores reflect larger (more negative) ERN amplitudes. ESI-100 = 100-item Externalizing Spectrum Inventory; ADS = Alcohol Dependence Scale; SDAST = Short Drug Abuse Screening Test; BHR = Behavior Report on Rule-Breaking. *p < .05; **p < .01; ***p < .001.

Another key finding involved the topography of the externalizing-related reduction in P3 amplitude relative to the topography of the P3 component itself. As is typical of the P3 component, peak amplitude in the current study tended to be maximal at parietal electrode sites, yet the P3 amplitude reduction associated with externalizing proneness was largest at frontocentral sites. This dissociation is consistent with the notion that the externalizing-related cognitive processing deficit indexed by P3 amplitude reduction involves anterior brain structures, despite the role of more posterior structures in the generation of the P3 response overall. Given evidence (noted earlier) that the P3 reflects activity in a range of underlying brain regions, including anterior as well as posterior regions, the current findings suggest that the basis of the reduction in P3 amplitude associated with externalizing proneness may lie more in anterior brain structures (e.g., ACC, PFC) that contribute to P3. Notably, the P3 is not the only ERP component that has evidenced relations with externalizing tendencies. As demonstrated by Hall et al. (2007), the ERN response is also negatively associated with tendencies toward externalization. Unique to the current study, however, is the finding that the externalizing-related processing impairments indexed by P3 and ERN appear to be overlapping, despite theorized differences in the mechanisms that underlie ERN and P3. Specifically, ERN response correlated significantly with both feedback P3 and flanker P3 and loaded to a comparable degree with these two variants of P3 on a common factor that, in turn, predicted criterion measures relevant to externalizing psychopathology. The implication is that reductions in ostensibly distinct brain measures assessed in differing task contexts may reflect common or intersecting deficits associated with externalizing proneness. A challenge in future research will be to identify exactly what externalizing-related processing deficit may be tapped by these ERP indicators. The topography of externalizing-related effects for these differing indicators supports the idea that it is a frontally driven process, but the extent to which the process in question is one commonly presumed to be indexed by ERN or P3 (e.g., recognition of errors or other performance-related outcomes; incorporation of perceptual input into a mental model of an ongoing task) remains unclear.
This question can be addressed in future research by developing hybrid P3/ERN procedures and manipulating task parameters to test alternative hypotheses regarding processes underlying convergence of differing ERP indicators with externalizing measures. Regarding specific brain mechanisms, a plausible hypothesis is that overlapping externalizing-related impairments in these differing ERP indicators reflect dysfunction in anterior brain circuitry including ACC and/or PFC, structures known to contribute to the ERN as well as the P3. Studies using other neuroimaging methods in conjunction with EEG/ERP measurement will be valuable in addressing this hypothesis. Given that ERN and differing variants of P3 in the current study overlapped in terms of their relations with ESI-100 externalizing scores, we sought to create an aggregate physiological index of externalizing proneness from these measures. Specifically, we extracted a common factor reflecting the covariance among the three primary ERP components and evaluated the predictive validity of this factor in relation to self-report and physiological criterion measures. In this regard, some limitations of the current study warrant mention. Although it could be argued that the sample size was acceptable in terms of number of subjects per indicator variable (i.e., >25), the sample size was relatively modest for a factor analytic investigation. Similarly, the number of indicators available was too limited to provide for a compelling evaluation of the underlying factor structure of externalizing-related brain measures. Future studies of this type would benefit from larger samples, a wider array of brain response measures, and use of confirmatory factor analytic methods to evaluate alternative models of structure. In particular, it would be desirable to include other ERP components besides the ERN that have been localized to particular neuroanatomic locations. Furthermore, utilization of data from clinical samples
would extend the generalizability of the current findings to populations with more severe psychopathology. Another important issue involves the specificity of ERP measures such as P3 and ERN as indicators of externalizing proneness, in view of findings indicating relations with other common disorders outside the externalizing spectrum. For example, anxiety disorder symptoms have been associated with enhancements in both the ERN and P3 (Bruder et al., 2002; Gehring, Himle, & Nisenson, 2000; Hajcak, Franklin, Foa, & Simons, 2008), and depression has been associated with reductions in the amplitude of the P3 response (Bruder et al., 1995; Yanai, Fujikawa, Osada, Yamawaki, & Touhouda, 1997). Findings for ERN in relation to depression have been more mixed (Chiu & Deldin, 2007; Ruchsow et al., 2006). Nonetheless, it will be important in follow-up studies to concurrently assess for symptoms of other disorders (in particular, commonly occurring conditions such as mood- and anxiety-related disorders) in order to establish the specificity of composite ERP variables as biomarkers for externalizing proneness.⁴ Notwithstanding these limitations, our factor analysis of ERP indicators yielded a number of intriguing findings. Consistent with prediction, scores on the ERP factor composite related in predictable ways to differing self-report indices of disinhibitory tendencies. Correlations with measures of adolescent and adult antisocial deviance and alcohol dependence were most robust. Correlations for measures of socialization and drug abuse, although in predicted directions (positive and negative, respectively), were nonsignificant. The implication is that these specific psychometric indices of externalizing proneness were less reflective of neural processing deviations than other psychometric indices within the current sample. In part, this may reflect unreliability of measurement for narrow manifest indicators (cf. 
Vaidyanathan, Patrick, & Bernat, 2009); in line with this, it is notable that the highest observed validity coefficient was for prediction of broad ESI-100 scores using the ERP composite. Another factor that may have contributed to weaker associations for some criteria in the current study involves limitations associated with self-report measurement. To address this point, it will be valuable in future studies to include interview-based criterion variables along with measures of disinhibitory behaviors and traits derived from self-report. Notably, the ERP common factor estimate generally outperformed constituent ERP indicators (ERN, flanker P3, feedback P3) in the prediction of externalizing-related criterion measures. Further, the ERP composite outperformed individual P3 indicators from a separate task (oddball–target and oddball–novelty P3) in the prediction of ESI-100 externalizing scores, such that ERP composite scores contributed significantly to prediction over and above these alternative psychophysiological indicators. This makes sense from a psychometric perspective, insofar as aggregation across indicators enhances reliability of measurement and the proportion of “true score” variance available for prediction. From this standpoint, scores on the common ERP factor represented a purer index of externalizing tendencies than scores on any individual ERP indicator, presumably because factor scores more purely reflected the externalizing-related process tapped by each individual indicator. The broader implication is that multivariate psychometric techniques such as factor analysis, which have long been utilized in the self-report domain to refine measurement of psychological constructs, might similarly be applied to physiological response measures to develop reliable physiologically based protocols for assessing dispositional constructs relevant to mental disorders. Just as questionnaire items are evaluated in terms of their psychometric properties, so too might ERP (or other physiological) response indicators be evaluated quantitatively in terms of their utility in the assessment of individual difference constructs. Following this approach, it should be possible to develop physiologically based measures of individual difference constructs relevant to psychopathology that possess sound psychometric properties (e.g., high internal consistency, high test–retest reliability, stable convergent and discriminant validity). Assessment measures of this type would be of substantial value both for neurobiological research studies and clinical prevention and treatment efforts that emphasize underlying neurobiological mechanisms.

⁴ Although we did not systematically assess for symptoms of mood and anxiety disorders in the current study, global self-report measures of depression and trait anxiousness, consisting of the Self-Rating Depression Scale (SDS; Zung, 1965) and the State-Trait Anxiety Inventory (STAI; Spielberger, 1985), were collected as a supplement to criterion measures of externalizing proneness. No significant correlations were evident for either the SDS or the STAI with any of the available ERP measures (rs = .00 to −.19, n.s.). Further, predictive relations for each ERP measure with ESI-100 externalizing proneness remained significant after controlling for SDS and STAI scores.
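The psychometric argument that aggregation across indicators improves prediction can be illustrated with the standard formula for the correlation between a criterion and a unit-weighted sum of standardized indicators. The sketch below applies that formula to the Table 1 values. It is an illustration only: the composite in the study was a factor score computed by the regression method, not a unit-weighted sum, and the function name is introduced here.

```python
import math


def unit_weighted_composite_validity(validities, intercorrelations):
    """Correlation between a criterion and the sum of standardized indicators.

    validities        -- correlation of each indicator with the criterion
    intercorrelations -- correlations among indicator pairs (each pair once)
    """
    k = len(validities)
    # Variance of a sum of k standardized variables: k plus twice the sum
    # of the pairwise correlations.
    variance_of_sum = k + 2.0 * sum(intercorrelations)
    return sum(validities) / math.sqrt(variance_of_sum)


# Table 1 values: correlations with ESI-100 for ERN, flanker P3, feedback P3,
# and the three intercorrelations among those indicators.
esi_validities = [-0.29, -0.37, -0.24]
erp_intercorrelations = [0.27, 0.24, 0.26]

composite_r = unit_weighted_composite_validity(esi_validities,
                                               erp_intercorrelations)
print(round(composite_r, 2))
```

The value expected for a simple unit-weighted composite, about −.42, is larger in magnitude than any single indicator's correlation with ESI-100 scores and close to the −.43 reported for the factor-score composite in Table 2.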
REFERENCES Bauer, L. O., & Hesselbrock, V. M. (1999). Subtypes of family history and conduct disorder: Effects on P300 during the Stroop test. Neuropsychopharmacology, 21, 51–62. Begleiter, H., Porjesz, B., Bihari, B., & Kissin, B. (1984). Event-related brain potentials in boys at risk for alcoholism. Science, 225, 1493–1496. Bernat, E. M., Nelson, L. D., Steele, V. R., Gehring, W. J., & Patrick, C. J. (2010). Externalizing psychopathology and brain responses to gain/ loss feedback in a simulated gambling task: Dissociable components of brain response revealed by time-frequency analysis. Manuscript under review. Biggins, C. A., MacKay, S., Clark, W., & Fein, G. (1997). Event-related potential evidence for frontal cortex effects of chronic cocaine dependence. Biological Psychiatry, 42, 472–485. Bruder, G. E., Kayser, J., Tenke, C. E., Leite, P., Schneier, F. R., Stewart, J. W., et al. (2002). Cognitive ERPs in depressive and anxiety disorders during tonal and phonetic oddball tasks. Clinical Electroencephalography, 33, 119–124. Bruder, G. E., Tenke, C. E., Stewart, J. W., Towey, J. P., Leite, P., Voglmaier, M., et al. (1995). Brain event-related potentials to complex tones in depressed patients: Relations to perceptual asymmetry and clinical features. Psychophysiology, 32, 373–381. Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105. Center for the Study of Emotion and Attention [CSEA-NIMH] (1999). The international affective picture system: Digitized photographs. Gainesville, FL: Author. Chiu, P. H., & Deldin, P. J. (2007). Neural evidence for enhanced error detection in major depressive disorder. American Journal of Psychiatry, 164, 608–616. Clark, J. P., & Tifft, L. L. (1966). Polygraph and interview validation of self-reported deviant behavior. American Sociological Review, 31, 516–523. 
Costa, L., Bauer, L., Kuperman, S., Porjesz, B., O’Connor, S., Hesselbrock, V., et al. (2000). Frontal P300 decrements, alcohol dependence, and antisocial personality disorder. Biological Psychiatry, 47, 1064–1071. Dehaene, S., Posner, M. I., & Tucker, D. M. (1994). Localization of a neural system for error detection and compensation. Psychological Science, 5, 303–305. Dien, J., Spencer, K. M., & Donchin, E. (2003). Localization of the event-related potential novelty response as defined by principal components analysis. Cognitive Brain Research, 17, 637–650. Dikman, Z. V., & Allen, J. J. B. (2000). Error monitoring during reward and avoidance learning in high- and low-socialized individuals. Psychophysiology, 37, 43–54. Eriksen, B., & Eriksen, C. (1974). Effects of noise letters upon the identification of a target letter in a non-search task. Perception and Psychophysics, 16, 143–149. Falkenstein, M., Hohnsbein, J., & Hoormann, J. (1991). Effects of crossmodal divided attention on late ERP components: II. Error processing in choice reaction tasks. Electroencephalography and Clinical Neurophysiology, 78, 447–455. Gehring, W. J., Goss, B., Coles, M. G. H., Meyer, D. E., & Donchin, E. (1993). A neural system for error detection and compensation. Psychological Science, 4, 385–390.
Gehring, W. J., Himle, J., & Nisenson, L. G. (2000). Action-monitoring dysfunction in obsessive-compulsive disorder. Psychological Science, 11, 1–6. Gehring, W. J., & Knight, R. T. (2000). Prefrontal-cingulate interactions in action monitoring. Nature Neuroscience, 3, 516–520. Gehring, W. J., & Willoughby, A. R. (2002). The medial frontal cortex and the rapid processing of monetary gains and losses. Science, 295, 2279–2282. Gorenstein, E. E., & Newman, J. P. (1980). Disinhibitory psychopathology: A new perspective and a model for research. Psychological Review, 87, 301–315. Gough, H. G. (1960). Theory and measurement of socialization. Journal of Consulting Psychology, 24, 23–30. Hajcak, G., Franklin, M. E., Foa, E. B., & Simons, R. F. (2008). Increased error-related brain activity in pediatric obsessive-compulsive disorder before and after treatment. American Journal of Psychiatry, 165, 116–123. Hall, J. R., Bernat, E. M., & Patrick, C. J. (2007). Externalizing psychopathology and the error-related negativity. Psychological Science, 18, 326–333. Hicks, B. M., Bernat, E., Malone, S. M., Iacono, W. G., Patrick, C. J., Krueger, R. F., et al. (2007). Genes mediate the association between P3 amplitude and externalizing disorders. Psychophysiology, 44, 98–105. Hindelang, M. J., Hirschi, T., & Weis, J. G. (1981). Measuring delinquency. Beverly Hills, CA: Sage. Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179–185. Hyman, S. M. (2007). Can neuroscience be integrated into the DSM-V? Nature Reviews: Neuroscience, 8, 725–732. Iacono, W. G. (1998). Identifying psychophysiological risk for psychopathology: Examples from substance abuse and schizophrenia research. Psychophysiology, 35, 621–637. Insel, T. R., & Scolnick, E. M. (2006). Cure therapeutics and strategic prevention: Raising the bar for mental health research. Molecular Psychiatry, 11, 11–17. Jasper, H. H. (1958). 
The ten-twenty electrode system of the International Federation. Electroencephalography and Clinical Neurophysiology, 10, 371–375. Katayama, J., & Polich, J. (1999). Auditory and visual P300 topography from a 3 stimulus paradigm. Clinical Neurophysiology, 110, 463–468. Kendler, K., Prescott, C., Myers, J., & Neale, M. (2003). The structure of genetic and environmental risk factors for common psychiatric and substance use disorders in men and women. Archives of General Psychiatry, 60, 929–937. Kim, M. S., Kim, J. J., & Kwon, J. S. (2001). Frontal P300 decrement and executive dysfunction in adolescents with conduct problems. Child Psychiatry and Human Development, 32, 93–106. Krueger, R. F. (1999). The structure of common mental disorders. Archives of General Psychiatry, 56, 921–926. Krueger, R. F., Hicks, B. M., Patrick, C. J., Carlson, S. R., Iacono, W. G., & McGue, M. (2002). Etiologic connections among substance dependence, antisocial behavior, and personality: Modeling the externalizing spectrum. Journal of Abnormal Psychology, 111, 411–424.
Krueger, R. F., Markon, K. E., Patrick, C. J., Benning, S., & Kramer, M. (2007). Linking antisocial behavior, substance use, and personality: An integrative quantitative model of the adult externalizing spectrum. Journal of Abnormal Psychology, 116, 645–666. Krueger, R. F., McGue, M., & Iacono, W. G. (2001). The higher-order structure of common DSM mental disorders: Internalization, externalization, and their connections to personality. Personality and Individual Differences, 30, 1245–1259. Linden, D. E. J. (2005). The P300: Where in the brain is it produced and what does it tell us? Neuroscientist, 11, 563–576. Miltner, W. H. R., Braun, C. H., & Coles, M. G. H. (1997). Event-related brain potentials following incorrect feedback in a time-estimation task: Evidence for a “generic” neural system for error detection. Journal of Cognitive Neuroscience, 9, 788–798. Mineka, S., Watson, D., & Clark, L. A. (1998). Comorbidity of anxiety and unipolar mood disorders. Annual Review of Psychology, 49, 377–412. Morgan, A. B., & Lilienfeld, S. O. (2000). A meta-analytic review of the relation between antisocial behavior and neuropsychological measures of executive function. Clinical Psychology Review, 20, 113–136. Nieuwenhuis, S., Aston-Jones, G., & Cohen, J. D. (2005). Decision making, the P3, and the locus coeruleus-norepinephrine system. Psychological Bulletin, 131, 510–532. Nye, F. I., & Short, J. F. Jr. (1957). Scaling delinquent behavior. American Sociological Review, 22, 326–331. Pailing, P. E., & Segalowitz, S. J. (2004). The error-related negativity as a state and trait measure: Motivation, personality, and ERPs in response to errors. Psychophysiology, 41, 84–95. Pailing, P. E., Segalowitz, S. J., Dywan, J., & Davies, P. L. (2002). Error negativity and response control. Psychophysiology, 39, 198–206. Patrick, C. J., Bernat, E., Malone, S. M., Iacono, W. G., Krueger, R. F., & McGue, M. (2006). 
P300 amplitude as an indicator of externalizing in adolescent males. Psychophysiology, 43, 84–92. Peterson, J. B., & Pihl, R. O. (1990). Information processing, neuropsychological function, and the inherited predisposition to alcoholism. Neuropsychology Review, 1, 343–369. Picton, T. W., Bentin, S., Berg, P., Donchin, E., Hillyard, S. A., Johnson, R. Jr., et al. (2000). Guidelines for using human event-related potentials to study cognition: Recording standards and publication criteria. Psychophysiology, 37, 127–152. Porjesz, B., Begleiter, H., & Garozzo, R. (1980). Visual evoked potential correlates of information processing deficits in chronic alcoholics. In H. Begleiter (Ed.), Biological effects of alcohol (pp. 603–623). New York: Plenum.
73 Potts, G. F., George, M. R. M., Martin, L. E., & Barratt, E. S. (2006). Reduced punishment sensitivity in neural systems of behavior monitoring in impulsive individuals. Neuroscience Letters, 397, 130–134. Ruchsow, M., Herrnberger, B., Beschoner, P., Gro¨n, G., Spitzer, M., & Kiefer, M. (2006). Error processing in major depressive disorder: Evidence from event-related potentials. Journal of Psychiatric Research, 40, 37–46. Santesso, D. L., Segalowitz, S. J., & Schmidt, L. A. (2005). ERP correlates of error monitoring in 10 year-olds are related to socialization. Biological Psychology, 70, 79–87. Semlitsch, H. V., Anderer, P., Schuster, P., & Presslich, O. (1986). A solution for reliable and valid reduction of ocular artifacts, applied to the P300 ERP. Psychophysiology, 23, 695–703. Sher, K. J., & Trull, T. (1994). Personality and disinhibitory psychopathology: Alcoholism and antisocial personality disorder. Journal of Abnormal Psychology, 103, 92–102. Skinner, A. (1982). The Drug Abuse Screening Test. Addictive Behaviors, 7, 363–371. Skinner, H. A., & Allen, B. A. (1982). Alcohol dependence syndrome: Measurement and validation. Journal of Abnormal Psychology, 91, 199–209. Spielberger, C. D. (1985). Assessment of state and trait anxiety: Conceptual and methodological issues. The Southern Psychologist, 2, 6–16. Vaidyanathan, U., Patrick, C. J., & Bernat, E. M. (2009). Startle reflex potentiation during aversive picture viewing as an indicator of trait fear. Psychophysiology, 46, 75–85. Venables, N. C., Bernat, E. M., Hall, J. R., Steffen, B. V., Cadwallader, M., Krueger, R. F., et al. (2005). Neurophysiological correlates of behavioral disinhibition: Separable contributions of distinct personality traits. Psychophysiology, 42, S126. Yanai, I., Fujikawa, T., Osada, M., Yamawaki, S., & Touhouda, Y. (1997). Changes in auditory P300 in patients with major depression and silent cerebral infarction. Journal of Affective Disorders, 46, 264– 271. 
Young, S., Stallings, M., Corley, R., Krauter, K., & Hewitt, J. (2000). Genetic and environmental influences on behavioral disinhibition. American Journal of Medical Genetics (Neuropsychiatric Genetics), 96, 684–695. Zung, W. W. K. (1965). A self-rating depression scale. Archives of General Psychiatry, 12, 63–70.
(Received September 8, 2009; Accepted February 3, 2010)
Psychophysiology, 48 (2011), 74–81. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01048.x
Testing asymmetries in noncognate translation priming: Evidence from RTs and ERPs
SOFIE SCHOONBAERT,a PHILLIP J. HOLCOMB,b JONATHAN GRAINGER,c and ROBERT J. HARTSUIKERa
a Department of Experimental Psychology, Ghent University, Ghent, Belgium
b Department of Psychology, Tufts University, Medford, Massachusetts, USA
c CNRS and Cognitive Psychology Laboratory, Aix-Marseille University, Marseille, France
Abstract
In this study, English–French bilinguals performed a lexical decision task while reaction times (RTs) and event-related potentials (ERPs) were measured to L2 targets, preceded by noncognate L1 translation primes versus L1 unrelated primes (Experiment 1a) and vice versa (Experiment 1b). The prime–target stimulus onset asynchrony was 120 ms. Significant masked translation priming was observed, indicated by faster reaction times and a decreased N400 for translation pairs as opposed to unrelated pairs, both from L1 to L2 (Experiment 1a) and from L2 to L1 (Experiment 1b), with the latter effect being weaker (RTs) and shorter lasting (ERPs). A translation priming effect was also found in the N250 ERP component, and this effect was stronger and earlier in the L2 to L1 priming direction than the reverse. The results are discussed with respect to possible mechanisms at the basis of asymmetric translation priming effects in bilinguals.
Descriptors: N250, N400, Bilingualism, Visual word recognition, Masked translation priming
This research was supported by the Research Foundation–Flanders (F.W.O.-Vlaanderen), of which the first author is a research assistant, and a grant by the National Institute of Health, awarded to the second and third authors (HD 043251 and HD 25889). We thank Marianna Eddy, Courtney Brown, and especially Katherine Midgley for technical assistance and help with the setup of the ERP experiments.
Address correspondence to: Sofie Schoonbaert, Department of Experimental Psychology, Ghent University, Henri Dunantlaan 2, 9000 Ghent, Belgium. E-mail: [email protected]

Although bilinguals have been the focus of study for years now, there is still much debate about how knowledge concerning each language is represented in long-term memory and how their representations interact. While many researchers agree that bilinguals’ first language (L1) might influence their second language (L2) processing, there is less of a consensus about L2 influences on L1. For instance, conflicting data have been obtained using the masked translation priming paradigm and lexical decision task to study L2 to L1 influences. Many studies have failed to find faster lexical decision times to L1 targets (e.g., BOY) when preceded by masked noncognate L2 translation primes (L2 translation of boy) than when preceded by an unrelated L2 word (e.g., Finkbeiner, Forster, Nicol, & Nakamura, 2004; Gollan, Forster, & Frost, 1997; Jiang, 1999; Jiang & Forster, 2001). However, some recent behavioral studies have found significant L2 to L1 priming effects (Basnight-Brown & Altarriba, 2007; Duñabeitia, Perea, & Carreiras, 2010; Duyck & Warlop, 2009; Perea, Duñabeitia, & Carreiras, 2008; Schoonbaert, Duyck, Brysbaert, & Hartsuiker, 2009; we refer to the latter study for a recent review of behavioral masked translation priming studies using the lexical decision task), suggesting that L1/L2 representational differences are quantitative rather than qualitative. However, because the L2 of learners of a second language is unlikely to be as strongly represented as their L1, priming from L1 to L2 should be stronger than priming from L2 to L1.

One limitation of the above mentioned behavioral studies is that they cannot separate out a semantic from a lexical locus of the effects. Holcomb and Grainger (2006) suggested a way to do this with event-related potentials (ERPs) in masked priming. Using ERPs, one can track the time course of language processing during priming more precisely, in order to explore if the priming effects originate at a lexical (earlier effects) or semantic level (later effects).

In some recent electrophysiological studies (Grainger, Kiyonaga, & Holcomb, 2006; Holcomb & Grainger, 2006, 2007), Holcomb and colleagues described a range of ERP components that are modulated in within-language repetition priming paradigms. Two of these components are of particular relevance for the present bilingual study. The first component, namely, the N400, is a negative-going component that peaks between 400 and 600 ms after target onset and is typically larger at middle and posterior scalp sites. In masked priming, this component is known to be reduced for targets preceded by repeated items, as opposed to targets preceded by unrelated items. Because the semantic representation of the target (e.g., BOY) is preactivated by an identity or repetition prime (boy), the N400 component, reflecting semantic integration (see Kutas & Federmeier, 2000; Kutas & Hillyard, 1980, 1984; Kounios & Holcomb, 1992, 1994), is less negative and thus reduced. Finding this N400 modulation in masked priming from L2 to L1 would
thus clearly indicate the use of a semantic route to transfer activation from L2 to L1, in other words, conceptual mediation.

A second ERP component that has recently been identified in masked repetition priming studies is the N250. This negative-going wave peaks around 250 ms. Its amplitude is most reduced (less negative) for targets preceded by identity primes and increases as the lexical overlap between targets and their preceding primes decreases (Holcomb & Grainger, 2006). Holcomb and Grainger (2006) proposed that the N250 reflects a process whereby prelexical orthographic representations are mapped onto lexical representations. It remains to be seen if N250 effects will be observed across languages (when using noncognate translation pairs; see below; Midgley, Holcomb, & Grainger, 2009). Following the Revised Hierarchical Model (RHM; Kroll & Stewart, 1994), it could be hypothesized that L2 to L1 priming will show stronger “lexical” N250 effects, yet less evidence of “semantic” N400 effects, than L1 to L2 priming, because the model posits that L2 has strong direct lexical connections with L1, whereas activation from L1 to L2 will heavily rely on semantic mediation and therefore should show larger semantic N400 effects.

The one neurophysiological study that investigated the translation priming paradigm used a semantic categorization task (Midgley et al., 2009). Midgley and colleagues tested unbalanced French–English bilinguals under masked translation priming conditions using a short 50-ms prime duration and 17-ms backward mask (i.e., 67 ms stimulus onset asynchrony [SOA]). Significant priming effects, indicated by a typical (i.e., more posterior) N400 change across conditions, were observed for L1 to L2 priming, but not for L2 to L1 priming. Interestingly, Midgley et al. also observed a modulation of the N250 in the L1 to L2 priming condition, but not in the reverse L2 to L1 condition.
This particular finding would seem to be inconsistent with the predictions of the RHM, where L2 to L1 lexical connections would be expected to result in lexical level priming as reflected in the N250. Furthermore, it is not simply the case that L2 primes were not able to produce priming effects, because Midgley et al. did observe significant N250 and N400 effects within the second language (i.e., priming from L2 to L2). Midgley et al. (2009) interpreted the L1-L2 translation priming effect seen in the N250 component as reflecting flow of activation from semantic representations, rapidly activated by the prime stimulus, back down to whole-word form representations in L2. They argued that such feedback operates quickly enough to modulate ongoing feedforward processing of L2 target words at the level of prelexical and lexical form representations. Midgley et al. also speculated that the lack of an L2-L1 priming effect in the N250 could have been due to the slower processing of L2 primes not allowing sufficient time to access semantic representations and feed information back to form-level representations in L1. Of course, no such feedback is necessary in order to obtain an N250 priming effect when both primes and targets are in L2. They therefore predicted that with longer prime durations, L2-L1 priming effects should be observable in the N250 component. The present study provides a direct test of this prediction.

In the present study we investigated masked translation priming effects under slightly different conditions from the Midgley et al. (2009) study. First, we used a lexical decision task rather than a semantic categorization task. Based on the literature, we might expect that masked translation priming effects are more elusive in the lexical decision task than the semantic categorization task (e.g., Grainger & Frenck-Mestre, 1998), because these effects are believed to have a semantic locus, and semantic categorization taps deeper into semantics than lexical decision. However, recent masked priming studies using the lexical decision task have found significant cross-language priming effects (e.g., Duyck & Warlop, 2009; Schoonbaert et al., 2009). Furthermore, previous monolingual masked priming ERP studies have also shown similar effects in the ERP signal whether participants were performing a semantic categorization or a lexical decision task (Kiyonaga, Grainger, Midgley, & Holcomb, 2007; see Grainger & Holcomb, 2009, for review). Nevertheless, it remains to be seen if evidence for L2 to L1 priming can be found in ERPs when the task focuses participants’ attention on lexical rather than semantic processes. The second difference relative to the Midgley et al. study is that we used a longer prime duration (100 ms vs. 50 ms) in order to give priming more opportunity to take effect, but we continued to use the masked priming paradigm to avoid strategic priming effects (see Altarriba & Basnight-Brown, 2007, for methodological recommendations in performing cross-language priming).

In short, the present study provides a further exploration of masked translation priming, with the specific aim of providing information about the time course of such priming effects from L1 to L2 and L2 to L1. Most important is that we used a longer prime duration (and thus a longer SOA) than in prior research that found little evidence for priming from L2 to L1. We investigate whether specific ERP components can provide evidence for the existence of the much debated L2 to L1 priming effect and its lexical or semantic locus. Finding an N400 effect in this condition would indicate early semantic activation in L2. Below, we report a test of same-script translation priming effects in both directions (L1 to L2, see Experiment 1a, and, more critically, L2 to L1, see Experiment 1b) with noncognates, using unbalanced English–French bilinguals living in an L1 environment.
EXPERIMENT 1A: TRANSLATION PRIMING FROM L1 TO L2 AT 120 MS SOA

Methods

Participants
Twenty English–French bilinguals (16 women; mean age = 19.85 years; SD = 0.99) from Tufts University participated in the experiment and were monetarily compensated for their time. Participants were all English native speakers and primarily used their mother tongue in daily life. All of them learned French in school and were currently enrolled in, or had recently finished, advanced French classes. None of them had learned French or any other second language before the age of 4. Mean age of the beginning of acquisition for French was 11.85 years (SD = 2.67). The number of months of immersion in a French-speaking environment ranged from 0.25 to 15 (mean = 4.39, SD = 3.62). Detailed measures of language proficiency based on participants’ self-ratings are shown in Table 1. All participants were right-handed (Edinburgh Handedness Inventory; Oldfield, 1971), and all reported having normal or corrected-to-normal vision with no history of neurological insult or language disability.

Table 1. Mean (SD) Self-ratings in L1 and L2 in Experiments 1a and 1b

Measure                   L1 (English) mean (SD)   L2 (French) mean (SD)
Reading ability           7.00 (0.00)              5.35 (0.59)
Speaking ability          7.00 (0.00)              5.33 (0.77)
Auditory comprehension    7.00 (0.00)              5.83 (0.85)
Overall proficiency       7.00 (0.00)              5.50 (0.76)

Note: 7-point Likert scale (1 = very poor; 7 = excellent).

Stimuli and Design
The critical stimuli in this experiment were 160 English–French translation pairs (all three to eight letter words; see the Appendix). The mean printed frequency for all French target words was 1.83 log10 per million and ranged from 0.45 to 2.98 (Lexique database of New, Pallier, Brysbaert, & Ferrand, 2004). The mean printed frequency for all English translation primes (used as targets in Experiment 1b) was 1.94 log10 per million and ranged from 0.30 to 3.04 (Celex lexical database of Baayen, Piepenbrock, & van Rijn, 1993). Cognate or interlingual homograph/homophone prime–target pairs, as well as overly polysemous words, were excluded from our stimulus lists. The French word targets could be preceded by their English translation or by an unrelated English word. Prime–target pairing was counterbalanced using a Latin-square design. We created unrelated prime–target pairs by reassigning related primes to different targets, thus creating four lists. Each participant was assigned to one list and consequently saw each target only once, either with the translation prime or with its control. However, all stimuli occurred in both the translation and the unrelated condition an equal number of times across participants. The order of prime–target trials was pseudorandomized. An important feature of this design is that the prime and target ERPs in the different conditions are formed from exactly the same physical stimuli (across subjects), which should reduce the possibility of ERP effects across conditions due to differences in physical features or lexical properties. The experiment involved one repeated measures factor, namely Prime Type (translation vs. unrelated).

Additionally, 160 nonwords were created that followed the French GPC rules, serving as French filler targets for the lexical decision task. These nonword targets were matched with the French word targets on number of letters, bigram frequency, and number of orthographic neighbors (all ps > .30, two-tailed t tests) in order to ensure their word-likeness and pronounceability. The WordGen stimulus generation program (Duyck, Desmet, Verbeke, & Brysbaert, 2004) was used for all matching purposes. All nonwords were preceded by English word primes.
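The counterbalancing scheme described above can be illustrated with a minimal Python sketch. It assumes two lists, the minimum needed for two prime conditions; the text reports four lists but does not say how they were derived, so the list count, the function name `build_lists`, and the example word pairs are all hypothetical (the pairs are not taken from the Appendix). Unrelated pairs are formed by reassigning primes among the targets that appear in the unrelated condition, so every prime and every target occurs exactly once per list.

```python
def build_lists(pairs):
    """Build two counterbalanced lists from (prime, target) translation pairs.

    Each list contains every target once. Across the two lists, each target
    appears once with its translation prime and once with an unrelated prime.
    This is an illustrative scheme, not the authors' actual procedure.
    """
    n = len(pairs)
    if n < 4 or n % 2 != 0:
        raise ValueError("need an even number of pairs, at least 4")
    lists = []
    for k in range(2):
        trials = []
        for i, (prime, target) in enumerate(pairs):
            if (i + k) % 2 == 0:
                trials.append((prime, target, "translation"))
            else:
                # Borrow the prime of the next item that is also in the
                # unrelated condition in this list (index i + 2, wrapping).
                other_prime = pairs[(i + 2) % n][0]
                trials.append((other_prime, target, "unrelated"))
        lists.append(trials)
    return lists


# Hypothetical English prime / French target pairs, for illustration only.
EXAMPLE_PAIRS = [
    ("girl", "FILLE"),
    ("tree", "ARBRE"),
    ("dog", "CHIEN"),
    ("house", "MAISON"),
]
list_a, list_b = build_lists(EXAMPLE_PAIRS)
```

In this sketch a participant assigned to `list_a` sees FILLE after its translation prime, whereas a participant assigned to `list_b` sees FILLE after an unrelated prime, so the same physical stimuli contribute to both conditions across participants.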
Procedure
Each trial consisted of a sequence of four visual events. First, a row of 10 hash marks [##########], serving as a forward mask and as a fixation mark, was presented for 500 ms. Second, the prime was displayed on the screen for 100 ms (10 refresh cycles at 100 Hz). Third, a backward mask [##########] was presented for 20 ms, creating a 120-ms SOA (see recommendations by Altarriba & Basnight-Brown, 2007; these authors stated that preferably SOAs below 200–300 ms should be used). Fourth, the target was presented for 500 ms. After each priming sequence, a blank interval of 1000 ms was presented, followed by a 2000-ms blink stimulus [(- -)]. Participants were asked to blink only when the blink stimulus was displayed. All stimuli were presented in Verdana font type as centered white characters with a black background on a standard 19-in. monitor, located 143 cm directly in front of the participant. Primes appeared in lowercase (font width 15, font height 30), whereas
targets were presented in uppercase (font width 20, font height 40) to minimize visual feature overlap between primes and targets. For the masks, the same font size as for the primes was used. Participants were asked to fixate the center of the screen and to decide as quickly and accurately as possible if the target stimulus was a French word or not. The two possible response buttons were the right key (for a “yes” response) and the left key (for a “no” response) of a millisecond-accurate game pad. The assignment of responses was reversed for half of the participants. Participants were not informed about the presence of the primes. Instructions were given in English (L1) by the experimenter (before the experiment). During the setup, participants filled out a handedness questionnaire (Edinburgh Handedness Inventory; Oldfield, 1971). After the experiment, participants were asked to complete a short questionnaire about their L2 learning age and L1 and L2 language proficiency (including self-ratings; see Table 1). They were also given a list of all L2 words in the experiment and were asked to type in the L1 translation. Mean performance on this posttranslation task was 88.39% correct (SD = 6.61, range 71.88% to 96.88%).

Event-Related Potential Recording Procedure
This study was run at the Neurocognition Lab at Tufts University, Medford, Massachusetts. Participants were seated in a comfortable chair in a sound-attenuating room. The electroencephalogram (EEG) was recorded from 29 active tin electrodes mounted on an elastic cap that was fitted on the participant’s scalp (Electro-cap International, Eaton, OH).
Additional electrodes were attached below the left eye (LE, to monitor for vertical eye movement or blinks), to the right of the right eye (HE, to monitor horizontal eye movement), over the left mastoid bone (used as reference), and over the right mastoid bone (recorded actively to monitor for differential mastoid activity; see Figure 1 for the electrode montage). All EEG electrode impedances were maintained below 5 kΩ (except the impedance for eye electrodes, which was less than 10 kΩ). The EEG (200-Hz sampling rate, bandpass 0.01–40 Hz) was recorded continuously.

Data Analysis
Averaged ERPs time-locked to target onset were formed off-line, excluding trials with ocular and muscular artifact (<0.57%). Trials with lexical decision errors, RTs below 200 ms and above 1500 ms, and posttranslation errors were also excluded from the RT and ERP analyses (18.56% of all data). One French item was unknown to all subjects, and therefore this item (as well as its translation in Experiment 1b) was excluded from all analyses (see the Appendix). ERP data from a representative subarray of the full 28-channel scalp montage were used for analysis. For the sake of clarity in presenting the results, we only report data from the sites where the effects are maximal. This included nine sites extending from the front to the back of the head as well as over left, center, and right hemisphere locations (see Figure 1). We have successfully used a similar approach to ERP data analysis in a number of previous reports (e.g., Grainger et al., 2006) and find it the best compromise between simplicity of design (a single ANOVA can be used in each analysis epoch) and a full description of the distribution of effects. For both behavioral (by subjects and by items) and ERP data, an ANOVA (per time window, see below) was performed with Prime Type (translation vs.
unrelated) as the repeated measures factor, treating mean reaction time, mean error percentages, and mean amplitude as the respective dependent variables. Additional scalp distribution factors of Electrode Laterality (left vs. center vs. right) and Front-to-Back Distribution (FP vs. C vs. O) were included in the analyses of ERP data. The Greenhouse and Geisser (1959) correction was applied to all repeated measures in the ERP analyses with more than one degree of freedom. The dependent measures in ERP analyses were the mean amplitude measurements in five consecutive time windows: 100–200 ms, 200–300 ms, 300–400 ms, 400–500 ms, and 500–600 ms. In previous work, similar windows have been used to assess activity in the N250/N300 and the N400 epochs (e.g., Eddy, Schmid, & Holcomb, 2006; Holcomb & Grainger, 2006). To get a detailed view on the scalp distribution across all electrodes, scalp maps of ERP difference waves (unrelated–translation) are presented (see Figures 3 and 4).

Figure 1. Electrode montage and nine sites used in analyses.

Results

Behavioral
French targets preceded by their English translation (583 ms) were recognized faster than those preceded by an unrelated English word (653 ms). This 70 ms (L1 to L2) priming effect was significant by subjects, F1(1,19) = 102.20, p < .001, and by items, F2(1,155) = 85.89, p < .001. There was a significant effect of Prime Type on the percentage of errors to words (7%), F1(1,19) = 22.31, p < .001, and F2(1,158) = 29.49, p < .001. French targets preceded by their English translation yielded fewer errors (4%) than those preceded by English unrelated primes (11%).
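As a rough illustration of the dependent measure used in the ERP analyses, the sketch below computes mean amplitudes in the five 100-ms windows and the unrelated minus translation difference for each window. The 200-Hz sampling rate and the window boundaries come from the text; the function names, the half-open window convention, and the assumption that the first sample falls at target onset are illustrative choices, not the authors' analysis code.

```python
SAMPLING_RATE_HZ = 200  # sampling rate reported in the recording procedure
WINDOWS_MS = [(100, 200), (200, 300), (300, 400), (400, 500), (500, 600)]


def window_means(erp, epoch_start_ms=0.0, rate_hz=SAMPLING_RATE_HZ,
                 windows=WINDOWS_MS):
    """Mean amplitude per time window for one averaged waveform.

    erp: sequence of amplitudes (e.g., in microvolts), one per sample.
    epoch_start_ms: latency of the first sample relative to target onset
        (assumed to be 0 here; a hypothetical convention).
    Windows are treated as half-open intervals [start, end).
    """
    step_ms = 1000.0 / rate_hz
    means = {}
    for start, end in windows:
        samples = [v for i, v in enumerate(erp)
                   if start <= epoch_start_ms + i * step_ms < end]
        if not samples:
            raise ValueError(f"no samples in window {start}-{end} ms")
        means[(start, end)] = sum(samples) / len(samples)
    return means


def priming_effect(unrelated, translation, **kwargs):
    """Unrelated minus translation mean amplitude in each window.

    Negative values mean the unrelated condition is more negative,
    which is the direction of the priming effects reported in the text.
    """
    u = window_means(unrelated, **kwargs)
    t = window_means(translation, **kwargs)
    return {window: u[window] - t[window] for window in u}
```

At 200 Hz each 100-ms window holds 20 samples, so a per-window mean of this kind would be computed for each participant, condition, and electrode site before entering the ANOVA.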
ERPs
ERPs for Prime Type conditions are plotted for the nine electrodes used in the analyses. For this experiment, ERPs can be found in the left panel of Figure 2. Figure 3 presents the voltage maps (formed from all 29 scalp sites) calculated by subtracting translation ERPs from unrelated ERPs in several different time windows. Significant effects are reported below, per 100-ms time window (from 100 ms to 600 ms after target onset) in order to best capture our results.

100- to 200-ms target epoch. Inspecting Figures 2 and 3, between 100 and 200 ms, clearly shows no effect of the priming manipulation (F < 1).

200- to 300-ms target epoch. Inspecting Figures 2 and 3, between 200 and 300 ms, shows a small L1 to L2 priming effect (unrelated more negative than translation), which peaks at about 250 ms and is largest over anterior sites. This observation is supported by a significant Prime Type × Front-to-Back Distribution interaction, F(2,38) = 7.60, p < .01.

300- to 400-ms target epoch. By inspecting Figures 2 and 3, a clear effect of priming can be seen at 350 ms. ANOVAs confirmed that this L1 to L2 priming effect (unrelated more negative than translation) was significant, F(1,19) = 21.12, p < .001.

400- to 500-ms target epoch. Figures 2 and 3 show very strong effects of priming (unrelated more negative than translation) at about 450 ms, being largest over the more posterior electrode sites. ANOVAs confirmed that the main L1 to L2 priming effect was significant, F(1,19) = 27.19, p < .001, as well as the interaction of L1 to L2 priming with Front-to-Back Distribution, F(2,38) = 22.29, p < .001.

500- to 600-ms target epoch. Figures 2 and 3 continue to show a clear L1 to L2 priming effect around 500–600 ms, although it appears mostly at posterior electrode sites. ANOVAs confirmed that there was a significant interaction between Prime Type and Front-to-Back Distribution, F(2,38) = 6.52, p < .05.

Figure 2. Event-related potentials time-locked to target onset in L1 to L2 translation priming conditions (1a) and L2 to L1 translation priming conditions (1b), plotted with the waveforms for their respective control conditions (Experiment 1). Note that target onset is marked by the vertical calibration bar and that negative is plotted up.

Figure 3. Voltage maps calculated from difference waves (unrelated–translation) in Experiment 1a (L1 to L2 priming) at each of five time points encompassing the ERP measurement windows reported in the text. Note that we have also included the voltage map at 500 ms because it shows most clearly the prolonged N400 to L2 targets.

EXPERIMENT 1B: TRANSLATION PRIMING FROM L2 TO L1 AT 120 MS SOA
Before providing a detailed discussion of the above mentioned data, we will present the data of the reverse priming direction, L2 to L1 (Experiment 1b). Experiment 1b used the same participants and stimuli (with primes and targets swapped) as Experiment 1a. Both experiments will then be discussed as one data set.

Methods

Participants
The same 20 English–French bilinguals who participated in Experiment 1a also participated in Experiment 1b.

Stimuli
Experiment 1b used the exact same critical stimuli as in Experiment 1a except that the primes and targets were swapped. The L1 translation primes of Experiment 1a now served as L1 target words, preceded by L2 translation primes (the L2 targets from Experiment 1a). Additional filler items (French word primes and English nonword targets) were created as in Experiment 1a.

Procedure
The procedure was identical to the procedure used in Experiment 1a. The order of the experiments was counterbalanced across subjects, with a lag of 2 weeks between the two experiments.

Data Analysis
Averaged ERPs time-locked to target onset were formed off-line, excluding trials with ocular and muscular artifact (<1.07%). Trials with lexical decision errors, RTs below 200 ms and above 1500 ms, and posttranslation errors were excluded (15.22% of all data).

Results
Behavioral
English targets preceded by their French translations (559 ms) were recognized faster than those preceded by unrelated French words (583 ms). This 24 ms priming effect was significant by subjects, F1(1,19) = 23.87, p < .001, and items, F2(1,155) = 6.38, p < .05. The L2 to L1 priming effect on the percentage of errors to words (1%) was not significant, F1(1,19) = 4.00, p < .06, and F2(1,158) = 1.08, p < .31. English targets preceded by their French translation yielded almost as few errors (3%) as those preceded by French unrelated primes (4%).

ERPs
ERPs for Prime Type conditions in this experiment are shown in the right panel of Figure 2. Figure 4 presents the voltage maps of difference waves (formed from all 29 scalp sites) across different time windows.

100- to 200-ms target epoch. Figures 2 and 4, between 100 and 200 ms, show no effect of the priming manipulation (p > .14) and no interaction between Prime Type and Front-to-Back Distribution (F < 1).

200- to 300-ms target epoch. Inspecting Figures 2 and 4, between 200 and 300 ms, shows a strong and widely distributed L2 to L1 priming effect (unrelated more negative than translation) peaking at about 250 ms. This observation is supported by a significant main effect of Priming, F(1,19) = 26.49, p < .001.

300- to 400-ms target epoch. By inspecting Figures 2 and 4, an effect of priming can be seen at 350 ms. ANOVAs confirmed that this L2 to L1 priming effect (unrelated more negative than translation) was significant, F(1,19) = 13.40, p < .01.

400- to 500-ms target epoch. Figures 2 and 4 show very strong effects of priming at about 450 ms over the more posterior electrode sites. ANOVAs confirmed that the L2 to L1 priming effect (unrelated more negative than translation) was significant, F(1,19) = 20.20, p < .001, as well as its interaction with Front-to-Back Distribution, F(2,38) = 34.00, p < .001.

500- to 600-ms target epoch. Figures 2 and 4 still show some of the L2 to L1 priming effect around 500 ms.
ANOVAs confirmed that there was a significant priming effect in this epoch, F(1,19) 5 6.53, po.05. Discussion The behavioral analyses showed a significant translation priming effect from L1 to L2 as well as from L2 to L1, although the latter
Noncognate translation priming
79 GENERAL DISCUSSION
Figure 4. Voltage maps calculated from difference waves (unrelated– translation) in Experiment 1b (L2 to L1 priming) at each of five time points encompassing the ERP measurement windows reported in the text. Note that we have also included the voltage map at 500 ms to draw a comparison with Figure 3.
effect was smaller (70 ms vs. 24 ms). An additional analysis across both experiments, adding Direction (L1-L2 vs. L2-L1) as a within-subjects factor, confirmed this traditional translation priming asymmetry, F1(1,19) 5 35.40, po.001, and F2(1,156) 5 40.90, po.001. This analysis also indicated that targets were recognized faster and more accurately in L1 than in L2 (all pso.05). This pattern of results is a replication of the data of Schoonbaert et al. (2009), where behavioral priming effects from L1 to L2 and vice versa ran to 100 ms and 19 ms, respectively, at 250 ms SOA and 28 ms and 12 ms at 100 ms SOA. The ERP analyses confirmed the existence of L1 to L2 priming effects as well as L2 to L1 priming effects. The effects start at about 250 ms, which is the typical N250 window (Holcomb & Grainger, 2006). We seem to observe a strong widely distributed N250 effect for the L2 to L1 priming condition (i.e., no interaction with distribution). There is also an N250 effect in the L1 to L2 condition, although it appears to be smaller than the L2 to L1 effect and was larger at anterior sites than posterior. A combined analysis confirmed this observation. The N250 was significantly smaller in the L1-L2 direction: Direction ! Prime Type interaction, F(1,19) 5 5.77, po.05. At about 450 ms, large N400 translation priming effects are observed for both priming directions. These effects have a typical N400 posterior distribution, which was confirmed in the combined analysis, Prime Type ! Front-toBack Distribution, F(2,38) 5 49.23, po.001, and showed to be equally strong in both priming directions (no significant Prime Type ! Direction interaction, Fo1). Translation priming effects are still visible early in the 500–600 ms time window, but are larger for the L1 to L2 direction of priming, a trend that was confirmed in the combined analysis, Direction ! Prime Type ! Front-to-Back Distribution interaction, F(2,38) 5 6.44, po.01. 
A latency analysis, including the 400–500-ms and 500–600-ms time windows, further confirmed the existence of a more sustained N400 effect when priming from L1 to L2 than vice versa, Latency × Direction × Prime Type × Front-to-Back Distribution interaction, F(2,38) = 17.51, p < .001. Follow-up analyses showed that the three-way interaction (Latency × Direction × Prime Type) was significant only at frontal electrodes, F(1,19) = 20.61, p < .01, and not at central and occipital sites, F(1,19) = 2.66, p < .12, and F < 1, respectively.
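The window-based measures discussed above can be illustrated with a minimal sketch. It assumes that a priming effect is quantified as the mean amplitude of the difference wave (unrelated minus translation) within a time window; the function names and the toy waveform values are hypothetical and are not the authors' data or analysis code.

```python
def difference_wave(unrelated, translation):
    """Pointwise difference wave: unrelated minus translation (microvolts)."""
    if len(unrelated) != len(translation):
        raise ValueError("waveforms must have the same number of samples")
    return [u - t for u, t in zip(unrelated, translation)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean amplitude of `wave` over samples with start_ms <= t < end_ms."""
    samples = [v for v, t in zip(wave, times_ms) if start_ms <= t < end_ms]
    if not samples:
        raise ValueError("no samples fall inside the window")
    return sum(samples) / len(samples)


# Hypothetical toy data: one sample every 100 ms from 0 to 600 ms.
times_ms = [0, 100, 200, 300, 400, 500, 600]
unrelated = [0.0, 0.5, -1.0, -2.0, -4.0, -3.0, -1.0]
translation = [0.0, 0.5, -0.5, -1.0, -1.5, -2.0, -1.0]

diff = difference_wave(unrelated, translation)
n400_effect = mean_amplitude(diff, times_ms, 400, 500)  # 400-500 ms window
late_effect = mean_amplitude(diff, times_ms, 500, 600)  # 500-600 ms window
```

In this toy example the difference is more negative in the 400–500-ms window than in the 500–600-ms window, which is the kind of contrast a latency analysis compares across priming directions.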
In this study, we tested masked translation priming for unique noncognate translation pairs with unbalanced English (L1)–French (L2) bilinguals engaging in a lexical decision task. Our key innovation was the inclusion of ERPs in this particular paradigm. Both behavioral and ERP measures were collected for the two priming directions (L1 to L2 and vice versa). We expected to find priming effects on the N400 component, as evidence for semantic activation across languages, and possibly effects on the N250 component as a measure of earlier lexical processing. To our knowledge, this is the first study to report masked cross-language priming effects with ERPs using a lexical decision task. We observed large posterior N400-priming effects (peaking at about 450 ms) in both priming directions. However, the L1 to L2 priming effect was longer lasting than the reverse effect. This probably reflects an N400-latency shift for L2 targets, due to slower processing of L2 targets. Furthermore, we observed strong and widely distributed N250-priming effects from L2 to L1, whereas the N250 effect for the reverse priming direction (L1 to L2) seemed to be less pronounced. This would appear to be strong evidence for what we will argue are both form-based (N250) and semantic (N400) effects of translation primes in L2 on target processing in L1.
The first main conclusion that can be drawn with respect to the present results when compared with prior research is that noncognate translation priming effects from L2 to L1 are robust when sufficient time is allowed for processing of the L2 prime. We therefore confirm Midgley et al.’s (2009) prediction that L2-L1 priming effects in relatively unbalanced bilinguals should emerge with longer prime exposures. This fits with the general hypothesis that the typical asymmetric pattern of translation priming effects as a function of priming direction is being driven by quantitative rather than qualitative differences in processing.
Such quantitative differences are likely related to the way in which amount of exposure to the L2 determines the speed with which L2 words are processed. Such an account is easily accommodated within the general framework of the BIA model (Dijkstra & van Heuven, 2002; Grainger & Dijkstra, 1992). The present results are consistent with recent behavioral studies showing significant masked translation priming from L2 to L1 when more balanced bilinguals were tested (Basnight-Brown & Altarriba, 2007; Duñabeitia et al., 2010) or when unbalanced bilinguals were allowed more time between the prime and the target (Duyck & Warlop, 2009; Schoonbaert et al., 2009). Therefore, increasing participants’ proficiency in L2 or increasing prime–target SOA can be thought of as having the same influence on the amount of processing of briefly presented L2 prime words. Increasing the SOA provides more time, and increasing L2 proficiency means that more processing can be performed for a fixed amount of time.
There is one aspect of the present results that contrasts with the pattern found by Midgley et al. (2009) using a shorter SOA. This is the fact that the N250 priming effect was actually stronger from L2 to L1 in the present study, whereas the more typical asymmetry (stronger effects from L1 to L2) was seen in the Midgley et al. study. We provide two tentative interpretations of this key finding that are not mutually exclusive. The first interpretation is based on the possibility that translation priming effects from L1 to L2 in the N250 component might actually get weaker as SOA is increased. Prior research with monolingual participants and a within-language repetition priming manipulation has indeed shown that N250 priming effects diminish as prime–target SOA is increased (Holcomb & Grainger, 2007). In the same study, no such decrease in N400 priming effects was seen. Holcomb and Grainger (2007) argued that although semantic representations must remain active for sentence-level integration processes, word form representations must be rapidly suppressed in order to clear the way for the processing of upcoming words (see Grainger & Jacobs, 1999, for a discussion of this mechanism). Such a reset mechanism operating on whole-word form representations would lead to the suppression of activity in any whole-word representation activated by the prime word, including its translation equivalent. Because priming effects in the N250 component are thought to reflect the mapping of prelexical form representations onto whole-word representations, these priming effects will be affected by the above described mechanism. According to this proposal, the relationship between the size of N250 priming effects and prime–target SOA is nonmonotonic, with a positive correlation up to some critical SOA value (corresponding to when the reset mechanism kicks in), followed by a decrease in the size of priming effects with further increases in SOA.
The stronger N250 translation priming effect from L2 to L1 than from L1 to L2 in the present study might also be driven by asymmetries in the connection strengths between the word form representations of translation equivalents, as postulated in the RHM (Kroll & Stewart, 1994). This pattern of priming effects would result from the stronger associations going from L2 to L1 than vice versa. In this framework, L2 primes will more rapidly activate the corresponding word form in L1 than L1 primes will activate their translation in L2. If one further assumes, following the RHM, that connection strengths from word forms to semantics are stronger in L1 than in L2, then a complete account of
the present findings emerges. Translation priming effects from L1 to L2 are driven mostly by semantic feedback (the L1 prime rapidly activates semantic representations that are compatible with the subsequent processing of the L2 translation equivalent), whereas L2-L1 priming effects are mostly driven by direct associations between word form representations (the L2 prime activates the corresponding word form representation in L1). Therefore, following Holcomb and Grainger’s (2006) interpretation of the modulation of the N250 and N400 ERP components seen in single word priming paradigms, L1-L2 translation priming effects will be mostly visible in the N400 whereas L2-L1 effects will be mostly visible in the N250, the precise pattern that was seen in the present study. The fact that Midgley et al. (2009) failed to find such a pattern suggests that a minimal amount of processing of L2 primes is necessary before the associative links with L1 word forms can be activated. The longer SOA used in the present study could therefore be critical for obtaining such priming effects from L2 to L1 with the specific population of bilinguals tested here.
To conclude, our study replicated recent behavioral translation priming studies by showing robust priming from L1 to L2 and vice versa and extended this finding to English–French unbalanced bilinguals performing a lexical decision. We also contributed to the existing literature by including ERP measures, which mirrored the behavioral results by showing clear N400-priming effects, indicating semantic involvement during priming in both directions. We found strong evidence for asymmetric N400 effects (i.e., smaller priming effects in the L2-L1 direction compared to L1-L2 effects), most likely caused by the 100-ms processing delay for L2 targets. Furthermore, we observed asymmetric N250 effects, possibly indicating traces of a strong lexical route of processing when priming from L2 to L1.
REFERENCES
Altarriba, J., & Basnight-Brown, D. M. (2007). Methodological considerations in performing semantic- and translation-priming experiments across languages. Behavior Research Methods, 39, 1–18.
Baayen, R. H., Piepenbrock, R., & van Rijn, H. (1993). The CELEX lexical database [CD-ROM]. Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania.
Basnight-Brown, D. M., & Altarriba, J. (2007). Differences in semantic and translation priming across languages: The role of language direction and language dominance. Memory & Cognition, 35, 953–965.
Dijkstra, T., & van Heuven, W. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.
Duñabeitia, J. A., Perea, M., & Carreiras, M. (2010). Masked translation priming effects with highly proficient simultaneous bilinguals. Experimental Psychology, 57, 98–107.
Duyck, W., Desmet, T., Verbeke, L., & Brysbaert, M. (2004). WordGen: A word selection and non-word generator tool for Dutch, German, English, and French. Behavior Research Methods, Instruments & Computers, 36, 488–499.
Duyck, W., & Warlop, N. (2009). Translation priming between the native language and a second language: New evidence from Dutch-French bilinguals. Experimental Psychology, 56, 173–179.
Eddy, M., Schmid, A., & Holcomb, P. J. (2006). A new approach to tracking the time course of object perception: Masked repetition priming and event-related brain potentials. Psychophysiology, 43, 564–568.
Finkbeiner, M., Forster, K. I., Nicol, J., & Nakamura, K. (2004). The role of polysemy in masked semantic and translation priming. Journal of Memory and Language, 51, 1–22.
Gollan, T. H., Forster, K. I., & Frost, R. (1997). Translation priming with different scripts: Masked priming with cognates and non-cognates in Hebrew-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1122–1139.
Grainger, J., & Dijkstra, T. (1992). On the representation and use of language information in bilinguals. In R. J. Harris (Ed.), Cognitive processing in bilinguals. Amsterdam: North Holland.
Grainger, J., & Frenck-Mestre, C. (1998). Masked translation priming in bilinguals. Language and Cognitive Processes, 13, 601–623.
Grainger, J., & Holcomb, P. J. (2009). Watching the word go by: On the time-course of component processes in visual word recognition. Language and Linguistics Compass, 3, 128–156.
Grainger, J., & Jacobs, A. M. (1999). Temporal integration of information in orthographic priming. Visual Cognition, 6, 461–492.
Grainger, J., Kiyonaga, K., & Holcomb, P. J. (2006). The time-course of orthographic and phonological code activation. Psychological Science, 17, 1021–1026.
Greenhouse, S., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24, 95–112.
Holcomb, P. J., & Grainger, J. (2006). On the time course of visual word recognition: An event-related potential investigation using masked repetition priming. Journal of Cognitive Neuroscience, 18, 1631–1643.
Holcomb, P. J., & Grainger, J. (2007). The effects of stimulus duration and prime-target SOA on ERP measures of masked repetition priming. Brain Research, 1180, 39–58.
Jiang, N. (1999). Testing processing explanations for the asymmetry in masked cross-language priming. Bilingualism: Language and Cognition, 2, 59–75.
Jiang, N., & Forster, K. I. (2001). Cross-language priming asymmetries in lexical decision and episodic recognition. Journal of Memory and Language, 44, 32–51.
Kiyonaga, K., Grainger, J., Midgley, K. J., & Holcomb, P. J. (2007). Masked cross-modal repetition priming: An event-related potential investigation. Language and Cognitive Processes, 22, 337–376.
Kounios, J., & Holcomb, P. J. (1992). Structure and process in semantic memory: Evidence from event-related brain potentials and reaction times. Journal of Experimental Psychology: General, 121, 459–479.
Kounios, J., & Holcomb, P. J. (1994). Concreteness effects in semantic processing: Event-related brain potential evidence supporting dual-coding theory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 804–823.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4, 463–470.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
Kutas, M., & Hillyard, S. A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307, 161–163.
Midgley, K., Holcomb, P. J., & Grainger, J. (2009). Masked repetition priming and translation priming in second language learners: A window on the time-course of form and meaning using ERPs. Psychophysiology, 46, 551–565.
New, B., Pallier, C., Brysbaert, M., & Ferrand, L. (2004). Lexique 2: A new French lexical database. Behavior Research Methods, Instruments & Computers, 36, 516–524.
Oldfield, R. C. (1971). The assessment and analysis of handedness: The Edinburgh inventory. Neuropsychologia, 9, 97–113.
Perea, M., Duñabeitia, J. A., & Carreiras, M. (2008). Masked associative/semantic priming effects across languages with highly proficient bilinguals. Journal of Memory and Language, 58, 916–930.
Schoonbaert, S., Duyck, W., Brysbaert, M., & Hartsuiker, R. J. (2009). Semantic and translation priming from a first language to a second and back: Making sense of the findings. Memory & Cognition, 37, 569–586.
(Received November 4, 2009; Accepted January 30, 2010)
APPENDIX
Table A1. English–French Translation Pairs, Used as Critical Stimuli in Experiments 1a and 1b
No.  English (L1)  French (L2)
1  advice  conseil
2  anger  colère
3  another  autre
4  apple  pomme
5  beach  plage
6  belief  croyance
7  belt  ceinture
8  better  mieux
9  bird  oiseau
10  boat  bateau
11  book  livre
12  boredom  ennui
13  boy  garçon
14  brain  cerveau
15  breast  sein
16  broken  cassé
17  brother  frère
18  butter  beurre
19  cake  gâteau
20  candle  bougie
21  care  soin
22  castle  château
23  century  siècle
24  cheese  fromage
25  chicken  poulet
26  child  enfant
27  chin  menton
28  church  église
29  cloud  nuage
30  coal  charbon
31  coat  manteau
32  curtain  rideau
33  disgust  dégoût
34  dish  assiette
35  dream  rêve
36  duck  canard
37  early  tôt
38  empty  vide
39  english  anglais
40  faith  foi
41  fame  renom
42  father  père
43  fear  peur
44  fire  feu
45  fish  poisson
46  foot  pied
47  girl  fille
48  glove  gant
49  goat  chèvre
50  god  dieu
51  goodness  bonté
52  guilty  coupable
53  happy  heureux
54  hatred  haine
55  health  santé
56  heart  coeur
57  heavy  lourd
58  heel  talon
59  hell  enfer
60  help  secours
61  hill  colline
62  hole  trou
63  hope  espoir
64  house  maison
65  hunger  faim
66  hunter  chasseur
67  husband  mari
68  illness  maladie
69  joke  blague
70  key  clé
71  kitchen  cuisine
72  knee  genou
73  knife  couteau
74  last  dernier
75  late  tard
76  law  loi
77  leaf  feuille
78  leather  cuir
79  leg  jambe
80  less  moins
81  level  niveau
82  life  vie
83  loss  perte
84  lost  perdu
85  love  amour
86  meat  viande
87  milk  lait
88  monkey  singe
89  month  mois
90  mood  humeur
91  moon  lune
92  mouth  bouche
93  nail  ongle
94  need  besoin
95  needle  aiguille
96  new  nouveau
97  next  prochain
98  noise  bruit
99  nothing  rien
100  old  vieux
101  peace  paix
102  poor  pauvre
103  pride  fierté
104  queen  reine
105  rabbit  lapin
106  rain  pluie
107  reminder  rappel
108  ring  anneau
109  river  fleuve
110  roof  toit
111  school  école
112  screen  écran
113  shame  honte
114  sheep  mouton
115  shirt  chemise
116  shoulder  épaule
117  sick  malade
118  sight  vue
119  silk  soie
120  sin  péché
121  sister  soeur
122  size  taille
123  skin  peau
124  skirt  jupe
125  sleeve  manche
126  slippery  glissant
127  snow  neige
128  soap  savon
129  soon  bientôt
130  soul  âme
131  speed  vitesse
132  state  état
133  stone  pierre
134  tail  queue
135  taste  goût
136  tear  larme
137  thought  pensée
138  ticket  billet
139  tomorrow  demain
140  tree  arbre
141  truce^a  trêve^a
142  truck  camion
143  truth  vérité
144  ugliness  laideur
145  unknown  inconnu
146  useless  inutile
147  wait  attente
148  weak  faible
149  wealth  richesse
150  week  semaine
151  weight  poids
152  welcome  bienvenu
153  wheel  roue
154  window  fenêtre
155  wing  aile
156  wisdom  sagesse
157  wish  souhait
158  worse  pire
159  worthy  digne
160  young  jeune
^a Item excluded based on posttranslation data.
Psychophysiology, 48 (2011), 82–95. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01035.x
The oft-neglected role of parietal EEG asymmetry and risk for major depressive disorder
JENNIFER L. STEWART,a DAVID N. TOWERS,b JAMES A. COAN,c and JOHN J. B. ALLENa
a Psychology Department, University of Arizona, Tucson, Arizona
b Psychology Department, University of Illinois at Urbana-Champaign, Champaign, Illinois
c Psychology Department, University of Virginia, Charlottesville, Virginia
Abstract
Relatively less right parietal activity may reflect reduced arousal and signify risk for major depressive disorder (MDD). Inconsistent findings with parietal electroencephalographic (EEG) asymmetry, however, suggest issues such as anxiety comorbidity and sex differences have yet to be resolved. Resting parietal EEG asymmetry was assessed in 306 individuals (31% male) with (n = 143) and without (n = 163) a DSM-IV diagnosis of lifetime MDD and no comorbid anxiety disorders. Past MDD+ women displayed relatively less right parietal activity than current MDD+ and MDD− women, replicating prior work. Recent caffeine intake, an index of arousal, moderated the relationship between depression and EEG asymmetry for women and men. Findings suggest that sex differences and arousal should be examined in studies of depression and regional brain activity.
Descriptors: EEG asymmetry, Depression, Endophenotype
In recent years, researchers have examined the brain mechanisms involved in cognitive and emotional disturbances in depressed individuals to identify endophenotypes, biological markers of risk that may improve diagnosis and treatment of major depressive disorder (MDD) (e.g., Hasler, Drevets, Manji, & Charney, 2004; Mayberg, 2003). Although the research spotlight often focuses on the prefrontal and anterior cingulate cortices, the parietal cortex has also been implicated in depression-related attention and executive function deficits in both cognitive and emotional tasks and is thus a valuable candidate for study (Liotti & Mayberg, 2001; Mayberg, 1997). It has been argued that depression is particularly associated with impaired right parietal cortex function, reflecting reduced arousal and impaired processing of emotional stimuli (e.g., Bruder, 2003; Heller, 1993; Heller & Nitschke, 1997). This impairment is evident on neuropsychological tests of perceptual asymmetry (e.g., Heller, Etienne, & Miller, 1995; Keller et al., 2000) and in event-related potential studies of emotional perception (Deldin, Keller, Gergen, & Miller, 2000; Kayser, Bruder, Tenke, Stewart, & Quitkin, 2000), lateralized auditory processing (e.g., Bruder et al., 1995; Bruder, Wexler, Stewart, Price, & Quitkin, 1999; Bruder et al., 2002) and spatial task performance (Henriques & Davidson, 1997; Rabe, Debener, Brocke, & Beauducel, 2005), suggesting that right parietal hypoactivity in depression may manifest under many conditions, particularly for tasks that require right hemisphere processing.
Rather than simply serving as a state index of depression status, right parietal hypoactivity may instead represent an endophenotype of depression that could provide insight into mechanisms involved in risk for depression. Relatively lower right resting parietal electroencephalogram (EEG) activity (inferred by relatively greater right alpha band activity; see Allen, Coan, & Nazarian, 2004) distinguishes both symptomatic and remitted depressed individuals from never-depressed individuals (Blackhart, Minnix, & Kline, 2006; Bruder et al., 1997; Henriques & Davidson, 1990; Kentgen et al., 2000), is prominent in family members of MDD patients (Bruder et al., 2005; Bruder, Tenke, Warner, & Weissman, 2007), and is linked with other indices of depression risk such as low positive emotionality (Hayden et al., 2008; Shankman et al., 2005), suggesting that parietal EEG asymmetry may also be a psychophysiological indicator for depression risk. Consistent with this hypothesis, parietal brain asymmetry demonstrates reliable trait-like properties in clinical and nonclinical samples (e.g., Debener et al., 2000; Hagemann, Naumann, Thayer, & Bartussek, 2002; Vuga et al., 2006). In contrast to frontal asymmetry, for which not quite 60% of the variance across recording sessions appears to reflect a stable trait, approximately 70% of the variance in parietal asymmetry reflects a stable trait (Hagemann et al., 2002), and parietal asymmetry shows convergence across EEG reference montages (Hagemann, Naumann, & Thayer, 2001; Henriques & Davidson, 1990; but see Reid, Duke, & Allen, 1998 and Tomarken, Dichter, Garber, & Simien, 2004).
This research was supported in part by grants from the National Institutes of Health (R01–MH066902) and the National Alliance for Research on Schizophrenia and Depression (NARSAD) to John Allen. The authors wish to thank Andrew Bismark, Eliza Fergerson, Jamie Velo, Dara Halpern, Craig Santerre, Eynav Accortt, Amanda Brody, and Jay Hegde for assistance with subject recruitment, and myriad research assistants who helped to collect and review EEG data.
Address correspondence to: John J. B. Allen, Department of Psychology, University of Arizona, 1503 E. University Ave., Room 312, Tucson, AZ 85721-0068. E-mail: [email protected]
Several resting EEG studies, however, have failed to confirm an association between right parietal hypoactivity and depression (e.g., Debener et al., 2000; Deslandes et al., 2008; Henriques & Davidson, 1991; Mathersul, Williams, Hopkinson, & Kemp, 2008; Nitschke, Heller, Palmieri, & Miller, 1999). Furthermore, within high risk samples, infants of depressed mothers have not displayed less right than left parietal activity (e.g., Dawson, Frey, Panagiotides, Osterling, & Hessl, 1997; Diego et al., 2004; Field, Fox, Pickens, & Nawrocki, 1995; Jones, Field, Fox, Lundy, & Davalos, 1997; Jones et al., 1998), and another study showed that adolescents with depressed mothers exhibited relatively greater, not less, right parietal activity than their low-risk counterparts (Tomarken et al., 2004).
Inconsistent results may be due to a number of factors, including small patient samples, diagnostic heterogeneity and anxiety comorbidity, depression recruitment strategies, and sex differences in depression and/or EEG asymmetry (e.g., Davidson, 1998; Heller & Nitschke, 1998). With respect to patient samples, a few studies demonstrate effects in the predicted direction but do not reach significance, likely due to the limited number of depressed patients included (e.g., Allen, Iacono, Depue, & Arbisi, 1993). Significant parietal EEG asymmetry results between pure MDD patients and controls tend to possess a medium effect size (e.g., Bruder et al., 1997; Reid et al., 1998), requiring a substantial number of subjects to detect group differences, which may explain null results for studies with small sample sizes (e.g., Henriques & Davidson, 1991).
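The sample-size argument can be made concrete with a rough calculation. The sketch below is an illustration only: it assumes Cohen's conventional value of d = 0.5 for a "medium" effect, a two-sided two-group comparison at alpha = .05 with 80% power, and the standard normal approximation; none of these values is stated in the text, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2.
    """
    if d <= 0:
        raise ValueError("effect size d must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


medium = n_per_group(0.5)  # assumed "medium" effect (Cohen's d = 0.5)
large = n_per_group(0.8)   # assumed "large" effect, for comparison
```

Under these assumptions roughly 63 participants per group are needed for a medium effect, compared with about 25 per group for a large effect. Samples of 15 patients and 13 controls, as in Henriques and Davidson (1991), fall well short of that.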
Conflicting results across studies may also be a result of heterogeneity of depressed samples, perhaps due to subtypes of depression such as seasonal affective disorder (e.g., Allen et al., 1993; Volf & Passynkova, 2002), anhedonic depression (Nitschke et al., 1999), and depression co-occurring with types of comorbid anxiety that are associated with opposing patterns of brain asymmetry compared to those displayed by non-anxious depressed individuals (Heller & Nitschke, 1998). For example, anxious apprehension (worry) has been linked to relatively less right hemisphere activity, and anxious arousal (somatic symptoms of anxiety) to relatively more right hemisphere activity (e.g., Heller, Nitschke, Etienne, & Miller, 1997; Nitschke et al., 1999), patterns that could potentially exaggerate or cancel out relatively lower right parietal activity in depression. Consistent with this proposition research indicates that: (1) individuals with MDD and at least one anxiety disorder display relatively more right parietal activity than MDD patients without anxiety (Bruder et al., 1997); (2) comorbid anxiety disorders in adolescents with MDD were associated with relatively greater right parietal activity (Kentgen et al., 2000); and (3) comorbid anxious arousal and depression symptoms were linked to relatively greater right parietal activity in patients with posttraumatic stress disorder (Metzger et al., 2004). Differences between pure and comorbid depressed individuals, however, are not consistently found (Mathersul et al., 2008), potentially due to the type of recruitment strategy used to obtain depressed individuals (i.e., on the basis of DSM-IV diagnoses versus questionnaires measuring symptoms of depression and anxiety). 
Null results are apparent in some studies using questionnaires measuring current depressive symptoms (e.g., Deslandes et al., 2008; Diego, Field, & Hernandez-Reif, 2001; Harmon-Jones et al., 2002; Nitschke et al., 1999; Reid et al., 1998; Schaffer, Davidson, & Saron, 1983), consistent with research indicating that some depression scales may also index anxiety (Nitschke, Heller, Imig, McDonald, & Miller, 2001), and may cancel out lateralization effects associated with depression. Table 1 summarizes studies examining the relationship between parietal EEG asymmetry and depression.
In addition to heterogeneity in symptom presentation, sex differences in depression may influence patterns of regional brain activity. Most resting EEG asymmetry studies of depression that examined differences in parietal activity have utilized only female samples (Allen et al., 1993; Diego et al., 2001; Graae et al., 1996; Kentgen et al., 2000; Reid et al., 1998; Volf & Passynkova, 2002) or predominantly female samples, thereby lacking the power to reliably examine sex differences (Blackhart et al., 2006; Debener et al., 2000; Deslandes et al., 2008; Henriques & Davidson, 1991; Nitschke et al., 1999; Schaffer et al., 1983). Although there is some evidence that sex differences may be an important factor in frontal EEG asymmetry and its relationship to depression (e.g., Miller et al., 2002; Tomarken et al., 2004), parietal asymmetry (the focus of this report) has not been examined with respect to sex differences (with the exception of Bruder et al., 2007, who found no differences in parietal asymmetry between women and men with and without risk for depression). Examining sex differences in depression and brain asymmetry is important, since depressed men and women appear to display opposing patterns of frontal EEG activity that may be differentially associated with risk for depression (e.g., Miller et al., 2002; Stewart, Bismark, Towers, Coan, & Allen, in press). Null results found in studies of clinical MDD that pool similar numbers of women and men could be due to opposing patterns of parietal activity that cancel out when sex effects are not examined (e.g., Henriques & Davidson, 1991). Furthermore, EEG asymmetry results in the unpredicted direction (relatively higher right parietal activity predicting depression symptoms one year later; Pössel, Lo, Fritz, & Seeman, 2008) could be due to a higher number of male than female participants included in the study.
The present investigation addressed these issues of sample-specific variability by recruiting a substantial sample of depressed individuals (31% male) without comorbid anxiety disorders to examine whether relatively lower right parietal activity at rest would similarly characterize women and men with a lifetime diagnosis of MDD. Additional analyses determined whether lifetime MDD results were due to a diagnosis of current MDD versus past MDD. Symptoms of depression and anxiety were included in analyses to (1) confirm that EEG asymmetry findings were not simply due to current distress among those with lifetime MDD, and (2) attempt to replicate null EEG asymmetry findings using dimensional questionnaire measures of depression. In addition, since parietal EEG asymmetry is thought to reflect arousal-related processes, the present study examined whether an index of arousal (recent caffeine intake) moderated the relationship between parietal EEG asymmetry and depression in men and women. Resting EEG was collected eight times, twice per day on four separate days, to ensure measurement of trait-related variance associated with parietal asymmetry. In addition, asymmetry scores were calculated for four reference derivations (average, current source density, Cz, and linked mastoid) to replicate research demonstrating convergent results for parietal asymmetry across EEG reference montages (e.g., Hagemann et al., 2001; Henriques & Davidson, 1990).
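As an illustration of how such asymmetry scores might be computed, the sketch below assumes the common convention of taking the difference of natural-log alpha power at homologous sites, ln(right) − ln(left), for example P4 and P3, and averaging across the eight resting sessions. The passage above does not state the formula, so this convention, the function names, and the alpha power values are assumptions for illustration only.

```python
import math


def asymmetry_score(right_alpha_power, left_alpha_power):
    """ln(right) - ln(left) alpha power for a homologous pair such as P4/P3."""
    if right_alpha_power <= 0 or left_alpha_power <= 0:
        raise ValueError("alpha power must be positive")
    return math.log(right_alpha_power) - math.log(left_alpha_power)


def trait_asymmetry(session_pairs):
    """Average the asymmetry score across repeated resting sessions."""
    scores = [asymmetry_score(r, l) for r, l in session_pairs]
    if not scores:
        raise ValueError("at least one session is required")
    return sum(scores) / len(scores)


# Hypothetical (P4, P3) alpha power values for eight resting sessions.
sessions = [(4.0, 2.0)] * 4 + [(3.0, 3.0)] * 4
score = trait_asymmetry(sessions)
```

Because greater alpha power is taken to indicate less cortical activity, a positive score under this convention corresponds to relatively less right than left parietal activity.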
Method
Participants
A total of 306 participants (95 male, 73% Caucasian; also reported in Stewart et al., in press) with an age range of 17 to 34 years (M = 19.1, SE = 0.1) were enrolled in the study from a
4 women with bipolar seasonal affective disorder (SAD); 4 female control subjects 28 (23 female)
Allen et al. (1993)
Debener 15 current MDD et al. (2000) (10 female); 22 controls (15 female)
23–64
33 right; 4 left
LM; P4-P3
LM; P4-P3
13–15 months
Not reported
LE; P4-P3, P8-P7
Right
Offspring with 2 MDD relatives M = 15.4, SD = 4.7; offspring with 1 MDD relative M = 13.6, SD = 6.2; controls M = 10.6, SD = 4.5
Cz, Nose; P4-P3, P8-P7
LE; P4-P3
Cz; P4-P3
Reference scheme and parietal sites
LE; P4-P3, P8-P7
8–50
Bruder et al. 18 offspring of 2 (2005) parents with MDD (10 female); 40 offspring of 1 parent with MDD (25 female); 29 controls with no MDD parents (18 female) Bruder et al. 19 (11 female) offspring (2007) having parent and grandparent with MDD; 14 (6 female) offspring having parent or grandparent with MDD; 16 (9 female) offspring with neither parent or grandparent with MDD Dawson 117 infants (52 female), et al. (1997) 54 with MDD mothers and 63 non-MDD mothers
50 right and 10 left
Right
Right
Handedness
Right
20–60
18–25
Pre-menopausal
Age
Bruder et al. 19 anxious-depressed (1997) (9 female); 25 nonanxious depressed (13 female); 26 control subjects (13 female)
Blackhart et al. (2006)
Sample
Citation
Table 1. Studies of Resting Parietal EEG Asymmetry and Depression
ICD criteria
DSM-III-R criteria
DSM-IV criteria
DSM-IV criteria
DSM-III-R criteria
BDI
DSM-III-R criteria
No significant findings
No significant findings
Yes
Yes
Offspring of 2 MDD relatives ↓ RPA than other 2 groups
Yes
Anxious-depressed ↑ RPA than LPA; non-anxious depressed ↓ RPA than LPA; controls showed parietal symmetry; non-anxious depressed ↓ RPA than anxious-depressed
Yes
Offspring of 2 parents with MDD ↓ RPA than offspring of 1 parent with MDD and controls
Yes
Yes; results remain when subjects with lifetime anxiety diagnoses are removed from analysis
Yes; anxiety symptoms did not moderate results
Yes; participants had no reported history of psychopathology; anxiety symptoms did not predict parietal asymmetry Yes; addressed in main analysis with anxiety groups; also found no relationship between STAI and asymmetry
No
Comorbid anxiety analysis
Yes; male infants of non-MDD mothers ↑ RPA than female infants; female infants of MDD mothers ↑ RPA than male infants of MDD mothers
Yes, anxiety accounted for in analyses
No
No
Yes, no effects involving gender found
No
No
Yes, no effects involving gender found
No
Hemisphere difference analysis
Gender analysis
SAD group marginally ↓ RPA than control group No
↑ BDI scores predicted ↓ RPA No
Primary assessment of depression
Parietal results summary
Jones et al. (1998) 1 week
1–3 months
20 infants of depressed mothers; 24 infants of non-depressed mothers (gender not reported) 35 infants (57.9% male) of depressed mothers; 28 infants
Jones et al. (1997)
Not reported
Not reported
Cz, P4-P3
Cz; P4-P3
AVG, Cz, LE; P4-P3
31–57
Henriques & Davidson (1991) 15 MDD patients (8 female); 13 controls (9 female) Right
AVG, Cz, LM; P4-P3
Right Depressed M = 37.4, SD = 9.5; Controls M = 34.7, SD = 3.4
Henriques & Davidson (1990) 6 euthymic depressed patients (5 female); 8 controls (6 female)
Not reported
LE; P4-P3
72 (37 female)
Right
Harmon-Jones et al. (2002)
Right
12–17
Nose; P4-P3
Cz; P4-P3
Right-handed parents
3–6 months
Cz; P4-P3
Cz; P4-P3
83% right
M = 23 (SD = 5)
LE; P4-P3
Reference scheme and parietal sites
M = 1.7 weeks (SD = 0.8 weeks) Not reported
Right
> 60
Deslandes et al. (2008) 22 depressed and 14 controls
Diego et al. (2001) 163 women
Diego et al. (2004) Babies of 20 prepartum and postpartum depressed mothers (58% male), 20 prepartum depressed mothers (35% male), 20 postpartum depressed mothers (40% male), and 20 non-depressed mothers (40% male)
Field et al. (1995) 17 infants of depressed mothers; 17 infants of non-depressed mothers
Graae et al. (1996) 16 female suicide attempters; 22 female controls
Handedness
Age
Sample
Citation
Table 1. (Contd.)
Schedule for Affective Disorders and Schizophrenia according to Research Diagnostic Criteria Schedule for Affective Disorders and Schizophrenia according to Research Diagnostic Criteria Center for Epidemiological Studies Depression Scale Center for Epidemiological
General Behavior Inventory Depression scale
BDI and Diagnostic Interview Schedule for Children BDI and Diagnostic Interview Schedule for Children
Center for Epidemiological Studies Depression Scale
Center for Epidemiological Studies Depression Scale
DSM-IV criteria
Yes
↓ RPA linked to higher suicidal intent, not depressive symptoms
No significant findings
No significant findings
No significant findings
No significant findings
No
No
Yes
Yes
Yes
No
No
No
Reported for frontal, not parietal EEG analyses No Yes No
Yes
MDD group ↓ RPA than control group
No
Reported for frontal, not parietal EEG analyses No No No
Yes
Hemisphere difference analysis
Gender analysis
No significant findings
No significant findings
No significant findings No significant findings
Primary assessment of depression
Parietal results summary
No
No
No
No
Accounted for baseline reported fear
No
No
No
No
No
Comorbid anxiety analysis
(46.2% male) of non-depressed mothers
11 women with current MDD and comorbid anxiety disorder; 8 women with current MDD and no anxiety disorder; 6 women with no current MDD and an anxiety disorder; 10 female controls
428 (214 female) separated into Normal (n = 52), Depressed (n = 52), Anxious (n = 52) and Comorbid (n = 52) groups
50 female Vietnam War nurse veterans, 18 with current PTSD (9 with comorbid lifetime MDD), 14 with past PTSD (7 with comorbid lifetime MDD), and 18 with no PTSD (5 with lifetime MDD). All groups included other comorbid anxiety disorders.
55 with child onset MDD or dysthymia (28 female); 55 controls (38 female)
Sample
Nitschke et al. (1999) 9 anxious apprehension (6 female), 19 anxious arousal (10 female), 12 depressed (9 female), 13 comorbid (6 female), 14 (8 female)
Pössel et al. (2008) 80 adolescents (35 female)
Miller et al. (2002)
Metzger et al. (2004)
Mathersul et al. (2008)
Kentgen et al. (2000)
Citation
Table 1. (Contd.)
Nose; P4-P3
Right
13–15
AVG; P4-P3, P8-P7
LE; P4-P3
LM; P4-P3
Right
M = 53.7, SD = 2.8
Studies Depression Scale
DSM-IV criteria Subjects with MDD but no anxiety disorder showed ↓ RPA than LPA
Primary assessment of depression
Parietal results summary
Yes
Depression Screening Questionnaire and Self-Rating Questionnaire for Depressive Disorders
Mood and Anxiety Symptom Questionnaire, Anhedonic Depression scale
DSM-III and DSM-IV
No significant findings for depressed group once 2 male outliers were removed
↑ RPA predicted depression symptoms 12 months later
No significant findings
No
Yes
Yes
DSM-IV; Clinician-Administered PTSD Scale; Symptom Checklist-90–Revised Higher PTSD arousal symptoms ↑ RPA; PTSD arousal and PTSD arousal by depression interaction accounted for 25% of the variance in RPA No
Yes. Discussed that the failure to find a link between higher depression symptoms and ↓ RPA may be due to a lack of depressed subjects without high arousal
Yes; addressed in main analysis with anxiety groups
Yes; addressed in main analysis with anxiety groups
Comorbid anxiety analysis
No
Yes, accounted for anxiety in depression analyses
Yes, no gender effects emerged Yes; addressed in main analysis with anxiety groups
Did examine them in frontal regions but unclear with regard to parietal Yes; anxiety symptoms did not moderate results
No
None examining gender × group to predict EEG
No
Hemisphere difference analysis
Gender analysis
AVG; T6-T5 and P4-P3 averaged together Depression Anxiety Stress Scales (21 item version) depression scale Depressed and comorbid anxiety-depression groups ↑ RPA than control group Yes
Cz, Nose; P4-P3, P8-P7
Reference scheme and parietal sites
85% right-handed Depressed M = 26.0 (SD = 3.2); Control M = 27.2 (SD = 5.8)
17–20 Right
Not reported
Right
Handedness
18–60
12–19
Age
Study 1: 19 low BDI women and 17 high BDI women; Study 2: 13 women with MDD and 14 controls
Reid et al. (1998)
AVG, Cz, LE; P4-P3
Right
Right
12–14
28–55
LE; P4-P3
LM; P4-P3, P8-P7
Age 3 and follow-up at age 5–6 73.2% right-handed
AVG, Cz, LM; P4-P3
Reference scheme and parietal sites
Cz; P4-P3
Handedness
Study 1: Low BDI group M = 19.1 (SD = 1.1) and High BDI group M = 17.9 (SD = 0.2); Study 2: MDD group M = 27.5 (SD = 8.1) and control group M = 27.6 (SD = 7.5) Right
Not reported Not reported
Age
No significant findings
DSM-III-R
MDD/SAD group ↓ BPA than controls
Positive Emotionality score based on tests in the Laboratory Temperament Assessment Battery Low PE group showed ↓ RPA than LPA
DSM-III-R High risk group associated with ↑ RPA (Cz)
BDI
Yes, no gender effects emerged
Yes, but did not mention gender differences in parietal asymmetry No
Reported for frontal, not parietal EEG analyses Yes
No
Yes
No
No
Hemisphere difference analysis
Gender analysis
based on ICD-10 and DSM-IV criteria
Study 1: BDI; Study 2: DSM-III-R Study 1: No significant findings; Study 2: MDD group ↓ RPA than control group (LM) No No
Primary assessment of depression
Parietal results summary
No
No
Neuroticism not related to asymmetry
No
No
Comorbid anxiety analysis
Note: MDD = major depressive disorder; LPA = left parietal activity; RPA = right parietal activity; BPA = bilateral parietal activity; BDI = Beck Depression Inventory; DSM = Diagnostic and Statistical Manual; ICD = International Classification of Diseases; STAI = State Trait Anxiety Inventory; SAD = seasonal affective disorder; AVG = average reference; LE = linked ears reference; LM = linked mastoids reference.
Schaffer et al. (1983) 6 high BDI scorers (4 female); 6 low BDI scorers (4 female)
Shankman et al. (2005) 12 low positive emotionality (PE; 58% female) and 17 high PE group (44% female)
Tomarken et al. (2004) 25 offspring of MDD mothers (14 female); 13 offspring of mothers with no MDD (6 female)
Volf & Passynkova (2002) 31 MDD/SAD patients (29 females); 30 matched controls
Sample
Citation
Table 1. (Contd.)
Completed BDI in Pre-Testing (N = 10,227)
Invited to Participate in Study Screening (N = 1904)
  Did Not Respond (N = 863)
  Excluded After Screening (N = 521): Epilepsy (n = 3); Unknown (n = 19); Did Not Schedule Interview (n = 65); Head Injury / LOC (n = 85); Psychotropic Medication (n = 104); Left-handedness (n = 245)
Invited for Interview (N = 520)
  Excluded After Interview (N = 197): No Longer Interested (n = 9); Psychotropic Medication (n = 11); Unknown (n = 14); Did Not Show for Interview (n = 15); Subsyndromal Past MDD and No Current MDD (n = 18); Did Not Meet BDI Criteria (n = 30); Head Injury / LOC (n = 33); Comorbid Axis I Diagnoses (n = 67)
    Anxiety Disorders: PTSD (n = 1); Social Phobia (n = 2); Panic Disorder (n = 3); Anxiety NOS (n = 4); Specific Phobia (n = 6); OCD (n = 7); GAD (n = 11)
    Substance Use: Dependence (n = 13); Abuse (n = 33)
    Psychotic Disorders: Psychotic NOS (n = 1); Schizophrenia (n = 1); Bipolar Disorder (n = 4)
    Eating Disorders: Eating NOS (n = 4); Bulimia (n = 7); Anorexia (n = 8)
    Other: Hypochondriasis (n = 3); ADHD (n = 5)
Eligible and Enrolled in Study (N = 323)
  Withdrew From Study Prior to EEG Recording (n = 10)
  Excluded for a diagnosis of Current Dysthymia without MDD (n = 7)
Final Sample For Analysis (N = 306)
Figure 1. Flowchart of participant screening and enrollment.
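The stage totals in Figure 1 can be cross-checked with simple arithmetic. The short Python sketch below is an illustration added for the reader and is not part of the original analysis; the variable names are hypothetical, and the counts are copied from the flowchart.

```python
# Participant flow counts copied from Figure 1 (variable names are illustrative).
invited_to_screening = 1904
did_not_respond = 863

excluded_after_screening = {
    "epilepsy": 3,
    "unknown": 19,
    "did_not_schedule_interview": 65,
    "head_injury_loc": 85,
    "psychotropic_medication": 104,
    "left_handedness": 245,
}
excluded_after_interview = {
    "no_longer_interested": 9,
    "psychotropic_medication": 11,
    "unknown": 14,
    "did_not_show": 15,
    "subsyndromal_past_mdd_no_current_mdd": 18,
    "did_not_meet_bdi_criteria": 30,
    "head_injury_loc": 33,
    "comorbid_axis_i": 67,
}
excluded_after_enrollment = {
    "withdrew_before_eeg": 10,
    "current_dysthymia_without_mdd": 7,
}

# Each stage total is the previous total minus the exclusions at that stage.
invited_for_interview = (
    invited_to_screening - did_not_respond - sum(excluded_after_screening.values())
)
enrolled = invited_for_interview - sum(excluded_after_interview.values())
final_sample = enrolled - sum(excluded_after_enrollment.values())

print(invited_for_interview, enrolled, final_sample)  # 520 323 306
```

The three computed totals agree with the interview, enrollment, and final-sample counts shown in the flowchart.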
possible pool of over 10,000 individuals on the basis of their scores on the Beck Depression Inventory (BDI; Beck, Ward, Mendelson, Mock, & Erbaugh, 1961), completed during pretesting in a large introductory psychology course or online after learning about the study from a flyer or referral source (see Figure 1 for a detailed flowchart summarizing study recruitment over a 4-year period). Individuals were selected to span the full range of depressive severity (from absent to full clinical levels, as well as ranges in between) and participated in a phone screening session, administered by a post-bachelor's project manager, that assessed preliminary inclusion and exclusion criteria. To be eligible, individuals were required to be strongly right-handed (a score greater than 35 on the 39-point scale of Chapman & Chapman, 1987) and to report no history of head injury with loss of consciousness greater than 10 min, concussion, epilepsy, or electroshock therapy, no current use of psychotropic medications, and no active suicidal potential necessitating immediate treatment (although participation in current psychotherapy was allowed). Those passing this brief phone screen were invited for an intake interview, administered by a trained graduate clinical rater. Individuals were enrolled in the study if the Structured Clinical Interview for DSM-IV (SCID; First, Spitzer, Gibbon, & Williams, 1997) indicated that they did not meet criteria for any DSM-IV Axis I disorder other than lifetime MDD and comorbid current dysthymia. Inter-rater reliability analyses (performed by clinical interviewers and the first and last authors) for a randomly selected 10% of SCIDs demonstrated inter-rater agreement of 96% (kappa = .81) for current MDD diagnoses and 96% (kappa = .91) for past MDD diagnoses.
The sample of individuals with lifetime MDD was moderately impaired: with data available from 129 of the 143 participants with lifetime MDD, the number of major depressive episodes averaged 3.2 (SD = 3.1); with data available from 44 of the 62 participants with current MDD, the approximate length of the current episode was 107 days (SD = 101 days). The lifetime MDD+ group was further separated into a current MDD+ group (consisting of all participants with current MDD, regardless of past MDD status) and a past MDD+ group (consisting of participants with past MDD but not current MDD or current dysthymia) to examine whether lifetime MDD results were actually due to current symptoms (indicating a state, not a trait, depression effect).¹ Symptoms of depression and anxious apprehension for all groups (see Table 2) were assessed with the Beck Depression Inventory II (BDI-II; Beck, Steer, & Brown, 1996), the 17-item Hamilton Rating Scale for Depression (HRSD; Hamilton, 1960; intra-class correlation of inter-rater agreement of .95 for 10% of randomly selected HRSD interviews in the present sample), and the Penn State Worry Questionnaire (PSWQ; Meyer, Miller, Metzger, & Borkovec, 1990). Internal consistency reliability for all questionnaires ranged from acceptable to high in the current sample (Cronbach's alpha = .90 for BDI-II, .84 for HRSD, .95 for PSWQ). In addition to measures of depression and anxiety, caffeine consumption was used as a proxy for current arousal. Participants indicated the recency of their caffeine intake by answering the question “When was the last time you consumed caffeine? 0 = I have not used any since my last visit, 1 = earlier this week, but not yesterday, 2 = yesterday before 5 pm, 3 = yesterday evening after 5 pm, 4 = today.” Caffeine intake ratings were averaged across sessions for each participant to obtain an index of general caffeine consumption.

To examine whether groups differed in depression, anxious apprehension, and arousal, univariate analyses of variance (ANOVAs) were computed with current MDD status (current MDD+, past MDD+, MDD–) and biological sex as between-subjects variables and each questionnaire score as the dependent variable. Effect size (Cohen's d) is reported for significant differences between groups. A main effect of current MDD status emerged for BDI-II (F(2,294) = 125.5, p < .001), HRSD (F(2,294) = 98.3, p < .001), and PSWQ (F(2,292) = 23.5, p < .001), indicating that (1) the current MDD+ group endorsed higher depression scores than the past MDD+ group (both p < .001; BDI-II d = 1.66 and HRSD d = 1.78), and (2) the current MDD+ and past MDD+ groups endorsed higher depression and anxious apprehension scores than the MDD– group (all p < .001; BDI-II d = 2.38 and .65, HRSD d = 1.69 and .42, and PSWQ d = .96 and .60, respectively). In addition, a main effect of sex emerged for BDI-II (F(1,294) = 4.2, p = .04, d = .26), HRSD (F(1,294) = 6.5, p = .01, d = .32), and PSWQ (F(1,292) = 11.6, p < .01, d = .43), indicating that women had higher symptom scores than men. No effects emerged for caffeine intake (p > .28).

Table 2. Group Demographics

MDD/BDI Status             Group                        BDI-II M (SE)   HRSD M (SE)   PSWQ M (SE)   CAFFEINE M (SE)
Lifetime MDD+ (n = 143)a   Current MDD+ men (n = 18)    22.2 (1.6)      13.3 (1.1)    52.4 (2.9)    2.5 (0.2)b
                           Current MDD+ women (n = 44)  24.5 (1.1)      16.4 (0.7)    60.5 (1.9)    2.2 (0.1)
                           Past MDD+ men (n = 20)       9.3 (1.5)       6.1 (1.1)     49.3 (2.8)    2.7 (0.3)
                           Past MDD+ women (n = 55)     12.3 (0.9)      8.0 (0.6)     54.3 (1.7)b   2.5 (0.2)
Lifetime MDD– (n = 163)    Men (n = 56)                 5.7 (0.9)       3.9 (0.6)     41.4 (1.7)    2.5 (0.2)
                           Women (n = 107)              6.2 (0.7)       4.0 (0.4)     45.9 (1.2)b   2.2 (0.1)

Note: MDD = Major Depressive Disorder; BDI-II = Beck Depression Inventory II; HRSD = Hamilton Rating Scale for Depression; PSWQ = Penn State Worry Questionnaire; CAFFEINE = ordinal scale of recency of use ranging from 0 (not since last visit) to 4 (today). a Six participants (1 male) with past MDD and current dysthymia are included in lifetime MDD group analyses but not current/past MDD+ group analyses. b One participant did not complete the scale.

¹ Within the lifetime MDD+ group (n = 143), the following diagnoses were met: 14 (5 male) for current MDD only, 75 (20 male) for past MDD only, 39 (10 male) for current MDD and past MDD, 2 (0 male) for current MDD and current dysthymia, 6 (1 male) for past MDD and current dysthymia, and 7 (3 male) for current MDD, past MDD, and current dysthymia. A total of six participants with diagnoses of past MDD and current dysthymia who were included in lifetime MDD analyses were excluded from current/past MDD analyses due to high levels of dysphoric DSM-IV symptomatology that did not meet criteria for a current MDD diagnosis.

EEG Data Collection and Reduction

Two resting EEG sessions were completed during each visit, on four separate days with no fewer than 24 hr between visits, and with all four visits completed within a 14-day period.² Each resting EEG session was recorded for 8 min in 1-min periods of eyes-open (O) and eyes-closed (C), in one of two counterbalanced orders (OCCOCOOC or COOCOCCO).
² Of the 21 participants who did not complete their sessions within a 2-week time frame, 15 completed all sessions within 16 days, whereas the remaining 6 completed all sessions within 18–20 days. In addition, 7 participants attended fewer than all four EEG assessment days (n = 4 three days, n = 1 two days, n = 2 one day), but these individuals were included in mixed linear model analyses that successfully accommodated missing data.

EEG data were collected continuously for each 8-min resting session with a 64-channel NeuroScan Synamps2 (Charlotte, NC) amplifier and acquisition system using Ag-AgCl electrodes, a 1000 Hz sampling rate, and a gain of 2816, with a bandpass from DC to 200 Hz prior to digitization. EEG data were acquired with an online reference site immediately posterior to Cz and subsequently re-referenced offline to four references: (1) average of all EEG leads (AVG), (2) current source density (CSD; using algorithms from Kayser & Tenke, 2006, and based on the spherical spline approach summarized by Perrin, Pernier, Bertrand, & Echallier, 1989, 1990), (3) Cz, and (4) averaged (“linked”) mastoids (LM). The international 10–20 system was utilized for electrode placement, and two electro-oculogram (EOG) channels (horizontal: outer canthi; vertical: superior and inferior orbit of the left eye) were collected for ocular artifact rejection. All impedances were kept under 10 kΩ. Before data reduction was implemented using custom scripts in Matlab (release 2007b, The Mathworks Inc., Natick, MA), resting EEG files were visually inspected to remove intervals contaminated with movement and muscle artifacts. EEG files were then epoched into 117 epochs of 2.048 s for each minute of data, overlapping by 1.5 s to compensate for the minimal weight applied to the ends of each epoch by the Hamming window function; only epochs that did not overlap segments rejected for artifacts were retained. Following epoching, a blink rejection algorithm rejected additional data segments where ocular activity exceeded ±75 microvolts in the vertical EOG, and an artifact rejection algorithm rejected segments with large, fast deviations in amplitude (e.g., spikes) and DC steps in any channel. Subsequently, a Fast Fourier Transform (FFT) was applied to all artifact-free epochs. All 2.048-s epochs were first baseline adjusted by removing the mean of all samples in the epoch, effectively removing the large DC component prior to the FFT.
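The epoching and spectral steps described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' Matlab scripts: the function name and the synthetic signal are hypothetical, artifact and blink rejection are omitted, and the step between epochs is derived directly from the stated 1.5-s overlap, so the number of epochs it yields per minute need not match the count reported in the text.

```python
import numpy as np

def epoch_power_spectra(signal, fs=1000, epoch_len=2048, overlap_s=1.5):
    """Return (freqs, power) for overlapping, Hamming-windowed epochs of a 1-D signal."""
    # Step between epoch onsets: 2048 - 1500 = 548 samples at 1000 Hz (assumed rounding).
    step = epoch_len - int(round(overlap_s * fs))
    starts = np.arange(0, len(signal) - epoch_len + 1, step)
    epochs = np.stack([signal[s:s + epoch_len] for s in starts])
    # Baseline adjust: subtract each epoch's mean to remove the DC component.
    epochs = epochs - epochs.mean(axis=1, keepdims=True)
    window = np.hamming(epoch_len)
    power = np.abs(np.fft.rfft(epochs * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(epoch_len, d=1.0 / fs)
    return freqs, power

# Synthetic 1-min "recording": a 10 Hz sinusoid plus noise (illustrative data only).
rng = np.random.default_rng(0)
t = np.arange(60 * 1000) / 1000.0
signal = np.sin(2 * np.pi * 10.0 * t) + 0.5 * rng.standard_normal(t.size)

freqs, power = epoch_power_spectra(signal)
mean_spectrum = power.mean(axis=0)         # average spectrum across epochs
peak_hz = freqs[np.argmax(mean_spectrum)]  # expected to fall near 10 Hz
```

Heavy overlap is used because the Hamming window down-weights samples near the epoch edges; overlapping epochs let those samples contribute near the center of a neighboring epoch.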
The power spectra from all artifact-free epochs across all 8 min were averaged to provide a summary spectrum for each resting session (range of artifact-free epochs per subject entered into the FFT for each single resting session = 44–931; lifetime MDD+ men: M = 489.0, SE = 9.3; lifetime MDD– men: M = 498.6, SE = 8.7; lifetime MDD+ women: M = 446.9, SE = 6.1; lifetime MDD– women: M = 444.3, SE = 5.3). Finally, total alpha power (8–13 Hz) was extracted from the spectrum, and an asymmetry score for each resting session was then calculated for each site by subtracting the natural-log-transformed scores (i.e., ln[Right] – ln[Left]) for each homologous left and right pair (FP1 & FP2, AF3 & AF4, F7 & F8, F5 & F6, F3 & F4, F1 & F2, FT7 & FT8, FC5 & FC6, FC3 & FC4, FC1 & FC2, T7 & T8, C5 & C6, C3 & C4, C1 & C2, TP7 & TP8, CP5 & CP6, CP3 & CP4, CP1 & CP2, P7 & P8, P5 & P6, P3 & P4, P1 & P2, PO7 & PO8, PO5 & PO6, PO3 & PO4, O1 & O2). Higher
Results

Lifetime MDD Status: EEG Asymmetry Analysis

To examine the relationship between lifetime MDD status and parietal EEG asymmetry, a full factorial mixed linear model (SAS 9.2, Cary, NC) was performed. Lifetime MDD status (past and/or current MDD = lifetime MDD+, never depressed = lifetime MDD–) and biological sex (male, female) were between-subjects variables, whereas reference (4: AVG, CSD, Cz, and LM) and channel (4: P2-P1, P4-P3, P6-P5, P8-P7) were within-subjects variables. EEG asymmetry score based on total 8–13 Hz alpha power was the dependent variable. Effects of interest based on prior work (e.g., Bruder et al., 1997; Stewart et al., in press) were (1) a main effect of lifetime MDD and (2) a lifetime MDD by sex interaction. Effect size (Cohen's d) is reported for significant differences between MDD+ and MDD– groups.

Results revealed several effects that were not of primary interest; these are reported first, followed by a fuller description of the effects involving MDD status. Main effects of reference (F(3,906) = 60.7, p < .001) and channel (F(3,906) = 157.4, p < .001) were qualified by a reference by channel interaction (F(9,2718) = 7.1, p < .001). Most importantly, the main effects of lifetime MDD (F(1,302) = 37.3, p < .001) and sex (F(1,302) = 23.9, p < .001) were qualified by a lifetime MDD by sex interaction (F(1,302) = 51.1, p < .001), and follow-up linear mixed models for each sex separately (see Figure 2) indicated that lifetime MDD+ men (n = 39) displayed relatively greater right parietal activity than lifetime MDD– men (n = 56) (p < .001, d = 1.66), whereas lifetime MDD+ women (n = 104) did not differ from lifetime MDD– women (n = 107) (p > .34). These effects were not moderated by channel pair (p > .08) or reference (p > .91).

³ Resting EEG asymmetry results for frontal regions alone were previously reported in Stewart et al. (in press) and therefore were not included in this article.
[Figure 2 plot. Y-axis: ln(R)-ln(L) Total Alpha Power (ticks 0.1 to 0.4); x-axis: Women, Men; legend: Lifetime MDD+, Lifetime MDD–.]
Figure 2. Parietal alpha asymmetry scores (8–13 Hz) as a function of lifetime MDD status and sex collapsed across channel and reference. Higher values on the asymmetry score putatively reflect greater relative left or less relative right activity. Error bars reflect standard error.
Current MDD Status

A follow-up full factorial linear mixed model was run to examine whether parietal EEG asymmetry results for lifetime MDD and sex differed as a function of current versus past MDD status. Current MDD status (current MDD+ = all participants with current MDD, regardless of past MDD status; past MDD+ = participants with past MDD but not current MDD or dysthymia; MDD– = participants without current or past MDD or dysthymia) and sex were between-subjects variables. Within-subjects variables were reference and channel. The interaction of interest was the current MDD status by sex interaction. Results (see Figure 3) indicated that a current MDD status by sex interaction emerged (F(2,294) = 42.4, p < .001), and follow-up mixed models performed for each sex separately demonstrated that current MDD+ men (n = 18) and past MDD+ men (n = 20) displayed relatively greater right parietal activity than MDD– men (n = 56) (both p < .001; d = 1.18 and 2.18, respectively). In addition, past MDD+ men showed relatively greater right parietal activity than current MDD+ men (p < .001,
asymmetry score values are thought to reflect relatively greater left than right parietal activity (i.e., relatively greater right than left alpha; cf. Allen et al., 2004). The present study frames results in terms of right, not left, parietal activity because previous research points to right parietal dysfunction in depressed individuals (e.g., Heller & Nitschke, 1997, 1998); lower scores are thus reflective of relatively greater right activity. Asymmetry scores for the eight resting sessions (two resting sessions within each day) were then averaged together to create a trait measurement of regional brain activity. Separate asymmetry scores for each of the four reference montages were utilized in analyses, resulting in four asymmetry scores per participant at each homologous pair. Although asymmetry scores were computed for all homologous pairs of channels, analyses for the present study were performed on a specific subset of those pairs (P2-P1, P4-P3, P6-P5, P8-P7), comprising a pair commonly studied in the parietal asymmetry literature (P4-P3; e.g., Bruder et al., 1997; Shankman et al., 2005) as well as pairs of channels that neighbor P4-P3, to add specificity to the nature of parietal asymmetry as a function of lifetime MDD status.³ Intraclass correlations indicated that parietal EEG asymmetry scores were highly stable across the eight resting sessions for each reference montage (AVG range = .80–.86 across parietal pairs; CSD range = .74–.84; Cz range = .82–.87; LM range = .74–.84).
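The asymmetry score itself is a simple computation, sketched below in Python with NumPy. This is an illustration only: the function names and the alpha power values are hypothetical, and summing spectral bins from 8 to 13 Hz is an assumed reading of "total alpha power" rather than a confirmed detail of the original scripts.

```python
import numpy as np

def band_power(freqs, spectrum, low=8.0, high=13.0):
    """Sum spectral power over the bins from low to high Hz, inclusive (assumed definition)."""
    mask = (freqs >= low) & (freqs <= high)
    return spectrum[mask].sum()

def asymmetry_score(right_power, left_power):
    """ln(Right) - ln(Left); higher values mean relatively more right-hemisphere alpha."""
    return np.log(right_power) - np.log(left_power)

# Hypothetical alpha power at P4 (right) and P3 (left) for eight resting sessions.
p4_alpha = np.array([4.2, 3.9, 4.5, 4.1, 4.0, 4.3, 3.8, 4.4])
p3_alpha = np.array([3.6, 3.7, 3.9, 3.5, 3.8, 3.6, 3.4, 3.9])

session_scores = asymmetry_score(p4_alpha, p3_alpha)  # one score per session
trait_score = session_scores.mean()                   # average across the eight sessions
```

Because alpha power is taken as inversely related to cortical activity, a positive score in this sketch (more right than left alpha) would be read as relatively less right parietal activity, and a lower score as relatively greater right parietal activity.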
[Figure 3 plot. Y-axis: ln(R)-ln(L) Total Alpha Power (ticks 0.1 to 0.4); x-axis: Women, Men; legend: Current MDD+, Past MDD+, MDD–.]
Figure 3. Parietal alpha asymmetry scores (8–13 Hz) as a function of current MDD status and sex across channel and reference. Higher values on the asymmetry score putatively reflect greater relative left or less relative right activity. Error bars reflect standard error.
Current symptomatology. Two types of mixed model approaches were performed for each of three questionnaire measures: BDI-II intake, HRSD, and PSWQ. The first approach was designed to assess whether current symptomatology was in fact responsible for the current MDD status by sex EEG asymmetry effects observed, which would suggest that EEG asymmetry is sensitive to state levels of depression and anxiety rather than being a trait indicator of risk for depression. Thus, hierarchical linear mixed models using Type 1 (rather than Type 3) sums of squares were run wherein reference, channel, and sex were entered first, followed by one questionnaire (z-scored) and a questionnaire by sex interaction (z-scored). Subsequently, current MDD status and the current MDD status by sex interaction were added to the model. If EEG asymmetry results are not due to current symptomatology, the current MDD status by sex interaction should remain significant. Parietal EEG asymmetry score was the dependent variable. Results indicated that the current MDD by sex interaction remained significant in all questionnaire analyses (BDI-II intake: F(2,292) = 39.1, p < .001; HRSD: F(2,292) = 40.1, p < .001; PSWQ: F(2,290) = 46.7, p < .001), suggesting that current symptomatology cannot account for MDD asymmetry findings in women or men. In addition, BDI-II by sex (p > .54), HRSD by sex (p > .63), and PSWQ by sex (p > .40) interactions did not emerge, indicating that sex differences on these questionnaires do not meaningfully influence parietal asymmetry or mirror results for the observed current MDD by sex interaction. The second mixed model approach was an attempt to replicate prior null results in the literature using dimensional psychopathology measures (not DSM-IV categories) to predict EEG asymmetry with Type 3 sums of squares. Each questionnaire was entered in its own full factorial mixed model with reference and channel as repeated factors and sex as the between-subjects factor.
No effects emerged for BDI-II intake, BDI-II intake by sex, HRSD, HRSD by sex, PSWQ, or PSWQ by sex (all p > .39). Post-hoc examination of moderators of EEG asymmetry. The relationship between an index of current arousal (caffeine intake) and parietal EEG asymmetry was explored as a function of current MDD status to determine whether it might differentially moderate patterns of parietal asymmetry in men and women and explain two effects not previously reported in the EEG asymmetry literature; namely, (a) why current and past MDD+ men may exhibit relatively higher right parietal activity than MDD– men, and (b) why current MDD+ women might display relatively higher right parietal activity than past MDD+ and MDD– women. A full factorial mixed model was run with current MDD and sex as between-subjects factors, reference and channel as within-subjects factors, and caffeine intake as the covariate. Parietal EEG asymmetry was the dependent variable. The
Follow-Up Analysis
d = .99). Women demonstrated a different pattern than men, wherein past MDD+ women (n = 55) displayed relatively less right parietal activity than current MDD+ women (n = 44) (p < .001, d = 1.45) and MDD– women (n = 107; p < .001, d = .63), and current MDD+ women exhibited relatively greater right parietal activity than MDD– women (p < .001, d = .82). These findings explain why no significant asymmetry effects for lifetime MDD status were found for women: current MDD+ and past MDD+ groups show opposing patterns of parietal asymmetry that cancel each other out.
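[Figure 4 plot. Two panels: Women (top) and Men (bottom). Y-axis: ln(R)-ln(L) Total Alpha Power (ticks 0 to 0.6); x-axis: Current MDD+, Past MDD+, MDD–; legend: Low Caffeine, High Caffeine.]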
Figure 4. Parietal alpha asymmetry scores (8–13 Hz) as a function of current MDD status and caffeine intake averaged across sessions (illustrated by plotting estimated means ± 1 standard deviation) for women (top panel) and men (lower panel) collapsed across channel and reference. Higher values on the asymmetry score putatively reflect greater relative left or less relative right activity. Error bars reflect standard error.
effect of interest for each model was the current MDD by sex by caffeine intake interaction. A current MDD by sex by caffeine interaction emerged (F(2,287) = 6.0, p < .01), and follow-up mixed models run for each sex separately (see Figure 4) indicated that a current MDD by caffeine interaction was significant for men (F(2,87) = 15.9, p < .001), showing that at low levels of caffeine intake, current MDD+ men, past MDD+ men, and MDD– men did not differ in parietal asymmetry (all p > .07), but at high levels of caffeine intake, current MDD+ men and past MDD+ men exhibited relatively greater right parietal activity than MDD– men (both p < .001; d = 1.21 and d = 2.62, respectively). In addition, past MDD+ men displayed relatively greater right parietal activity than current MDD+ men (p < .001; d = 1.17). These results suggest that the asymmetry findings for men presented in the main analyses were only apparent at higher levels of caffeine intake and, by inference, arousal. In addition, a current MDD by caffeine interaction was significant for women (F(2,200) = 20.8, p < .001), demonstrating that, at low levels of caffeine intake, current MDD+ women and MDD– women displayed relatively greater right parietal activity than past MDD+ women (p = .02, d = .49, and p < .001, d = .75, respectively). At high levels of caffeine intake, current MDD+ women still displayed relatively greater right parietal activity than past MDD+ women (p < .001; d = 1.47) but now also exhibited relatively higher right parietal activity than MDD– women (p < .001; d = 1.35). Most importantly, Figure 4 illustrates that, whereas caffeine intake did not moderate parietal asymmetry for past MDD+ women or MDD– women, it did moderate asymmetry for current MDD+ women,
such that higher caffeine intake was linked to higher relative right parietal activity. These results account for asymmetry differences between current MDD+ and MDD– women presented in the main analyses, and these findings also partially explain initial differences between current MDD+ and past MDD+ women. Although current MDD+ women displayed relatively greater right parietal activity than past MDD+ women even at low levels of caffeine, the effect size between groups became much larger at high levels of caffeine (d = .49 compared to d = 1.47).
Discussion

The present study examined regional parietal brain activity in a large sample of depressed and non-depressed individuals without comorbid anxiety disorders in order to determine whether parietal EEG asymmetry, a potential endophenotype of MDD, is moderated by sex. Patterns of parietal EEG asymmetry were indeed different between men and women with and without MDD. Results indicated that, although lifetime MDD+ women did not differ from lifetime MDD– women, this null finding was due to opposing patterns of parietal asymmetry for current MDD+ and past MDD+ women. Past MDD+ women displayed relatively less right parietal activity than MDD– women, a pattern of asymmetry consistent with other parietal EEG studies of depression (e.g., Blackhart et al., 2006; Bruder et al., 1997; Kentgen et al., 2000), presenting with a medium effect size similar to those previously demonstrated in the literature (e.g., Bruder et al., 1997; Reid et al., 1998). Although current MDD+ women exhibited higher relative right parietal activity than past MDD+ women, an unexpected finding not previously described in the literature, this effect was partially moderated by caffeine intake, such that the parietal asymmetry difference between past MDD+ women and current MDD+ women was larger at high than at low levels of recent caffeine consumption. Current MDD+ women did not differ from past MDD+ women on levels of caffeine intake, however, indicating that a higher amount of caffeine in the current MDD+ group was not responsible for this finding. Because prior research indicates that currently depressed patients report more anxiety than non-depressed individuals at similar levels of caffeine ingestion (Lee, Flegel, Greden, & Cameron, 1988), it could be that current MDD+ women had higher levels of anxious arousal (symptoms of panic) associated with caffeine than past MDD+ and MDD–
women, reflected in higher relative right parietal activity, although additional research is needed to address this hypothesis. In summary, results for women suggest that caffeine may affect arousal processes differently as a function of current depression severity to obfuscate the underlying risk pattern for MDD, and that future work on MDD and parietal asymmetry might utilize multiple measures of arousal sensitivity to explore this possibility.

Unlike lifetime MDD results for women, lifetime MDD+ men displayed higher relative right parietal activity than lifetime MDD− men, and this large effect size was replicated in analyses of current MDD+ and past MDD+ men, who also displayed relatively greater right parietal activity than MDD− men. Recent caffeine consumption moderated the relationship between MDD status and parietal asymmetry in men, wherein current and past MDD+ men displayed relatively higher right parietal activity than MDD− men at high but not low levels of recent caffeine intake. These large parietal EEG asymmetry differences
in men are new findings not previously discussed in the literature, but these results may explain null findings in parietal EEG asymmetry studies that did not examine sex differences in depression. Due to the limited research on sex differences and EEG asymmetry in individuals with MDD (thus far, only in frontal regions: Stewart et al., in press, using the present sample, and Miller et al., 2002, using a substantial male sample), further examination is needed to evaluate the significance of parietal asymmetry in men. Unlike parietal EEG asymmetry results for DSM-IV–defined depression categories, findings for dimensional measures of current depression symptoms (BDI-II and HRSD) were non-significant, replicating other studies finding null results with depression scales (e.g., Diego et al., 2001; Harmon-Jones et al., 2002; Nitschke et al., 1999; Reid et al., 1998), suggesting that parietal EEG asymmetry could be linked to a more enduring trait-like factor of past depression, as parietal results for women indicate. In addition, no relationship between anxious apprehension (PSWQ) and parietal asymmetry was found, replicating research finding no hemispheric differences associated with worry at rest (Nitschke et al., 1999).

Implications of Sex Differences

Relatively higher right than left parietal EEG activity is thought to reflect higher levels of emotional arousal. Since men with current and/or past depression in the present study exhibited relatively higher right parietal EEG activity (and women with past depression showed the opposite pattern), results of the present study suggest that depressed men, regardless of current depression status, might possess higher levels of anxious arousal than depressed women. Prior research, however, demonstrates the opposite pattern.
First, depressed women have a higher prevalence of panic attacks than depressed men, and women with depression also report higher incidence of palpitations and tremor/shaking, consistent with panic disorder/attacks (Angst et al., 2002). Second, women have higher comorbidity of depression and anxiety disorders than men (e.g., Breslau, Schultz, & Peterson, 1995; Howell, Brawman-Mintzer, Monnier, & Yonkers, 2001). Complicating the clinical picture further is the assertion that two types of anxiety, anxious apprehension and anxious arousal, are associated with opposing patterns of EEG asymmetry, with anxious apprehension linked to asymmetry in favor of relatively greater left hemisphere activity, and anxious arousal associated with asymmetry in favor of relatively greater right hemisphere activity (e.g., Heller & Nitschke, 1998). Although present findings indicate that sex differences in anxious apprehension did not account for parietal EEG asymmetry differences between depressed men and women, the present study did not include a measure of anxious arousal, nor did it include depressed individuals with comorbid anxiety disorders, so it must remain an empirical question as to whether higher incidence of particular types of anxiety/anxiety disorders in depressed women and men influences differential patterns of parietal EEG asymmetry.

Parietal EEG Asymmetry: An Endophenotype for Depression?

To be an endophenotype for depression, parietal EEG asymmetry should be specific to depression and present as a trait-like feature, independent of state factors (Gottesman & Gould, 2003; Iacono, 1998). Results of the present study do not produce strong evidence that reduced right parietal EEG activity is a risk marker for depression, since parietal EEG asymmetry did not function as a trait independent of state factors. Although women
with a past history of MDD displayed lower relative right parietal activity regardless of current arousal level (as indexed by recent caffeine consumption), women with current MDD and men with current and/or past MDD showed higher, not lower, relative right parietal activity, suggestive of physiological hyperarousal, not underarousal. In addition, the fact that depressed men differed from non-depressed men only at high but not low levels of recent caffeine intake (current arousal) indicates that higher right parietal EEG activity is dependent on state arousal. Reduced right parietal EEG activity, however, may be a marker for features associated with MDD pertaining to a subtype of anxiety or underarousal, warranting further explication and examination. Since a measure of anxious arousal was not included in the present study, it is possible that, even though no participants met criteria for DSM-IV anxiety disorders, anxious arousal symptoms could, alone or in conjunction with anxious apprehension, moderate results for depressed participants, such that individuals with lifetime MDD and low arousal might demonstrate relatively less right parietal activation.

Strengths, Limitations, and Synopsis

The design of the present study was beneficial for examining the relationship between MDD status and parietal EEG asymmetry due to the recruitment of a substantial sample of medically healthy, medication-free men and women with no comorbid anxiety disorders. In addition, multiple sessions of EEG recording provided a reliable estimate of trait asymmetry effects that were consistent across medial and lateral regions of the parietal cortex. Parietal EEG results were also highly consistent across all four reference derivations, supporting the assertion that references should possess similar signal-to-noise ratios in posterior brain regions where EEG alpha activity is strongest (Hagemann et al., 2001).
The present study is the first to examine parietal EEG asymmetry differences in a large sample of men and women with current versus past MDD to attempt to disentangle state versus trait MDD effects. The divergent patterns of EEG asymmetry for current MDD versus past MDD, particularly in women, suggest that relatively less right parietal activity at rest may actually be an enduring risk factor for depression (characterizing those with past MDD) but not an indicator of current severity (as it does not characterize those with current MDD). A limitation of this study is the failure to specifically measure symptoms of anxious arousal, which could potentially moderate the relationship between current MDD status and EEG asymmetry in women, since higher arousal symptoms are linked to relatively greater right parietal activity in women and men with current MDD. Additional limitations include a younger cohort that was not actively seeking treatment for depression, suggesting that these findings cannot be assumed to apply to later-onset depression or to severe cases of depression in individuals receiving inpatient or outpatient treatment. Our early-onset sample, however, was moderately depressed, as indicated by number of major depressive episodes experienced and length of current depressive episodes, and thus might be expected to have a recurrent or chronic course of depression (cf. findings with chronic depression, Klein et al., 1999), potentially generalizing to depression later in life. Although results of the current study are most generalizable to a young, medication-free sample without comorbid Axis I disorders, the fact that an index of arousal moderated patterns of parietal asymmetry in depressed men and women suggests that the present results may extend to depressed individuals with comorbid conditions associated with anxiety, which comprise a large percentage of the MDD population (e.g., Kessler et al., 2003).
In the largest study of parietal EEG asymmetry of MDD to date, men and women exhibited differential patterns of regional brain activity as a function of current and past depression status across four EEG reference montages, indicating that (a) parietal EEG asymmetry differences in MDD are robust, and (b) future studies of parietal brain asymmetry and risk for depression must take sex differences into consideration. In addition, the strength of an arousal index (recent caffeine consumption) as a moderator of parietal asymmetry in men and women indicates that comorbidity of depression and anxiety symptoms may be important in the study of endophenotypic markers of depression vulnerability.
REFERENCES Allen, J. J. B., Iacono, W. G., Depue, R. A., & Arbisi, P. (1993). Regional electroencephalographic asymmetries in bipolar seasonal affective disorder before and after exposure to bright light. Biological Psychiatry, 33, 642–646. Allen, J. J. B., Coan, J. A., & Nazarian, M. (2004). Issues and assumptions on the road from raw signals to metrics of frontal EEG asymmetry in emotion. Biological Psychology, 67, 183–218. Angst, J., Gamma, A., Gastpar, M., Lepine, J. P., Mendlewicz, J., & Tylee, A. (2002). Gender differences in depression. European Archives of Psychiatry and Clinical Neuroscience, 252, 201–209. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). The Beck Depression Inventory–II. San Antonio: Harcourt Assessment. Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J. (1961). An inventory for measuring depression. Archives of General Psychiatry, 4, 561–571. Blackhart, G. C., Minnix, J. A., & Kline, J. P. (2006). Can EEG asymmetry patterns predict future development of anxiety and depression?: A preliminary study. Biological Psychology, 72, 46–50. Breslau, N., Schultz, L., & Peterson, E. (1995). Sex differences in depression: A role for preexisting anxiety. Psychiatry Research, 58, 1–12. Bruder, G. E. (2003). Frontal and parietotemporal asymmetries in depressive disorders: Behavioral, electrophysiologic and neuroimaging findings. In K. Hugdahl & R. J. Davidson (Eds.), The asymmetrical brain (pp. 719–742). Cambridge, MA: MIT Press.
Bruder, G. E., Fong, R., Tenke, C. E., Leite, P., Towey, J. P., Stewart, J. E., & Quitkin, F. M. (1997). Regional brain asymmetries in major depression with or without an anxiety disorder: A quantitative electroencephalographic study. Biological Psychiatry, 41, 939–948. Bruder, G. E., Kayser, J., Tenke, C. E., Leite, P., Schneier, F. R., Stewart, J. W., & Quitkin, F. M. (2002). Cognitive ERPs in depressive and anxiety disorders during tonal and phonetic oddball tasks. Clinical Electroencephalography, 33, 119–124. Bruder, G. E., Tenke, C. E., Stewart, J. W., Towey, J. P., Leite, P., Voglmaier, M., & Quitkin, F. M. (1995). Brain event-related potentials to complex tones in depressed patients: Relations to perceptual asymmetry and clinical features. Psychophysiology, 32, 373–381. Bruder, G. E., Tenke, C. E., Warner, V., Nomura, Y., Grillon, C., Hille, J., & Weissman, M. M. (2005). Electroencephalographic measures of regional hemispheric activity in offspring at risk for depressive disorders. Biological Psychiatry, 57, 328–335. Bruder, G. E., Tenke, C. E., Warner, V., & Weissman, M. M. (2007). Grandchildren at high and low risk for depression differ in EEG measures of regional brain asymmetry. Biological Psychiatry, 62, 1317–1323. Bruder, G. E., Wexler, B. E., Stewart, J. W., Price, L. H., & Quitkin, F. M. (1999). Perceptual asymmetry differences between major depression with or without a comorbid anxiety disorder: A dichotic listening study. Journal of Abnormal Psychology, 108, 233–239.
Chapman, L. J., & Chapman, J. P. (1987). The measurement of handedness. Brain and Cognition, 6, 175–183. Davidson, R. J. (1998). Anterior electrophysiological asymmetries, emotion, and depression: Conceptual and methodological conundrums. Psychophysiology, 35, 607–614. Dawson, G., Frey, K., Panagiotides, H., Osterling, J., & Hessl, D. (1997). Infants of depressed mothers exhibit atypical frontal brain activity: A replication and extension of previous findings. Journal of Child Psychology and Psychiatry, 38, 179–186. Debener, S., Beauducel, A., Nessler, D., Brocke, B., Heilemann, H., & Kayser, J. (2000). Is resting anterior EEG alpha asymmetry a trait marker for depression? Neuropsychobiology, 41, 31–37. Deldin, P. J., Keller, J., Gergen, J. A., & Miller, G. A. (2000). Right-posterior face processing anomaly in depression. Journal of Abnormal Psychology, 109, 116–121. Deslandes, A. C., de Moraes, H., Pompeu, F. A. M. S., Ribeiro, P., Cagy, M., Capitão, C., & Laks, J. (2008). Electroencephalographic frontal asymmetry and depressive symptoms in the elderly. Biological Psychology, 79, 317–322. Diego, M. A., Field, T., & Hernandez-Reif, M. (2001). CES-D depression scores are correlated with frontal EEG alpha asymmetry. Depression and Anxiety, 13, 32–37. Diego, M. A., Field, T., Hernandez-Reif, M., Cullen, C., Schanberg, S., & Kuhn, C. (2004). Prepartum, postpartum, and chronic depression effects on newborns. Psychiatry, 67, 63–80. Field, T., Fox, N. A., Pickens, J., & Nawrocki, T. (1995). Relative right frontal EEG activation in 3- to 6-month-old infants of “depressed” mothers. Developmental Psychology, 31, 358–363. First, M. G., Spitzer, R. L., Gibbon, M., & Williams, J. B. (1997). Structured clinical interview for DSM-IV Axis I disorder–clinical version, administration booklet. New York, NY: Biometrics Research Department. Gottesman, I. I., & Gould, T. D. (2003). The endophenotypic concept in psychiatry: Etymology and strategic intentions.
American Journal of Psychiatry, 160, 636–645. Graae, F., Tenke, C., Bruder, G., Rotheram, M., Piacentini, J., & Castro-Blanco, D. (1996). Abnormality of EEG alpha asymmetry in female adolescent suicide attempters. Biological Psychiatry, 40, 706–713. Hagemann, D., Naumann, E., & Thayer, J. F. (2001). The quest for the EEG reference revisited: A glance from brain asymmetry research. Psychophysiology, 38, 847–857. Hagemann, D., Naumann, E., Thayer, J. F., & Bartussek, D. (2002). Does resting electroencephalograph asymmetry reflect a trait? An application of latent state-trait theory. Journal of Personality and Social Psychology, 82, 619–641. Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56–62. Harmon-Jones, E., Abramson, L. Y., Sigelman, J., Bohlig, A., Hogan, M. E., & Harmon-Jones, C. (2002). Proneness to hypomania/mania symptoms or depression symptoms and asymmetrical frontal cortical responses to an anger-evoking event. Journal of Personality and Social Psychology, 82, 610–618. Hasler, G., Drevets, W. C., Manji, H. K., & Charney, D. S. (2004). Discovering endophenotypes for major depression. Neuropsychopharmacology, 29, 1765–1781. Hayden, E. P., Shankman, S. A., Olino, T. M., Durbin, C. E., Tenke, C. E., Bruder, G. E., & Klein, D. N. (2008). Cognitive and temperamental vulnerability to depression: Longitudinal associations with regional cortical activity. Cognition and Emotion, 22, 1415–1428. Heller, W. (1993). Neuropsychological mechanisms of individual differences in emotion, personality, and arousal. Neuropsychology, 7, 476–489. Heller, W., Etienne, M. A., & Miller, G. A. (1995). Patterns of perceptual asymmetry in depression and anxiety: Implications for neuropsychological models of emotion and psychopathology. Journal of Abnormal Psychology, 104, 327–333. Heller, W., & Nitschke, J. B. (1997). Regional brain activity in emotion: A framework for understanding cognition in depression. 
Cognition and Emotion, 11, 637–661. Heller, W., & Nitschke, J. B. (1998). The puzzle of regional brain activity in depression and anxiety: The importance of subtypes and comorbidity. Cognition & Emotion, 12, 421–447.
Heller, W., Nitschke, J. B., Etienne, M. A., & Miller, G. A. (1997). Patterns of regional brain activity differentiate types of anxiety. Journal of Abnormal Psychology, 106, 376–385. Henriques, J. B., & Davidson, R. J. (1990). Regional brain electrical asymmetries discriminate between previously depressed and healthy control subjects. Journal of Abnormal Psychology, 99, 22–31. Henriques, J. B., & Davidson, R. J. (1991). Left frontal hypoactivation in depression. Journal of Abnormal Psychology, 100, 535–545. Henriques, J. B., & Davidson, R. J. (1997). Brain electrical asymmetries during cognitive task performance in depressed and non-depressed subjects. Biological Psychiatry, 42, 1039–1050. Howell, H. B., Brawman-Mintzer, O., Monnier, J., & Yonkers, K. A. (2001). Generalized anxiety disorder in women. Psychiatric Clinics of North America, 24, 165–178. Iacono, W. G. (1998). Identifying psychophysiological risk for psychopathology: Examples from substance abuse and schizophrenia research. Psychophysiology, 35, 621–637. Jones, N. A., Field, T., Fox, N. A., Davalos, M., Lundy, B., & Hart, S. (1998). Newborns of mothers with depressive symptoms are physiologically less developed. Infant Behavior and Development, 21, 537–541. Jones, N. A., Field, T., Fox, N. A., Lundy, B., & Davalos, M. (1997). EEG activation in 1-month-old infants of depressed mothers. Development and Psychopathology, 9, 491–505. Kayser, J., Bruder, G. E., Tenke, C. E., Stewart, J. W., & Quitkin, F. M. (2000). Event-related potentials (ERPs) to hemifield presentations of emotional stimuli: Differences between depressed patients and healthy adults in P3 amplitude and asymmetry. International Journal of Psychophysiology, 36, 211–236. Kayser, J., & Tenke, C. E. (2006). Principal components analysis of Laplacian waveforms as a generic method for identifying ERP generator patterns: I. Evaluation with auditory oddball tasks. Clinical Neurophysiology, 117, 348–368. Keller, J., Nitschke, J.
B., Bhargava, T., Deldin, P. J., Gergen, J. A., Miller, G. A., & Heller, W. (2000). Neuropsychological differentiation of depression and anxiety. Journal of Abnormal Psychology, 109, 3–10. Kentgen, L. M., Tenke, C. E., Pine, D. S., Fong, R., Klein, R. G., & Bruder, G. E. (2000). Electroencephalographic asymmetries in adolescents with major depression: Influence of comorbidity with anxiety disorders. Journal of Abnormal Psychology, 109, 797–802. Kessler, R. C., Berglund, P., Demler, O., Jin, R., Koretz, D., Merikangas, K. R., & Wang, P. S. (2003). The epidemiology of major depressive disorder: Results from the National Comorbidity Survey Replication (NCS-R). Journal of the American Medical Association, 289, 3095–3105. Klein, D. N., Schatzberg, A. F., McCullough, J. P., Dowling, F., Goodman, D., Howland, R. H., & Keller, M. B. (1999). Age of onset in chronic major depression: Relation to demographic and clinical variables, family history, and treatment response. Journal of Affective Disorders, 55, 149–157. Lee, M. A., Flegel, P., Greden, J. F., & Cameron, O. G. (1988). Anxiogenic effects of caffeine on panic and depressed patients. American Journal of Psychiatry, 145, 632–635. Liotti, M., & Mayberg, H. S. (2001). The role of functional neuroimaging in the neuropsychology of depression. Journal of Clinical and Experimental Neuropsychology, 23, 121–136. Mathersul, D., Williams, L. M., Hopkinson, P. J., & Kemp, A. H. (2008). Investigating models of affect: Relationships among EEG alpha asymmetry, depression, and anxiety. Emotion, 8, 560–572. Mayberg, H. S. (1997). Limbic-cortical dysregulation: A proposed model of depression. Journal of Neuropsychiatry and Clinical Neuroscience, 9, 471–481. Mayberg, H. S. (2003). Modulating dysfunctional limbic-cortical circuits in depression: Towards development of brain-based algorithms for diagnosis and optimised treatment. British Medical Bulletin, 65, 193–207. Metzger, L. J., Paige, S. R., Carson, M. A., Lasko, N. B., Paulus, L. 
A., Pitman, R. K., & Orr, S. P. (2004). PTSD arousal and depression symptoms associated with increased right-sided parietal EEG asymmetry. Journal of Abnormal Psychology, 113, 324–329. Meyer, T. J., Miller, M. L., Metzger, R. L., & Borkovec, T. D. (1990). Development and validation of the Penn State Worry Questionnaire. Behaviour Research and Therapy, 28, 487–495.
Miller, A., Fox, N. A., Cohn, J. F., Forbes, E. E., Sherrill, J. T., & Kovacs, M. (2002). Regional patterns of brain activity in adults with a history of childhood-onset depression: Gender differences and clinical variability. American Journal of Psychiatry, 159, 934–940. Nitschke, J. B., Heller, W., Palmieri, P. A., & Miller, G. A. (1999). Contrasting patterns of brain activity in anxious apprehension and anxious arousal. Psychophysiology, 36, 628–637. Nitschke, J. B., Heller, W., Imig, J. C., McDonald, R. P., & Miller, G. A. (2001). Distinguishing dimensions of anxiety and depression. Cognitive Therapy and Research, 25, 1–22. Perrin, F., Pernier, J., Bertrand, O., & Echallier, J. F. (1989). Spherical splines for scalp potential and current density mapping. Electroencephalography and Clinical Neurophysiology, 72, 184–187. Perrin, F., Pernier, J., Bertrand, O., & Echallier, J. F. (1990). Corrigenda. Electroencephalography and Clinical Neurophysiology, 76, 565–566. Pössel, P., Lo, H., Fritz, A., & Seeman, S. (2008). A longitudinal study of cortical EEG activity in adolescents. Biological Psychology, 78, 173–178. Rabe, S., Debener, S., Brocke, B., & Beauducel, A. (2005). Depression and its relation to posterior cortical activity during performance of neuropsychological verbal and spatial tasks. Personality and Individual Differences, 39, 601–611. Reid, S. A., Duke, L. M., & Allen, J. J. B. (1998). Resting frontal electroencephalographic asymmetry in depression: Inconsistencies suggest the need to identify mediating factors. Psychophysiology, 35, 389–404.
Schaffer, C. E., Davidson, R. J., & Saron, C. (1983). Frontal and parietal electroencephalogram asymmetry in depressed and nondepressed subjects. Biological Psychiatry, 18, 753–762. Shankman, S. A., Tenke, C. E., Bruder, G. E., Durbin, C. E., Hayden, E. P., & Klein, D. N. (2005). Low positive emotionality in young children: Association with EEG asymmetry. Development and Psychopathology, 17, 85–98. Stewart, J. L., Bismark, A. W., Towers, D. N., Coan, J. A., & Allen, J. J. B. (in press). Resting frontal EEG asymmetry as an endophenotype for depression risk: Sex-specific patterns of frontal asymmetry. Journal of Abnormal Psychology (submitted). Tomarken, A. J., Dichter, G. S., Garber, J., & Simien, C. (2004). Resting frontal brain activity: Linkages to maternal depression and socioeconomic status among adolescents. Biological Psychology, 67, 77–102. Volf, N. V., & Passynkova, N. R. (2002). EEG mapping in seasonal affective disorder. Journal of Affective Disorders, 72, 61–69. Vuga, M., Fox, N. A., Cohn, J. F., George, C. J., Levenstein, R. M., & Kovacs, M. (2006). Long term stability of frontal electroencephalographic asymmetry in adults with a history of depression and controls. International Journal of Psychophysiology, 59, 107–115.
(Received September 14, 2009; Accepted January 14, 2010)
Psychophysiology, 48 (2011), 96–101. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01036.x
Temporal stability of regression-based electrooculographic correction coefficients
TRIEU T. H. PHAM,a RODNEY J. CROFT,b and PETER J. CADUSCHc
a Brain Sciences Institute, Swinburne University of Technology, Melbourne, Australia
b School of Psychology, University of Wollongong, Wollongong, Australia
c Centre for Atom Optics and Ultrafast Spectroscopy, Swinburne University of Technology, Melbourne, Australia
Abstract

A means of accounting for ocular artifact in the electroencephalograph (EEG) is to subtract portions (Bs) of ocular voltage measured by the electrooculograph (EOG) from the EEG. Some such EOG correction methods calculate Bs at one time and use these to correct data recorded at a different time; these require information about the temporal stability of the Bs. This study investigated the stability of Bs over a 2-hr EEG recording session. Participants performed 5 eye movement tasks, each separated by 30 min. Four EOG correction methods were then used to calculate Bs from each of the 5 data sets, resulting in VEOG, HEOG, and REOG (where appropriate) Bs for each method at each of the 5 time points. We did not find evidence that Bs changed over the 2-hr period, nor of any difference in temporal stability between the methods. This study suggests that it is appropriate to employ Bs calculated from calibration trials to correct data recorded within at least a 2-hr time window.

Descriptors: Ocular artifact, EEG/EOG, Correction coefficients, Temporal stability
Regression-based electrooculograph (EOG) correction methods remove ocular artifact from the electroencephalograph (EEG) by subtracting weighted portions (correction coefficients; Bs) of ocular voltage measured by the EOG from the EEG. Other means of accounting for ocular artifact in the EEG include ocular fixation instructions and the rejection of data time locked to eye movements, and data separation procedures such as multiple source eye correction (MSEC), principal component analysis (PCA) and independent component analysis (ICA). While the advantage of regression over fixation/rejection procedures has been demonstrated (Croft & Barry, 2002), the relative advantage of regression/data separation procedures has not been demonstrated (although there are some reports of advantages of each) (Joyce, Gorodnitsky, & Kutas, 2004; Jung et al., 2000; Schlögl et al., 2007; Wallstrom, Kass, Miller, Cohn, & Fox, 2004). While the present study considers issues relevant to both regression and data separation techniques, it does so through the analysis of regression-based approaches only.

Regression-based EOG correction methods differ in numerous ways, one being whether Bs are calculated from the actual data that are to be corrected (some calculate Bs at one time and use these to correct data recorded at a different time, while others calculate Bs from the data to be corrected). It is important to note that the former assumes some level of temporal stability of the Bs. For example, if a regression procedure estimates Bs from preexperiment calibration data and applies these to data collected later, it would require the Bs to be stable across that time period. Previous research has investigated B stability, for example, by comparing two sets of Bs from two different tasks (Verleger, Gasser, & Möcks, 1982), sets of Bs from the same experimental session as well as between sessions (different days) (Gratton, Coles, & Donchin, 1983), and Bs from two different subsets of data (odd or even epochs) (Semlitsch, Anderer, Schuster, & Presslich, 1986). However, these studies did not provide the analyses required to determine whether Bs are stable enough to allow calibration trials to be employed.

This study extended the above research by determining the temporal stability of the Bs across a 2-hr EEG recording session, from four well-defined regression-based EOG correction methods: Verleger et al. (1982; VGM); Gratton et al. (1983; GCD); Semlitsch et al. (1986; SASP); and Croft and Barry (2000a; CB). The study thus determined whether the Bs are stable across a 2-hr time window, and whether this is affected by the particular EOG correction method employed.
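As a rough illustration of the regression idea described in this introduction, the sketch below estimates Bs by ordinary least squares from calibration data and subtracts the B-weighted EOG from data recorded later. It is a generic least-squares example on simulated signals and is not an implementation of the VGM, GCD, SASP, or CB procedures; the function names, the simulated amplitudes, and the values in `true_b` are assumptions made purely for illustration.

```python
import numpy as np


def estimate_bs(eeg, eog):
    """Least-squares Bs, shape (n_eog_channels, n_eeg_channels).

    eeg: (n_samples, n_eeg_channels); eog: (n_samples, n_eog_channels).
    Both are mean-centred first so that DC offsets do not bias the fit.
    """
    eeg_c = eeg - eeg.mean(axis=0)
    eog_c = eog - eog.mean(axis=0)
    bs, *_ = np.linalg.lstsq(eog_c, eeg_c, rcond=None)
    return bs


def correct_eeg(eeg, eog, bs):
    """Subtract the B-weighted EOG from each EEG channel."""
    return eeg - eog @ bs


# Simulated data (all values hypothetical): 2 EOG channels, 3 EEG channels.
rng = np.random.default_rng(0)
n_samples = 5000
true_b = np.array([[0.40, 0.20, 0.10],    # VEOG -> EEG
                   [0.15, 0.00, -0.15]])  # HEOG -> EEG


def simulate(n):
    """Return (contaminated EEG, EOG) for n samples."""
    eog = rng.normal(scale=50.0, size=(n, 2))
    brain = rng.normal(scale=10.0, size=(n, 3))
    return brain + eog @ true_b, eog


eeg_cal, eog_cal = simulate(n_samples)      # calibration recording
eeg_later, eog_later = simulate(n_samples)  # data recorded later

b_hat = estimate_bs(eeg_cal, eog_cal)                  # Bs from calibration
corrected = correct_eeg(eeg_later, eog_later, b_hat)   # applied to later data
```

Applying calibration-derived Bs to later data in this way is only appropriate if the Bs are stable across the interval, which is the assumption the present study tests.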
This research was supported by ARC Discovery Grant DP0559410 and an Australian Postgraduate Award scholarship (to Trieu T. H. Pham). Address correspondence to: Rodney J. Croft, School of Psychology, University of Wollongong, Wollongong 2522, Australia. E-mail: rcroft@uow.edu.au

Methods

Subjects

Twenty-four healthy volunteers (11 male and 13 female) aged 18 to 46 (mean = 24.12) years participated in the study and were paid for their time (prospective participants with pathologic eye disease, such as macular degeneration and choroidal sclerosis, were excluded from the study). Each gave written informed consent and was free to withdraw from the study at any time.

Procedure

This study was approved by the Swinburne University of Technology Human Research Ethics Committee. Upon arrival at the laboratory, participants completed consent forms and had EEG recording apparatus attached. They were seated in an armchair in an electrically shielded, sound attenuated booth, and verbally instructed to keep their head and body reasonably still and to perform a series of tasks. The recording session lasted approximately 120 min.

Tasks

Participants performed 5 eye movement tasks, with each separated by a 30-min cognitive battery that will not be reported here. For each eye movement task, participants were positioned such that their eyes were approximately 60 cm from and at the same height as the center of a 38.5 × 27.5 cm Hyundai ImageQuest L70S computer monitor. The plane of the monitor screen was perpendicular to both the floor and the sagittal plane of the participants, with this monitor used to display stimuli that cued a series of eye movements. The first half of each eye movement task (EM data) involved sequential presentations of 0.6-sec duration, 2.5 × 1.5 cm dark red rectangles (stimulus onset asynchrony; SOA = 0.8 sec). These alternated between the top and bottom (midline of screen’s X-axis) positions (i.e., resulting in 40 ‘down’ and 40 ‘up’ movements), and then between the left and right (midline of screen’s Y-axis) positions (i.e., resulting in 40 ‘right’ and 40 ‘left’ movements). Participants were asked (both verbally and via written instructions on the screen) to keep their head still and to follow the stimuli with their eyes. The second half of each eye movement task (blink data) involved sequential presentations of 40 0.8-sec duration, low-intensity dark red and then high-intensity royal blue rectangles (2.5 ×
1.5 cm), at the centre of the screen (SOA = 1.0 sec). Participants were asked (both verbally and via written instructions on the screen) to blink whenever the rectangle changed from one type to the other. Prior to performing the eye movement tasks, subjects were allowed to practice a few times.

Data Acquisition

EEG data were recorded from a 64-channel, tin-electrode electro-cap using Neuroscan Acquire 4.3 software and SynAmps2 amplifiers, referenced to a point midway between Cz and CPz, and grounded to a point midway between Fz and FPz. EOG data were recorded from tin electrodes above (E1) and below (E3) the left eye, above (E2) and below (E4) the right eye, and from the outer canthi of the left (E5) and right (E6) eyes. Vertical EOG (VEOG) was defined as (E1 + E2 − E3 − E4)/2, horizontal EOG (HEOG) as E5 − E6, and radial EOG (REOG) as (E1 + E2 + E3 + E4)/4. A gain of 2500 was used for each channel with a bandpass of 0 to 200 Hz, system sensitivity was 24 nV/bit, A/D (analog to digital) resolution was 32 bit, impedances were below 5 kΩ at the start of the recording, and data were digitized at 1250 Hz.

Data Analysis

Analyses were performed using Excel, Scan 4.2, and Scan 4.3 Edit software. EEG data were re-referenced to the algebraic mean of M1 and M2, 2nd order Butterworth low-pass filtered at 30 Hz, DC corrected and decimated to 250 Hz in Neuroscan 4.3.
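The three EOG derivations defined under Data Acquisition can be written out directly. The sketch below is a minimal illustration, assuming each electrode signal (E1 to E6) is available as a NumPy array of equal length; the function name and the sample voltages are hypothetical and are not taken from the study.

```python
import numpy as np


def eog_derivations(e1, e2, e3, e4, e5, e6):
    """Return (VEOG, HEOG, REOG) from the six EOG electrode signals.

    E1/E3: above/below the left eye; E2/E4: above/below the right eye;
    E5/E6: outer canthi of the left/right eyes.
    """
    veog = (e1 + e2 - e3 - e4) / 2.0   # vertical: above minus below
    heog = e5 - e6                     # horizontal: left minus right
    reog = (e1 + e2 + e3 + e4) / 4.0   # radial: mean of the four sites
    return veog, heog, reog


# Two made-up samples per electrode (arbitrary units).
e1 = np.array([10.0, 20.0])
e2 = np.array([14.0, 24.0])
e3 = np.array([2.0, 4.0])
e4 = np.array([6.0, 8.0])
e5 = np.array([5.0, 1.0])
e6 = np.array([3.0, 7.0])

veog, heog, reog = eog_derivations(e1, e2, e3, e4, e5, e6)
```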
Each of the ocular artifact methods was then performed on each of the 5 eye movement data sets, producing VEOG, HEOG (where appropriate: SASP does not employ HEOG for correction) and REOG (where appropriate: only CB employs REOG for correction) Bs for each data set and each method. The VGM method employed the Verleger et al. (1982) algorithm, the GCD employed the EMCP2001 version (http://www.gehringlab.org/resources.html) of the Gratton et al. (1983) procedure without modification, the SASP was the unmodified default ocular artifact reduction procedure of Neuroscan 4.3 (Semlitsch et al., 1986), and the CB employed the RAAA algorithm of Croft and Barry (2000a).

Statistical Analysis

Statistical analyses were performed using SPSS version 16. Note that, as different correction procedures employed different EOG derivations for different eye movement types, an omnibus analysis cannot be performed across all methods/EOG derivations/eye movement types. Thus, separate analyses were performed where the following were treated as dependent variables: (1) VEOG Bs for SASP blink data, (2) VEOG Bs for VGM combined EM plus blink data, (3) HEOG Bs for VGM combined EM plus blink data, (4) VEOG Bs for GCD EM data, (5) HEOG Bs for GCD EM data, (6) VEOG Bs for GCD blink data, (7) HEOG Bs for GCD blink data, (8) VEOG Bs for CB EM data, (9) HEOG Bs for CB EM data, and (10) REOG Bs for CB partially corrected blink data. For each of the above dependent variables, repeated measures contrasts were performed to determine whether there were effects of Time (0, 30, 60, 90, 120 min; linear and quadratic), Sagittal Plane (Prefrontal, Frontal, Central, Parietal, Occipital; linear), Lateral Plane (Left, Midline, Right; linear), and Time (linear and quadratic) × Sagittal Plane (linear) and Time (linear and quadratic) × Lateral Plane (linear) interactions.
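The linear and quadratic Time contrasts described above can be illustrated with a simplified sketch. It uses the standard orthogonal polynomial weights for five equally spaced levels and tests each participant-level contrast score against zero with a one-sample t statistic. It covers only the Time factor at a single scalp site, not the Sagittal or Lateral Plane factors or their interactions, and it is not the SPSS procedure used in the study; the function name and the simulated Bs are assumptions for illustration.

```python
import numpy as np

# Orthogonal polynomial contrast weights for 5 equally spaced time points
# (0, 30, 60, 90, 120 min).
LINEAR = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
QUADRATIC = np.array([2.0, -1.0, -2.0, -1.0, 2.0])


def contrast_t(b_by_time, weights):
    """One-sample t statistic for a within-subject contrast.

    b_by_time: (n_participants, 5) array of Bs, one column per time point.
    Returns (t, degrees_of_freedom).
    """
    scores = b_by_time @ weights          # one contrast score per participant
    n = scores.size
    t = scores.mean() / (scores.std(ddof=1) / np.sqrt(n))
    return t, n - 1


# Simulated Bs (hypothetical): 24 participants, no true change over time.
rng = np.random.default_rng(1)
n_participants = 24
stable_b = 0.2 + rng.normal(scale=0.01, size=(n_participants, 5))

t_linear, df = contrast_t(stable_b, LINEAR)
t_quadratic, _ = contrast_t(stable_b, QUADRATIC)
```

The resulting t values would be compared against a t distribution with `df` degrees of freedom; a B that drifted steadily over the session would produce a large linear contrast, whereas stable Bs should not.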
Results

As expected, for all methods VEOG and REOG Bs decreased along the Sagittal Plane as a function of distance from the eyes, and HEOG Bs changed from positive at left to negative at right hemisphere sites. See Table 1 for descriptive and Table 2 for inferential statistics. For each method, no significant effect of Time on the Bs was found: VEOG, HEOG, and REOG Bs were not linearly or quadratically related to Time, and there was no interaction between Time and either the Sagittal or Lateral Plane. See Figure 1 for a graphical representation of these results. To provide an indication of the magnitude of the effect of time as a function of the EOG voltages, Table 1 also displays ‘B Temporal Effect’ (BTE) values, such that BTE values represented ‘the standard deviation of B across time × the average of the standard deviation of the EOG voltages (in µV) of the ERP data points over time, averaged across participants.’

Discussion

The results revealed that VEOG, HEOG, and REOG Bs (for both blink and EM data) did not differ significantly across a 2-hr time window, and that this was not affected by the particular EOG correction method employed. As can be seen in Table 1, as well as there not being a statistically significant effect of time, the
T. T. H. Pham et al.
Table 1. Bs and an Index of EOG Variation Attributable to B Changes Over Time (BTE) for the 4 Regression Methods, for Scalp Regions and Time Points Separately

                                    B at Time (min)
Method  Data            Region      0        30       60       90       120      BTE (0–120)
SASP    Blink VEOG      PF          0.398    0.390    0.397    0.402    0.396    0.336
                        F           0.208    0.204    0.207    0.213    0.208    0.316
                        C           0.132    0.132    0.131    0.136    0.132    0.316
                        P           0.103    0.101    0.097    0.107    0.103    0.306
                        O           0.086    0.084    0.080    0.088    0.084    0.285
VGM     EM/Blink VEOG   PF          0.367    0.346    0.367    0.360    0.362    0.187
                        F           0.175    0.163    0.179    0.178    0.177    0.141
                        C           0.093    0.085    0.098    0.095    0.094    0.123
                        P           0.062    0.053    0.065    0.065    0.064    0.108
                        O           0.041    0.033    0.043    0.043    0.043    0.100
VGM     EM/Blink HEOG   L           0.036    0.029    0.033    0.034    0.034    0.031
                        M          −0.007   −0.012   −0.008   −0.007   −0.006    0.030
                        R          −0.052   −0.057   −0.052   −0.050   −0.049    0.029
GCD     EM VEOG         PF          0.427    0.419    0.423    0.421    0.414    0.123
                        F           0.254    0.244    0.250    0.248    0.246    0.123
                        C           0.176    0.167    0.175    0.171    0.170    0.116
                        P           0.149    0.139    0.143    0.142    0.141    0.112
                        O           0.125    0.116    0.120    0.120    0.119    0.113
GCD     EM HEOG         L           0.046    0.051    0.040    0.041    0.045    0.059
                        M           0.001    0.007   −0.003    0.001    0.003    0.057
                        R          −0.044   −0.031   −0.047   −0.042   −0.040    0.075
GCD     Blink VEOG      PF          0.392    0.365    0.396    0.389    0.393    0.380
                        F           0.203    0.182    0.208    0.206    0.207    0.338
                        C           0.127    0.108    0.132    0.129    0.129    0.318
                        P           0.106    0.085    0.109    0.106    0.108    0.304
                        O           0.087    0.068    0.089    0.086    0.089    0.292
GCD     Blink HEOG      L           0.087    0.178    0.104    0.111    0.105    0.243
                        M           0.015    0.126    0.040    0.062    0.047    0.243
                        R          −0.036    0.101   −0.001    0.030   −0.011    0.287
CB      EM VEOG         PF          0.472    0.474    0.465    0.460    0.460    0.111
                        F           0.340    0.336    0.324    0.323    0.328    0.135
                        C           0.279    0.274    0.260    0.257    0.262    0.166
                        P           0.247    0.238    0.223    0.225    0.230    0.159
                        O           0.215    0.210    0.197    0.199    0.202    0.147
CB      EM HEOG         L           0.048    0.043    0.041    0.042    0.042    0.063
                        M           0.007    0.003    0.001    0.003    0.003    0.061
                        R          −0.038   −0.041   −0.042   −0.041   −0.040    0.059
CB      Blink REOG      PF         −0.596   −0.609   −0.644   −0.709   −0.518    1.162
                        F          −1.047   −0.996   −1.027   −1.127   −0.970    1.234
                        C          −1.150   −1.176   −1.112   −1.204   −1.094    1.237
                        P          −1.038   −1.075   −1.033   −1.188   −1.094    1.323
                        O          −0.894   −0.970   −0.919   −1.091   −0.93     1.242
Note: ‘PF,’ ‘F,’ ‘C,’ ‘P,’ ‘O,’ ‘L,’ ‘M,’ and ‘R’ represent ‘prefrontal,’ ‘frontal,’ ‘central,’ ‘parietal,’ ‘occipital,’ ‘left,’ ‘middle,’ and ‘right,’ respectively; BTE represents the effect of temporal variation in B on the corresponding EOG channel (i.e., ‘the standard deviation of B across time’ × ‘the standard deviation of the EOG voltages (in µV) of the ERP data points over time,’ averaged across participants). SASP, Semlitsch et al. (1986); VGM, Verleger et al. (1982); GCD, Gratton et al. (1983); CB, Croft and Barry (2000a).
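The BTE index defined in the table note can be sketched as follows. The definition leaves some details open, so this is one possible reading rather than the authors' computation: it assumes the product is formed per participant and then averaged, and it uses the sample standard deviation. The function names and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def participant_bte(b_by_time, eog_sd_by_time):
    """SD of B across the time points, times the mean EOG SD (in µV).

    b_by_time: one B per recording time (e.g., 0, 30, 60, 90, 120 min).
    eog_sd_by_time: SD of the EOG voltages at each of those times.
    """
    return stdev(b_by_time) * mean(eog_sd_by_time)


def group_bte(participants):
    """Average the per-participant products (an assumed order of operations)."""
    return mean(participant_bte(b, sd) for b, sd in participants)


# Hypothetical data for two participants.
example = [
    ([0.40, 0.42, 0.38, 0.40, 0.40], [20.0, 20.0, 20.0, 20.0, 20.0]),
    ([0.30, 0.30, 0.30, 0.30, 0.30], [25.0, 25.0, 25.0, 25.0, 25.0]),
]
bte_value = group_bte(example)
```

Because the index multiplies a coefficient spread by a voltage spread, it is expressed in µV and can be read as the change in corrected EEG that drift in B alone would produce.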
magnitude of effect was very small. For instance, as is represented by BTE, the variation in EEG voltages that would occur due to changes in the correction coefficients over time varied
from <0.03 µV for HEOG correction using the VGM method to 0.38 µV for VEOG correction using the GCD method. There were some significant differences among the Bs derived from
Temporal stability of EOG correction coefficients
Table 2. Statistical Results for Both EM and Blink VEOG, HEOG, REOG (where appropriate) Bs, for the 4 Regression-Based EOG Correction Methods

Method  Data            Statistic   Sag (L)    Lat (L)    Time (L)   Time (Q)   Time×Sag (L×L)   Time×Lat (L×L)
SASP    Blink VEOG      F(1,23)     725.51     6.24       0.29       0.06       0.97             0.89
                        p           <0.001*    0.020*     0.598      0.804      0.335            0.356
VGM     EM/Blink VEOG   F(1,23)     853.66     20.75      1.31       0.29       0.30             1.49
                        p           <0.001*    <0.001*    0.264      0.597      0.588            0.234
VGM     EM/Blink HEOG   F(1,23)     0.62       647.34     0.59       1.69       2.15             2.88
                        p           0.440      <0.001*    0.451      0.207      0.156            0.103
GCD     EM VEOG         F(1,23)     831.02     1.39       0.76       0.20       1.48             0.28
                        p           <0.001*    0.250      0.394      0.660      0.236            0.602
GCD     EM HEOG         F(1,23)     3.58       380.14     0.47       0.24       3.00             3.96
                        p           0.071      <0.001*    0.499      0.627      0.097            0.059
GCD     Blink VEOG      F(1,23)     662.62     1.75       1.32       0.56       0.33             2.12
                        p           <0.001*    0.199      0.262      0.464      0.571            0.159
GCD     Blink HEOG      F(1,23)     0.21       35.12      0.02       1.76       1.69             0.03
                        p           0.652      <0.001*    0.903      0.198      0.207            0.869
CB      EM VEOG         F(1,23)     944.60     0.05       2.52       0.23       0.02             1.07
                        p           <0.001*    0.818      0.126      0.143      0.896            0.312
CB      EM HEOG         F(1,23)     2.94       632.18     0.78       1.54       0.58             0.93
                        p           0.100      <0.001*    0.386      0.227      0.454            0.345
CB      Blink REOG      F(1,23)     14.43      0.53       <0.01      0.24       0.41             2.27
                        p           0.001*     0.474      0.955      0.626      0.531            0.146

Note: ‘L’ = ‘linear,’ ‘Q’ = ‘quadratic.’ Significant results are marked with an asterisk (*). SASP, Semlitsch et al. (1986); VGM, Verleger et al. (1982); GCD, Gratton et al. (1983); CB, Croft and Barry (2000a).
different EOG correction methods, but these were not related to their temporal stability (VEOG Bs for SASP and VGM were significantly but marginally smaller on the right than left, but this was not the case for GCD and CB).
The results thus suggest that it is appropriate to employ Bs calculated from calibration trials to correct data recorded within at least a 2-hr time window, and corresponding to this there will be no difference in estimating Bs from the data to be corrected
[Figure 1 plot area. Rows: SASP Blink Bs; VGM EM/Blink Bs; GCD EM Bs; GCD Blink Bs; CB EM Bs; CB Blink Bs. Columns: VEOG, HEOG, REOG. X-axis: Time (0, 30, 60, 90, 120). Y-axis: B. Series: Prefrontal, Occipital, Left, Right.]
Figure 1. Temporal stability (means and standard deviations) of the Bs across a 2-hr time window for SASP VEOG blink data, VGM VEOG combined EM plus blink data, VGM HEOG combined EM plus blink data, GCD VEOG EM data, GCD HEOG EM data, GCD VEOG blink data, GCD HEOG blink data, CB VEOG EM data, CB HEOG EM data, and CB REOG blink data are shown on separate rows. Note that columns represent VEOG, HEOG, and REOG Bs, and representative sagittal (Prefrontal and Occipital) and lateral (Left and Right) scalp regions are shown separately. Other sagittal (Frontal, Central, Parietal) and lateral (Middle) scalp regions displayed similar behaviors.
compared with that of a pre-experiment calibration trial. As can be seen in Figure 1, there will still be some random variation of Bs over time, but this would become less of an effect (especially for HEOG and REOG data) when there are good ocular signal-to-noise ratios (SNRs) in the data from which the Bs are derived (as argued by Verleger et al., 1982, and Croft & Barry, 2000b). For example, the present data set employed the minimum number of epochs recommended for adequate SNR in the B estimation process (Croft & Barry, 2000b) in order to allow more time points to be considered, but this could be increased to provide more reliable Bs. This may be particularly important for methods such as GCD, where the HEOG channel Bs for blinks were derived from only a small number of data points and little HEOG power, with the result being a tendency to greater variability over time (albeit far from a significant one). It should be noted that ‘stability’ does not imply ‘validity’; methods that have stable Bs may perform consistently well or consistently poorly. For example, a data set that consisted of only noise (with no ocular artifact) might consistently produce an erroneous B = 1 (with a poor EOG correction algorithm), and this would provide a very poor correction of the data. Consequently, the present results cannot comment on the validity of any of the EOG correction methods employed here; they can only comment on the consistency of their Bs. For a discussion of the validity of the methods employed here, see Croft, Chandler, Barry, Cooper, and Clarke (2005). In conclusion, the results revealed that there were no significant effects of Time on the Bs within at least a 2-hr time window, which suggests that it is just as appropriate to employ Bs calculated from calibration trials to correct data recorded within this time period as it would be to calculate them from the data to be corrected. However, given that estimation of Bs from a separate calibration trial allows for greater ocular signal-to-noise ratios, which in turn can improve B estimation, there may be advantages to employing calibration trials in EOG correction methods.
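The point that more data yield more reliable Bs can be illustrated with a small simulation. This is a generic single-channel least-squares sketch, not any of the four correction algorithms compared here; the true B, the noise levels, the sample sizes, and all function names are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, stdev


def estimate_b(eog, eeg):
    """Least-squares slope of EEG on EOG (one EOG channel, one EEG site)."""
    m_eog, m_eeg = mean(eog), mean(eeg)
    cov = sum((x - m_eog) * (y - m_eeg) for x, y in zip(eog, eeg))
    var = sum((x - m_eog) ** 2 for x in eog)
    return cov / var


def simulate_b(n_points, rng, true_b=0.2, eog_sd=50.0, brain_sd=10.0):
    """Simulate EEG = brain activity + true_b * EOG, then re-estimate B."""
    eog = [rng.gauss(0.0, eog_sd) for _ in range(n_points)]
    eeg = [rng.gauss(0.0, brain_sd) + true_b * x for x in eog]
    return estimate_b(eog, eeg)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
few = [simulate_b(25, rng) for _ in range(200)]    # few data points per estimate
many = [simulate_b(400, rng) for _ in range(200)]  # many data points per estimate
print("SD of B with 25 points: ", stdev(few))
print("SD of B with 400 points:", stdev(many))
```

Under these assumptions the spread of the estimated B shrinks as the number of data points grows, while its average stays close to the simulated value. The sketch says nothing about validity: a stable estimate can still be a wrong one.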
REFERENCES

Croft, R. J., & Barry, R. J. (2000a). EOG correction of blinks with saccade coefficients: A test and revision of the aligned-artifact average solution. Clinical Neurophysiology, 3, 444–455.
Croft, R. J., & Barry, R. J. (2000b). EOG correction: Comparing different calibration methods, and determining the number of epochs required in a calibration average. Clinical Neurophysiology, 111, 440–443.
Croft, R. J., & Barry, R. J. (2002). Issues relating to the subtraction phase in EOG artifact correction of the EEG. International Journal of Psychophysiology, 44, 187–195.
Croft, R. J., Chandler, J. S., Barry, R. J., Cooper, N. R., & Clarke, A. R. (2005). EOG correction: A comparison of four methods. Psychophysiology, 42, 16–24.
Gratton, G., Coles, M. G. H., & Donchin, E. (1983). A new method for the off-line removal of ocular artifact. Electroencephalography and Clinical Neurophysiology, 55, 468–484.
Joyce, C. A., Gorodnitsky, I. F., & Kutas, M. (2004). Automatic removal of eye movement and blink artifacts from EEG data using blind component separation. Psychophysiology, 41, 313–25.
Jung, T. P., Makeig, S., Humphries, C., Lee, T.-W., McKeown, M. J., Iragui, V., & Sejnowski, T. J. (2000). Removing electroencephalographic artifacts by blind source separation. Psychophysiology, 37, 163–178.
Semlitsch, H. V., Anderer, P., Schuster, P., & Presslich, O. (1986). A solution for reliable and valid reduction of ocular artifacts, applied to the P300. Psychophysiology, 23, 695–703.
Schlögl, A., Keinrath, C., Zimmermann, D., Scherer, R., Leeb, R., & Pfurtscheller, G. (2007). A fully automated correction method of EOG artifacts in EEG recordings. Clinical Neurophysiology, 118, 98–104.
Verleger, R., Gasser, T., & Möcks, J. (1982). Correction of EOG artifacts in event-related potentials of the EEG: Aspects of reliability and validity. Psychophysiology, 19, 472–480.
Wallstrom, G. L., Kass, R. E., Miller, A., Cohn, J. F., & Fox, N. A. (2004). Automatic correction of ocular artifacts in the EEG: A comparison of regression-based and component-based methods. International Journal of Psychophysiology, 53, 105–119.
(Received June 29, 2009; Accepted January 6, 2010)
Psychophysiology, 48 (2011), 102–111. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01040.x
Impaired semantic processing during task-set switching: Evidence from the N400 in rapid serial visual presentation
FRANÇOIS VACHON and PIERRE JOLICŒUR
Centre de recherche en Neuropsychologie et Cognition, Département de psychologie, Université de Montréal, Montréal, Québec, Canada
Abstract

The cognitive system is able to reconfigure mental resources flexibly to adapt to a new task. While task-set switching is known to be detrimental to behavioral performance, less is known about the precise loci of these effects on stimulus processing. We measured event-related potentials to explore the neural consequences of task-set switching on semantic processing. We examined the context-sensitive N400 component evoked by the second of two target words embedded in a rapid serial visual presentation under conditions that involved either a task-set switch or no switching. Whereas the N400 was unaffected by the lag separating the targets in the absence of switching, it was delayed and attenuated in the switch condition when the targets were adjacent in the sequence. These findings indicate that task-set reconfiguration temporarily prevents semantic activation and provide evidence for the nonautomaticity of semantic processing of words.

Descriptors: Cognition, Language/speech, Normal volunteers, EEG/ERP
This work was made possible by research grants from the Natural Sciences and Engineering Research Council of Canada (NSERC), by infrastructure support from the Canada Fund for Innovation and the Fonds pour la Recherche en Santé du Québec, by the Canada Research Chairs program awarded to Pierre Jolicœur, and by a postdoctoral fellowship from NSERC awarded to François Vachon. We thank Christine Lefebvre, Jennifer Thibault, Elena Kulagina, Olivier Charron, and Jessica Pineault for technical assistance. Address correspondence to: François Vachon, who is now at École de psychologie, Université Laval, Québec, QC, G1V 0A6, Canada. E-mail: [email protected]

The ability to switch flexibly between different tasks, a hallmark of cognitive control, is crucial for adapting to many everyday life situations. However, this flexibility in cognitive processes has a cost: People are typically slower and less accurate immediately following a task switch (see Monsell, 2003). This switch cost is assumed to reflect, at least in part, the need for establishing a new task set (or attentional set), that is, a specific configuration of the cognitive system, to perform the new task adequately (e.g., Rogers & Monsell, 1995; Sakai, 2008; Schneider & Logan, 2007). The effects of task-set switching on performance are well documented, but much less is known about the detailed consequences for stimulus processing. We addressed this issue using electrophysiological measures that can sometimes provide more direct indices of stimulus processing than behavioral measures alone.

Although most neurophysiological research on task-set switching has focused on the neural correlates of the control processes responsible for adapting to the change from one type of mental activity to another (e.g., Kieffaber & Hetrick, 2005; Rushworth, Passingham, & Nobre, 2002; Travers & West, 2008), a thorough understanding of the switch cost at the neural level has been hampered by a relative dearth of studies. Given the importance of reading for most individuals, the present study examined the neural fate of words presented during task-set reconfiguration. Although task-set switching is mainly mediated by executive control processes that are independent from basic task processes (e.g., Rubinstein, Meyer, & Evans, 2001), recent behavioral data suggest that the reconfiguration process can compromise semantic processing. Vachon, Tremblay, and Jones (2007) found reduced semantic priming for words presented during the reconfiguration process. Within the context of a bottleneck approach to dual-task costs, the authors interpreted these findings as reflecting the postponement of semantic processing by task-set reconfiguration (cf. Oriet & Jolicœur, 2003). However, their data do not allow determining whether reconfiguration actually prevents meaning extraction of briefly presented words only for a short period of time or forever (see Enns, Visser, Kawahara, & Di Lollo, 2001). In the present study, we combined the event-related potentials (ERP) technique to provide more direct continuous indicators of word processing in the context of the attentional blink (AB) paradigm in order to uncover the neural consequences of task-set switching on word processing.

One of the most studied electrophysiological measures of word processing is the N400 component. This negative-going brain potential occurs at centroparietal electrode sites around 400 ms after the onset of a word that fails to match a previously
established semantic context (e.g., Bentin, McCarthy, & Wood, 1985; Kutas & Hillyard, 1980; Rugg, 1985; see also Friederici, 2004, and Kutas, 1997, for reviews). For example, the word DOG would elicit a larger N400 component if preceded by the context word FURNITURE than if preceded by the context word ANIMAL. Although the N400 response has been shown to vary systematically with the processing of semantic information, the exact functional significance of the N400 is still a matter of debate. Some researchers have proposed that the N400 reflects the process of semantic integration of the target word into its preceding context (e.g., Brown & Hagoort, 1993; Halgren, 1990; Holcomb, 1993; Osterhout & Holcomb, 1992). According to this “integration” view, the N400 is larger for words incongruent with the semantic context because integration is more difficult than in congruent contexts. Alternatively, it has been postulated that the N400 indexes facilitated access to some lexical representations from long-term memory (e.g., Federmeier, 2007; Kutas & Federmeier, 2000; Lau, Phillips, & Poeppel, 2008). The “lexical” view explains the difference in the N400 amplitude between congruent and incongruent words by proposing that congruent and, thus, predictable words are easier to access from long-term memory than incongruent words. As we employed the N400 response as a tool to index semantic processing, the exact nature of the mechanisms underlying the elicitation of the N400 is not of primary relevance in the present study. What matters here is that the presence of an N400 indicates that the word has been analyzed to the point of meaning extraction. A delayed N400 would then reflect some postponement in the semantic processing of target words.

The AB paradigm provides an ideal tool to assess the impact of task-set switching on the N400.
In this dual-task paradigm, two target stimuli (T1 and T2) are embedded in a rapid serial visual presentation (RSVP), and attending to T1 produces an AB, that is, a transient impairment in T2 report accuracy (Raymond, Shapiro, & Arnell, 1992). The second-target deficit typically shows a U-shaped relationship with the lag separating the two targets, with no impairment when T2 immediately follows T1 in the RSVP (i.e., at Lag 1), an important loss of accuracy when T2 is presented at Lag 2 or 3, and a gradual recovery as the lag increases. The AB is usually said to arise from the attentional demands of processing T1, which momentarily prevent central resources from being applied to T2 (see, e.g., Dux & Marois, 2009, for a review). Critical to the phenomenon is the interference provided by the distractor stimuli, especially those that immediately follow each target, referred to as masks (see Enns et al., 2001; Vachon & Tremblay, 2008). Notably, the removal of all post-T2 distractors usually provokes the abolition of the main behavioral manifestation of the AB, namely, the loss of accuracy for the report of T2 (e.g., Giesbrecht & Di Lollo, 1998; Jolicœur, 1999; Vogel & Luck, 2002; Vachon & Tremblay, 2008). Despite the absence of second-target deficit when T2 is not masked, ERP studies have shown that one of the main consequences of the AB, namely, a delay in the processing of T2, occurs nonetheless, as demonstrated by a delay of the P3 ERP to an unmasked T2 (Ptito, Arnell, Jolicœur, & MacLeod, 2008; Vogel & Luck, 2002). Following these ERP studies, we employed RSVPs in which T2 was always the last item (i.e., T2 was not masked) in order to increase the sensitivity of our paradigm to potential latency effects. One advantage of using the AB paradigm is that word meaning can be accessed during the AB (e.g., Maki, Frigen, & Paulson, 1997; Martens, Wolters, & van Raamsdonk, 2002; Shapiro,
Driver, Ward, & Sorensen, 1997; Vachon et al., 2007). Despite the deficit in conscious report, there is evidence that the N400 evoked by the T2 word is unaltered by the AB (Luck, Vogel, & Shapiro, 1996; Rolke, Heil, Streb, & Hennighausen, 2001; Vogel, Luck, & Shapiro, 1998), indicating that T2 is processed to the semantic level even if it could not be reported. The typical AB is obtained when both tasks share the same task requirements, but participants can also be asked to perform a different task for the two targets, inducing a task-set switch. Yet another advantage of the paradigm is that dual-task costs ensuing from task-set switching can be distinguished at the behavioral level from those attributable to the AB because the consequences of switching are usually circumscribed to the case when T1 and T2 are adjacent in the RSVP (i.e., at Lag 1; e.g., Juola, Botella, & Palacios, 2004; Kawahara, Zuvic, Enns, & Di Lollo, 2003; Potter, Chun, Banks, & Muckenhoupt, 1998; Vachon et al., 2007; see also Visser, Bischof, & Di Lollo, 1999, for a review). For instance, the presence of task-set switching can lead to a second-target deficit at short lags even when T2 is not masked (e.g., Dell’Acqua, Pascali, Jolicœur, & Sessa, 2003; Kawahara, Di Lollo, & Enns, 2001; Kawahara et al., 2003). Although there is ample evidence of dissociation between task-set switching and the AB at the behavioral level, such a distinction is yet to be demonstrated at the neural level.

In the present study, we tested whether semantic processing is constrained by task-set reconfiguration by examining the N400 evoked by a T2 word embedded in RSVPs with and without task-set switching. Following Vogel and colleagues’ (1998) approach, a context word was presented before each RSVP, and participants had to indicate whether the T2 word was related to the context or not.
To isolate the T2-elicited N400 from the ERP responses elicited by the other RSVP stimuli, we subtracted the brain activity elicited in T2-related trials from that elicited in T2-unrelated trials. In the no-switch condition, the same semantic judgment had to be performed for both T1 and T2 words. In the switch condition, a task-set switch was induced by asking participants to compare digits surrounding the T1 word and then perform the semantic judgment on the T2 word. We examined the impact of such switching on the amplitude and the latency of the N400, with a special interest in results at Lag 1. According to Vachon and colleagues (2007; see also Oriet & Jolicœur, 2003), the substantial reduction in the semantic priming observed for words occurring during task-set reconfiguration (i.e., at Lag 1) is a consequence of two factors: The formation of a perceptual representation for T2 is temporarily suspended by the reconfiguration process, and the subsequent presentation of a distractor hinders the establishment of an adequately encoded representation of T2 that then prevents semantic activation of T2. We predicted that in the absence of T2 masking, semantic activation would be only slightly affected, even for stimulus presentations outlasted by the reconfiguration process, because the T2 perceptual representation would eventually be formed, once the reconfiguration was completed, without being corrupted by a subsequent stimulus. This prediction has not been tested, however. Moreover, the findings of Vachon and colleagues do not preclude the possibility that task-set reconfiguration permanently prevents semantic processing rather than merely postponing it so that, even without backward masking, no adequate representation could be formed for very brief stimuli disappearing before the completion of the reconfiguration process (e.g., Enns et al., 2001). The use of RSVPs in which T2 constitutes the last stimulus will allow disentangling these two alternatives. So, if
task-set reconfiguration momentarily suspends semantic processing, as suggested by Vachon and colleagues (2007), the N400 evoked by T2 should be delayed at Lag 1 in the switch condition. If, however, semantic processing is permanently prevented by the reconfiguration process (e.g., Enns et al., 2001), T2 is expected to elicit no N400 at Lag 1 in the switch condition.

The present electrophysiological data may also speak to the issue of the automaticity of visual word processing. For instance, any suppression or delay of the N400 for a T2 word appearing during task-set reconfiguration would suggest that word processing is not fully automatic (see, e.g., Lien, Ruthruff, Cornett, Goodin, & Allen, 2008). Moreover, an examination of the N400 elicited by T1 may contribute to the debate regarding whether low-level processing is sufficient for semantic activation or not, as participants had to perform a semantic judgment on T1 in the no-switch condition whereas they judged the similarity of digits surrounding the T1 word in the switch condition. Previous research on this issue has yielded rather mixed results: Whereas some studies reported no N400 when using nonreading tasks such as the letter-search task (e.g., Marí-Beffa, Valdés, Cullen, Catena, & Houghton, 2005) and a case-discrimination task (e.g., Chwilla, Brown, & Hagoort, 1995), others found a reliable N400 under similar experimental contexts (e.g., Heil, Rolke, & Pecchinenda, 2004; Küper & Heil, 2009). Hence, a second objective of the present study was to examine, through the analysis of the N400 elicited by T1 and T2, to what extent the processing of visually presented words is automatic.
Method

Participants

Twenty neurologically normal university students (12 women; mean age = 21.5 years), who reported normal or corrected-to-normal vision, took part in the experiment for financial compensation. All were native French speakers.
Materials

Each RSVP stimulus was composed of a seven-character central string surrounded by two bigger characters located at each extremity of the string (see Figure 1). The height of string characters was 0.3° and the height of surrounding characters was 0.6°. The whole stimulus subtended a visual angle of approximately 3.2° in width. All stimuli were displayed in black on a light gray background except for T1, which appeared in green. Distractors consisted of a string of seven different digits randomly selected from the set 1–9, surrounded by either two uppercase Os or Xs. The central string of T1 and T2 was a word of three to seven uppercase letters. Words less than seven letters long were flanked by Xs to create a seven-character string. The characters surrounding T1 were always two digits, either identical or different, randomly selected from the digits 2, 3, 4, and 6. The characters surrounding T2 were either Os or Xs. As shown in Figure 1, each trial began with the presentation of a context word, displayed in light blue for 1500 ms, followed by a 500-ms black central fixation cross. A stream of 15 stimuli, including T1 and T2, was then presented at the center of the screen. Each RSVP stimulus was shown for 117 ms with no interstimulus interval. Within the stream, the surrounding characters alternated between Os and Xs. T2 was always the last item
F. Vachon & P. Jolicœur
Figure 1. Schematic representation of the trial sequences used in the present experiment (example at Lag 3). All stimuli appeared in black on a light gray background except the context word, which was presented in light blue, and the first target (T1), which was displayed in green. The second target (T2) was always the last item presented.
presented. T1 could either occur immediately before T2 (Lag 1) or precede T2 by two (Lag 3) or six distractors (Lag 7). The target word list was composed of 360 French words equally distributed among 30 categories inspired from the Van Overschelde, Rawson, and Dunlosky (2004) category norms. Great care was taken to ensure that each word constituted a good category exemplar and was common in French (e.g., apple in the category FRUIT).

Design and Procedure

For each of the two switching conditions, participants performed six blocks of 60 experimental trials, preceded by 15 practice trials. Half of the participants started with the no-switch condition, in which they had to indicate for both T1 and T2 whether the target word was related or not to the context word. In the switch condition, the task was to indicate whether the digits surrounding T1 were identical or different and then indicate whether the T2 word was related or not to the context word. Hence, the two switching conditions only differed on T1 task instructions. Within each block, the semantic relation between the context word (a category label) and the target words (category exemplars) was manipulated so that T1 and T2 could be independently related (i.e., part of the same category) or unrelated (i.e., part of a different category) to the context word. This manipulation yielded four equiprobable conditions that formed a 2 × 2 design with context-T2 relation (related vs. unrelated) crossed with context-T1 relation (related vs. unrelated). Word triplets (i.e., context word, T1 word, T2 word) were selected pseudorandomly so that each exemplar of the word list served equally often as T1 and T2 and occurred in every semantic relation trial type. When both targets were related to the context word, T1 and T2 were always related to each other. Within the 25% of trials in which both targets were not related to the context word, T1 and T2 were related to each other in half of them and unrelated in the other
half. Semantic relation type and lag varied randomly from trial to trial in each block. Given that the semantic relation between the targets and the context word served as the basis for the behavioral response, the T1-context and T2-context factors cannot be included in the analysis of the behavioral data. Consequently, behavioral performance (i.e., percent target error) was examined according to a 2 × 3 within-subject design with switching condition (switch vs. no-switch) and lag (1, 3, and 7) as repeated-measures factors. Instructions were presented on the screen at the beginning of each session. Participants initiated a trial by pressing the spacebar of the keyboard. Following the RSVP, participants were asked to respond to T1 first and then respond to T2, without time pressure. For T1, participants used their left index finger to respond, pressing the z key of the keyboard if T1 was related to the context word (in the switch condition, if the surrounding digits were identical) and the x key if the T1 word and the context word were unrelated (in the switch condition, if the surrounding digits were different). Participants then had to use their right index finger to indicate whether T2 was related to the context word by pressing the n key or, if not, by pressing the m key. No accuracy feedback was provided.

Electrophysiological Recordings and Analyses

Electroencephalographic (EEG) activity was recorded at 256 Hz from 64 Ag/AgCl electrodes mounted on an elastic cap and placed according to the International 10/10 system and from 5 additional electrodes (two mastoids, external to left and right outer canthi, and below the left eye). EEG signals were re-referenced off-line to the average of the left and right mastoids and then bandpass filtered (0.01–40 Hz).
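The stream timing described under Materials (15 items of 117 ms each, no interstimulus interval, T2 always last, T1 at Lag 1, 3, or 7) reduces to simple arithmetic, sketched below. The function and constant names are hypothetical, and onsets are expressed relative to the onset of the first RSVP item, which is an assumption of this sketch rather than something the text specifies.

```python
STREAM_LENGTH = 15       # items per RSVP stream, T1 and T2 included
ITEM_DURATION_MS = 117   # each item shown for 117 ms, no interstimulus interval
LAGS = (1, 3, 7)         # T1-T2 separations used in the experiment


def target_timing(lag):
    """Positions (1-based) and onsets (ms from stream onset) of T1 and T2."""
    if lag not in LAGS:
        raise ValueError(f"lag must be one of {LAGS}")
    t2_position = STREAM_LENGTH       # T2 is always the last item
    t1_position = t2_position - lag   # leaves lag - 1 distractors in between
    t1_onset = (t1_position - 1) * ITEM_DURATION_MS
    t2_onset = (t2_position - 1) * ITEM_DURATION_MS
    return {
        "t1_position": t1_position,
        "t2_position": t2_position,
        "t1_onset_ms": t1_onset,
        "t2_onset_ms": t2_onset,
        "soa_ms": t2_onset - t1_onset,  # T1-to-T2 stimulus onset asynchrony
    }


timing = {lag: target_timing(lag) for lag in LAGS}
```

At Lag 1 the two targets are separated by a single item duration, which is the condition in which T2 arrives while any reconfiguration triggered by T1 would still be under way.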
The horizontal electrooculogram (EOG), recorded as the voltage difference between electrodes placed lateral to the external canthi, was used to measure horizontal eye movements. The vertical EOG, recorded as the voltage difference between two electrodes placed above (Fp1) and below the left eye, was used to detect eyeblinks. EOG signals were low-pass filtered at 5 Hz to facilitate the artifact rejection process. Trials containing ocular artifacts (blinks and eye movements) were rejected. The average percentage of trials per participant that were rejected was 7.1% (range: 0.5%–30.1%). EEG signals were analyzed separately for both T1 and T2. Measurements were obtained at central (C3, Cz, and C4) and parietal electrodes (P3, Pz, and P4). The average ERP waveforms in all conditions were computed time-locked to the onset of each target and included a 200-ms prestimulus baseline. The N400 elicited by T2 was isolated by subtracting the ERP waveforms on T2-related trials from the ERP waveforms on T2-unrelated trials (Vogel et al., 1998), whereas the N400 evoked by T1 was extracted by subtracting the ERP waveforms on T1-related trials from the ERP waveforms on T1-unrelated trials. It is noteworthy that any potential effects on the N400 elicited by one target (e.g., T2) of the relation between the two targets and between the other target (e.g., T1) and the context word were controlled by equating the number of trials in which these stimuli were related and unrelated across the relevant target-relation conditions. The magnitude of the N400 was quantified as the mean amplitude of the difference waves over the 350–600-ms posttarget time window. For both targets, the N400 amplitude was analyzed according to the same 2 × 3 within-subject design as for behavioral performance (that is, the Switching Condition × Lag design), to which two within-subject factors were added in order to take into account the six electrode sites: anterior-posterior electrode position (central vs.
parietal) and left-right electrode position (left, midline, and right). In the event that the N400 in the switch condition were to appear delayed compared to the corresponding N400 in the no-switch condition, the onset latency of these N400s would be calculated using a jackknife-based method (see Kiesel, Miller, Jolicœur, & Brisson, 2008; Miller, Ulrich, & Schwarz, 2009; Ulrich & Miller, 2001). With the jackknife approach, n grand average waveforms are computed with n–1 participants (i.e., a different participant is removed for each waveform), and latency onset estimates are obtained for each of these n grand average waveforms. The estimates obtained at a given lag in the switch and no-switch conditions would then be contrasted using a conventional paired-sample t test, but for which the t value must be adjusted according to $t_{\text{adjusted}} = t/(n - 1)$ (a general proof of this adjustment was provided by Ulrich & Miller, 2001).

Results

For every repeated-measures analysis of variance (ANOVA) performed in the present study, the Greenhouse–Geisser procedure was applied on every within-subject effect for which the sphericity assumption was violated.

T2 Results

All behavioral and ERP analyses for T2 were limited to trials on which a correct response was made for T1.

Behavioral data. The mean percentage of conditionalized T2 errors is plotted in Figure 2A as a function of switching condition and lag. The repeated-measures ANOVA performed on these data revealed that T2 performance was lower in the switch than in the no-switch condition, F(1,19) = 61.8, p < .001, d = 3.61. Errors decreased as lag increased, F(2,38) = 31.1, p < .001, d = 2.56. The Switching Condition × 
Lag interaction was significant, F(2,38) = 16.5, p < .001, d = 1.86, because T2 performance did not vary with lag in the absence of switching, a common result when T2 is not masked (e.g., Giesbrecht & Di Lollo, 1998; Jolicœur, 1999; Ptito et al., 2008; Vachon & Tremblay, 2008; Vogel & Luck, 2002), whereas T2 errors decreased as lag increased in the presence of switching. This latter result is consistent with previous studies that showed reliable deficits in reporting an unmasked T2 at short lags in the presence of task-set switching (e.g., Dell’Acqua et al., 2003; Kawahara et al., 2001, 2003).

ERP data. The ERP results for T2 are summarized in Figures 2B and 3. Figure 2B presents the mean T2-elicited N400 amplitude plotted as a function of switching condition and lag. The T2-elicited difference waveforms for midline central and parietal electrodes for both the switch and no-switch conditions are plotted as a function of lag in Figure 3A. These figures show that the task-set switch delayed and attenuated the N400 at Lag 1. The mean amplitude of the N400 elicited by T2 was submitted to a four-way repeated-measures ANOVA with switching condition, anterior–posterior electrode position, left–right electrode position, and lag as the factors. Overall, the N400 triggered by T2 was larger over midline and left hemisphere electrode positions, F(2,38) = 16.2, p < .001, d = 1.85, as can be seen in the scalp
Figure 2. Second target (T2) behavioral and electrophysiological results. A: Mean percentage of T2 errors, given correct report of the first target (T1), as a function of switching condition (switch vs. no switch) and lag (1, 3, and 7). Note that error rates were plotted instead of typical accuracy rates to highlight the relationship between the patterns of behavioral and electrophysiological results. B: Mean amplitude (in microvolts) of the N400 elicited by T2 as a function of switching condition and lag. Mean amplitude measurements were based on the mean voltage of the unrelated–related difference wave over the 350–600-ms post-T2 time window averaged across all relevant electrode sites (i.e., C3, Cz, C4, P3, Pz, and P4). In both panels, error bars represent 95% within-subject confidence intervals.
distributions of the unrelated–related difference waves shown in Figure 3B (see also Lien et al., 2008). The amplitude of the N400 evoked by T2 was larger in the no-switch condition than in the switch condition, F(1,19) = 5.62, p < .028, d = 1.09. N400
amplitude, overall, tended to decrease as lag was reduced, F(2,38) = 3.05, p < .06, d = 0.80. The interaction between switching condition and lag was significant, F(2,38) = 4.89, p < .014, d = 1.01, because there was no N400 within the 350–600-ms post-T2 interval at Lag 1 in the switch condition, producing a significant effect of lag in this condition, F(2,38) = 7.80, p < .001, d = 1.28, whereas N400 amplitude remained stable over lag in the no-switch condition, F(2,38) < 1, d = 0.39. All the remaining effects were nonsignificant (ps > .12 and ds < 0.64). Figure 3A shows a late negativity at Lag 1 in the switch condition, especially at the central electrode site. To assess whether this negativity reflects a delayed N400, we proceeded in two steps. First, we calculated the mean amplitude in the 550–800-ms post-T2 time window averaged across the six relevant electrode sites (see Figure 4A) and contrasted this amplitude against zero to determine whether there was a reliable N400 within this time interval. The mean amplitude of this N400 compound was significantly lower than zero, M = −0.76 µV, SE = 0.31, t(19) = 2.50, p < .022, d = 1.15, suggesting that a reliable N400 occurred at Lag 1 in the switch condition within the 550–800-ms post-T2 interval. This is supported by the scalp topography of the unrelated–related difference waveforms over the 550–800-ms time window presented in Figure 4B, which shows a mean voltage distribution over the scalp similar to what was observed over the 550–800-ms interval in the other conditions. Then, to provide further evidence that the N400 observed at Lag 1 in the switch condition was indeed delayed compared to the no-switch condition, we contrasted the onset latency of the N400 compound (i.e., the unrelated–related difference waveform pooled across the six relevant electrodes) found in the no-switch condition (in the 350–600-ms interval) to that observed in the switch condition (over the 550–800-ms time window). 
An additional 5-Hz lowpass filter was applied to the pooled waveforms (see Figure 4A), and the time at which the N400 compound reached 75% of its maximum amplitude (i.e., −0.74 µV in the switch condition and −2.19 µV in the no-switch condition), starting at 300 ms after T2 onset, was measured using the jackknife method (Kiesel et al., 2008; Ulrich & Miller, 2001). This analysis revealed a difference of 154.5 ms in the N400 onset latency between the switch condition (M = 585.5 ms, SE = 2.7) and no-switch condition (M = 431.0 ms, SE = 1.4). This difference was significant, t_adjusted(19) = 2.34, p < .03, d = 1.07, providing further evidence that the N400 was delayed at Lag 1 in the switch condition.
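The jackknife latency comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names, the `criterion_uv` parameter, and the choice of a fixed absolute criterion applied to every leave-one-out average are assumptions made for illustration. No interpolation between samples is done, so latencies are resolved only to the sampling step.

```python
import math
import statistics


def grand_average(waves):
    """Average a list of equal-length waveforms sample by sample."""
    n = len(waves)
    return [sum(w[k] for w in waves) / n for k in range(len(waves[0]))]


def onset_latency(wave, times, criterion_uv, start_ms=300):
    """First time point at or after start_ms where a negative-going wave
    reaches criterion_uv (e.g., 75% of the peak amplitude).
    Returns None if the criterion is never reached."""
    for value, t in zip(wave, times):
        if t >= start_ms and value <= criterion_uv:
            return t
    return None


def jackknife_latencies(waves, times, criterion_uv, start_ms=300):
    """One latency estimate per leave-one-out grand average (n estimates,
    each computed from n - 1 participants)."""
    estimates = []
    for leave_out in range(len(waves)):
        subset = waves[:leave_out] + waves[leave_out + 1:]
        estimates.append(
            onset_latency(grand_average(subset), times, criterion_uv, start_ms)
        )
    return estimates


def adjusted_paired_t(lat_a, lat_b):
    """Conventional paired t on jackknife estimates, divided by (n - 1)
    as in the adjustment attributed to Ulrich & Miller (2001)."""
    n = len(lat_a)
    diffs = [a - b for a, b in zip(lat_a, lat_b)]
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t / (n - 1)
```

In use, `waves` would hold one pooled difference waveform per participant for a given condition, and `adjusted_paired_t` would be called on the two lists of jackknife estimates (switch and no-switch). The division by n − 1 compensates for the artificially small variance among leave-one-out averages.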
Figure 3. Second target (T2) electrophysiological results. A: Grand average ERP unrelated–related difference waveforms at central (Cz) and parietal (Pz) electrode sites as a function of switching condition (switch vs. no switch) and lag (1, 3, and 7). B: Scalp topography of the unrelated–related difference waveforms over the 350–600-ms post-T2 time window as a function of switching condition and lag.
Figure 4. Second target (T2) electrophysiological results. A: Grand average ERP unrelated–related difference waveforms averaged across all relevant electrode sites (i.e., C3, Cz, C4, P3, Pz, and P4) at Lag 1 in the switch and no-switch conditions. B: Scalp topography of the unrelated–related difference waveforms over the 550–800-ms post-T2 time window at Lag 1 in the switch condition.
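The N400 quantification used throughout the T2 analyses (unrelated minus related difference wave, mean amplitude in a fixed post-target window, test against zero) can be sketched as follows. This is an illustrative sketch only. The function names are hypothetical, the waveforms are assumed to be already baseline-corrected and pooled over the six electrode sites, and the window bounds are treated as inclusive.

```python
import math
import statistics


def difference_wave(unrelated, related):
    """Unrelated-minus-related difference wave, sample by sample."""
    return [u - r for u, r in zip(unrelated, related)]


def mean_amplitude(wave, times, start_ms, end_ms):
    """Mean voltage of the wave over the inclusive window [start_ms, end_ms]."""
    values = [v for v, t in zip(wave, times) if start_ms <= t <= end_ms]
    return sum(values) / len(values)


def one_sample_t(values, mu=0.0):
    """One-sample t statistic of per-participant amplitudes against mu."""
    n = len(values)
    return (statistics.mean(values) - mu) / (statistics.stdev(values) / math.sqrt(n))
```

For each participant, `mean_amplitude(difference_wave(...), times, 350, 600)` would give the standard-window N400 measure, and the same call with 550 and 800 would give the late-window measure that was contrasted against zero at Lag 1 in the switch condition.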
T1 Results

Behavioral data. Figure 5A presents the mean percentage of T1 errors as a function of switching condition and lag. A repeated-measures ANOVA carried out on these data revealed no significant main effect of switching condition, F(1,19) < 1, d = 0.19, indicating that the two different T1 tasks were equivalent in terms of difficulty level. The main effect of lag did not reach significance, F(2,38) = 1.84, p > .172, d = 0.62, but the two-way interaction was significant, F(2,38) = 38.94, p < .001, d = 2.86. This interaction arose because T1 performance increased with lag in the no-switch condition while decreasing as lag increased in the switch condition. In typical AB conditions (that is, with no task-set switching), T1 performance is often slightly lower at shorter than at longer lags. Such a pattern of results has been attributed to some competition for attention between two targets occurring in very close succession, where T2 could be processed before T1 on some trials (e.g., Potter, Staub, & O’Connor, 2002). Such a competition is less likely to occur in the presence of switching, as each target requires a distinct task set. This would explain why T1 errors were less frequent at Lag 1 in the switch than in the no-switch condition. Although it has been previously reported (e.g., Vachon et al., 2007), the decrease in T1 performance as lag increased in the presence of switching remains rather puzzling. Given that the longer the lag separating the targets, the earlier in the RSVP T1 occurred (the position of T2 in the RSVP was fixed), this pattern of results could reflect a difference in preparation time between shorter and longer lags: the more time to prepare, the better the performance. 
It is possible that the digit identification task performed on T1 in the switch condition was particularly sensitive to preparation time, as participants needed to first detect a change of color and then move attention to the surrounding digits.
ERP data. The ERP results for T1 are summarized in Figures 5B and 6. The mean T1-evoked N400 amplitude is plotted as a function of switching condition and lag in Figure 5B. The difference waveforms for midline central and parietal electrodes plotted in Figure 6A and the scalp distributions of the unrelated–related difference waves shown in Figure 6B as a function of switching condition and lag suggest that T1 elicited no N400 in the switch condition, in which no semantic judgment was made on T1. This was confirmed by a four-way repeated-measures ANOVA performed on the mean amplitude of the N400 evoked by T1 with switching condition, lag, anterior–posterior electrode position, and left–right electrode position as within-subject factors. The main effect of switching condition was significant, F(1,19) = 36.10, p < .001, d = 2.76, which indicates that the N400 evoked by T1 was larger in the no-switch than in the switch condition. In fact, there was virtually no N400 elicited by T1 in the switch condition, because the mean voltage amplitude within the N400 temporal window did not significantly differ from zero at any lag in that condition (ts < 1.1, ds < 0.51). Except maybe for the interaction between anterior–posterior electrode position, switching condition, and lag that almost reached significance, F(2,38) = 3.16, p < .073, d = 0.82, none of the remaining effects were significant, Fs < 1.87, ds < 0.63. Careful scrutiny of Figure 6A points to a possible attenuated and delayed N400 at Lag 1 for the central electrode site. To determine whether this negativity indeed reflects an N400, we calculated the mean amplitude in the 450–650-ms post-T1 time window averaged across the six relevant electrode sites and contrasted this amplitude against zero. The one-sample t test was not significant, t(19) < 1, d = 0.35, suggesting that the negative deflection observed at Lag 1 was not a reliable N400 elicited by T1 in this condition.
Figure 5. First target (T1) behavioral and electrophysiological results. A: Mean percentage of T1 errors as a function of switching condition (switch vs. no switch) and lag (1, 3, and 7). Note that error rates were plotted instead of typical accuracy rates to highlight the relationship between the patterns of behavioral and electrophysiological results. B: Mean amplitude (in microvolts) of the N400 elicited by T1 as a function of switching condition and lag. Mean amplitude measurements were based on the mean voltage of the unrelated–related difference wave over the 350–600-ms post-T1 time window averaged across all relevant electrode sites (i.e., C3, Cz, C4, P3, Pz, and P4). In both panels, error bars represent 95% within-subject confidence intervals.
Figure 6. First target (T1) electrophysiological results. A: Grand average ERP unrelated–related difference waveforms at central (Cz) and parietal (Pz) electrode sites as a function of switching condition (switch vs. no switch) and lag (1, 3, and 7). B: Scalp topography of the unrelated–related difference waveforms over the 350–600-ms post-T1 time window as a function of switching condition and lag.
Discussion

We assessed to what extent semantic processing of words occurs during the reconfiguration of task set by examining brain activity under multitasking in the presence or the absence of task-set switching. We relied on the fact that a robust N400 can be found for target words presented during the AB (Luck et al., 1996; Rolke et al., 2001; Vogel et al., 1998), an indication that word meaning can be extracted even though conscious report of the word is impaired. Consistent with such findings was the absence of modulation of the N400 and T2 performance by T1–T2 lag in the no-switch condition. So, the present study provides further evidence that semantic processing is not impaired during the AB. Not only was the N400 amplitude preserved during the AB (e.g., Vogel et al., 1998) but the latency of the N400 was also unaffected by lag (the present study). The introduction of a task-set switch between T1 and T2, however, significantly impaired the semantic processing of T2. Indeed, the N400 elicited by a T2 (that was not masked) was delayed and reduced in amplitude whereas T2 errors were substantially increased at Lag 1 in the switch condition, suggesting that word processing could not be carried out in parallel with task-set reconfiguration. The presence of N400 implies that meaning was accessed and that processes that trigger the N400 (either integration with context or lexical access from long-term memory) took place. The absence of N400 is more difficult to interpret. One could imagine that meaning access could take place but that N400 processes failed to occur. Consequently, the most conservative interpretation of the present results is that at least N400 processes were affected by our experimental manipulations but that meaning access may nonetheless have occurred without any interference (i.e., attenuation or delay) whatsoever. In the present scenario, however, we observed an N400 that was delayed and attenuated. 
This shows that either semantic processes, N400 processes, or both were delayed. We argue that the best interpretation of the results is that both semantic and N400 processes were delayed (and hence initially prevented from occurring) because, under similar experimental conditions, Vachon and colleagues (2007) showed that semantic priming was also reduced (consistent with delayed or absent semantic access). Although it is possible, in the Vachon and colleagues study, that meaning may have been accessed without interference but only later mechanisms contributing to priming were selectively affected, another interpretation that is based on a common mechanism is that it was the initial semantic activation that was affected and that downstream processes (N400 in one case, priming in the other) reflected this earlier influence. Given that the mechanisms leading to priming in the Vachon and colleagues study were likely relatively low-level, spreading-activation type effects, we prefer an interpretation of both sets of results in terms of effects at the earlier stage of semantic activation. If we consider the compound semantic-activation/N400 as normal ‘‘semantic processing,’’ then task-set switching can certainly be said to influence semantic processing (even if, at the moment, we cannot be sure that it is entirely due to an effect at the stage of initial semantic access). It has been proposed that the semantic analysis of a briefly presented word displayed during task-set reconfiguration is possible but has to wait for the end of the reconfiguration process (Vachon et al., 2007; see also Oriet & Jolicœur, 2003). According to this account, semantic information could be extracted once reconfiguration completed, but only if an adequate perceptual representation of T2 can be formed without being corrupted by
the subsequent presentation of a masking distractor. Given that no distractors followed T2 in the RSVPs employed in the present study, the delayed-processing account predicted that the T2 word would have elicited a delayed N400 when presented at Lag 1 in the presence of a task-set switch. This prediction was confirmed by the present results, suggesting that semantic processing is possible in the context of task-set switching, but only after the completion of the reconfiguration process. Our findings are thus inconsistent with the idea that the activation of a semantic representation never occurs for words presented during task-set reconfiguration (Enns et al., 2001), which predicted the absence of N400 at Lag 1 in the switch condition. The fact that the N400 triggered by an unmasked target word is not abolished but rather delayed (though attenuated) by task-set switching suggests that the present paradigm could be a powerful tool for future research on task-set switching, as the latency of the N400 could be used as an index of the duration of the reconfiguration process. The nature of the task switch was substantial in the present study: Participants had to process some low-level characteristics of the T1 stimulus and then perform a semantic task on the T2 word. One could argue that different findings (e.g., no effects on the N400) could possibly occur if the switch was entirely within the semantic domain. This is rather unlikely, however, given that we recently found, in the context of the psychological refractory period (PRP) paradigm, the same N400 findings as in the present study even when the task on T1 was very semantic in nature (Vachon & Jolicœur, 2009). One particularity of the present study is that the switch manipulation was blocked. 
The use of a blocked procedure may have encouraged preparation for the first task to the detriment of the second task (e.g., De Jong, 1995), given that, in switch blocks, participants had to perform the switch on every trial. The adoption of a general, preparation strategy could have increased apparent effects of task-set switching, given that a less strongly prepared task set for T1 might occur if switching varied from trial to trial. The fact that the N400 was sharply reduced at all lags for the T1 word in the switch condition could be taken as an indication of the effectiveness of such a strategy. Nevertheless, the present findings are more likely to reflect the dynamic nature of the switch process, because Vachon and colleagues (2007, Experiment 2) found the same interference with semantic priming in RSVP in the presence of a task-set switch whether or not the switch manipulation was blocked or varied from trial to trial. The present study constitutes a first step toward dissociating task-set switching and AB costs at the neural level. By establishing that the N400 is altered by task-set switching but not the AB, the present study provides further evidence for a functional distinction between these two dual-task phenomena (e.g., Allport & Hsieh, 2001; Juola et al., 2004; Vachon et al., 2007; Visser et al., 1999). Previous electrophysiological studies have shown that the P3 component of the ERP is delayed for unmasked stimuli occurring during the AB (e.g., Ptito et al., 2008; Vogel & Luck, 2002), placing the locus of the phenomenon at a postperceptual level, more precisely, when consolidating information into short-term memory (e.g., Jolicœur & Dell’Acqua, 1998; Jolicœur, Dell’Acqua, & Crebolder, 2000). The demonstration that the N400 is delayed during task-set reconfiguration but not during the AB indicates that task-set switch costs occur at an earlier stage of processing than AB costs. 
Such findings are problematic for the claims that the AB and task-set switching are functionally equivalent (e.g.,
Kawahara et al., 2001, 2003) or that the AB is the consequence of a task-set reconfiguration process (e.g., Di Lollo, Kawahara, Ghorashi, & Enns, 2005; Kawahara, Enns, & Di Lollo, 2006; Kawahara, Kumada, & Di Lollo, 2006). According to Kawahara and colleagues (2001, 2003), the contribution of task-set reconfiguration to second-target deficits is the same as that of processing T1: They are both additive in delaying the processing of T2, then promoting the decay of T2 representation below the legibility threshold. If reconfiguring the task set is equivalent to processing T1, then T2 should be processed at a semantic level during the reconfiguration, as it is during the AB (e.g., Shapiro, Driver, et al., 1997; Vachon et al., 2007; Vogel et al., 1998; the present study). Hence, this account of dual-task costs fails to explain why the N400 observed in the absence of task-set switching was diminished and delayed in the presence of switching. More recently, a new theory has proposed to explain the AB phenomenon in terms of the reconfiguration of the attentional (or task) set. According to the temporary loss of control model (e.g., Di Lollo et al., 2005; Kawahara, Enns, & Di Lollo, 2006; Kawahara, Kumada, & Di Lollo, 2006), the cognitive system is initially configured to perform the first task optimally. This task set is maintained by a central processor upon the arrival of T1. When T1 is detected, the central processor becomes engaged in consolidating the target and cannot further maintain the original configuration. While the central processor is occupied, it loses control over the task set and a reconfiguration can be exogenously triggered if the stimulus immediately following T1 is a distractor. Such a reconfiguration prevents T2 from being processed efficiently, rendering the target vulnerable to backward masking and leading to a second-target deficit. 
The present findings are challenging for the temporary loss of control model, which ascribes AB deficits to the reconfiguration of the system, as is the case for task-set switching costs (e.g., Monsell, 2003; Oriet & Jolicœur, 2003; Rogers & Monsell, 1995). If the AB is indeed a consequence of task-set reconfiguration, the N400 should have been affected at Lag 3 in both the no-switch and switch conditions (i.e., during the AB period) as it was at Lag 1 in the switch condition. One could argue that the exogenous reconfiguration responsible for the AB is functionally different from the endogenous reconfiguration triggered by a voluntary task-set switch. Although plausible, this hypothesis raises another problem: The model states that ‘‘the central executive cannot continue to maintain an optimal input configuration while at the same time oversee the processing of a target’’ (Kawahara, Enns, & Di Lollo, 2006, p. 406). This limit of the central processor implies that it cannot process T1 and, at the same time, voluntarily reconfigure the task set to process T2 optimally. If the endogenous task-set reconfiguration must await the end of T1 consolidation, which occurs around 500 ms after the presentation of T1 (see, e.g., Shapiro, Arnell, & Raymond, 1997), then the impact on the N400 observed in the switch condition should not have been limited to Lag 1 but should have extended to much longer lags, which was not the case. The present findings provide clear evidence that the semantic processing of common words is constrained by task demands. Indeed, the quality of the semantic analysis of the T2 word depended on whether participants had to perform the same task twice in close succession or two different tasks (see also Vachon et al., 2007). Other factors such as the perceptual demands of the task can also modulate the extent to which a word is processed at the semantic level (e.g., Giesbrecht, Sy, & Elliott, 2007; Giesbrecht, Sy, & Lewis, 2009; Vogel, Woodman, & Luck, 2005). For instance, Giesbrecht and colleagues (2007) showed that the N400 survives the AB under conditions of low perceptual load for the T1 task but not under high T1-load conditions (see also Giesbrecht et al., 2009, for similar behavioral findings). The RSVPs used by Giesbrecht and colleagues involved an important task-set switch (participants had to determine the direction of the central arrow within T1 before performing a semantic judgment on a T2 word), which could have contributed to the impairment of semantic processing observed. It is, however, difficult to determine in that case to what extent task-set switching could have interacted with perceptual load in hindering T2 meaning access, given that T2 was never presented at Lag 1, where switching effects are most salient. Given that more resources are required to process task-relevant information under high perceptual load (e.g., Lavie, 2005), one could speculate that the reconfiguration process is delayed or slowed down by the high demands imposed on the processing of T1 under high load, extending its impact on the N400 to longer lags. Yet, the fact that Giesbrecht and colleagues (2007) observed the suppression of the N400 at Lag 3 whereas we did not may be accounted for, at least in part, by methodological factors. In their experiments stimuli were presented for shorter times (80 ms for T1 and 67 ms for T2) prior to backward masking, and this may have produced less stable representations. Most importantly, however, T2 was followed by a mask in their study, whereas it was not masked in ours. The masked T2s in Giesbrecht and colleagues’ study may have extended the temporal lag at which task-set switching could still produce a measurable effect on the N400. 
Nevertheless, the demonstrations that the N400 is not always immune to multitasking situations support the view that the selectivity of attention over time can flexibly operate at different stages of processing depending on concurrent task demands and behavioral goals (e.g., Kastner & Pinsk, 2004; Vogel et al., 2005; Yi, Woodman, Widders, Marois, & Chun, 2004).

Implications for the Automaticity of Word Processing

In line with the present study, some electrophysiological studies on word processing have reported delayed and attenuated N400s using another dual-task paradigm, namely, the PRP paradigm, in which participants are required to perform two speeded tasks on each trial (Hohlfeld, Sangals, & Sommer, 2004; Lien et al., 2008). Because the delay/reduction of the N400 elicited by the T2 word was only observed when the two target stimuli were separated by a short stimulus onset asynchrony (e.g., 100 ms), Lien and colleagues concluded that semantic activation is prevented while central attention is devoted to T1. Such a conclusion contrasts with the claim that the AB phenomenon, which does not alter semantic processing, reflects the same central processing bottleneck as the PRP effect (e.g., Jolicœur et al., 2000; Ruthruff & Pashler, 2001). The present findings can be taken to reconcile this apparent contradiction. Given that the PRP procedure employed by Lien and colleagues and Hohlfeld and colleagues involved a shift of sensory modality and task/response requirements from T1 to T2, it is possible, and perhaps likely, that the N400 effects they obtained were not mainly mediated by the decision-making and response-selection operations carried out on T1 but instead by the transient task-set reconfiguration process required by their procedure. This issue is currently under investigation in our laboratory (see Vachon & Jolicœur, 2009). Regardless of the source of the interference suffered by the neural mechanisms responsible for the elicitation of the N400, the present AB study (see also Giesbrecht et al., 2007) nonetheless supports the main conclusion of these PRP studies, namely, that the stages of word processing that lead up to semantic activation are not automatic. We can draw a similar conclusion based on the present T1 ERP data. When the task involved the deliberate processing of the semantic aspects of the T1 word (no-switch condition), a reliable T1-evoked N400 was observed at all lags. However, when the T1 word was irrelevant to the task, so that the judgment involved a comparison of the extremities of the T1 stimulus, no N400 was elicited by T1, suggesting that the presentation of a word at fixation is not sufficient for semantic activation. Such results are consistent with previous studies showing that the N400 is not
sensitive to lexical/semantic information under shallow processing conditions (e.g., Chwilla et al., 1995; Marí-Beffa et al., 2005). Taken together, electrophysiological results from both T1 and T2 suggest that the presentation at fixation of a common word does not guarantee ballistic semantic processing, providing further support to the view that visual word processing is not automatic (e.g., Hohlfeld & Sommer, 2005; Holcomb, 1988; Lien et al., 2008; McCann, Remington, & Van Selst, 2000; Stolz & Besner, 1996, 1999). Indeed, semantic processing, as indexed by the N400, seems severely limited when a word appears transiently on the retina while the cognitive system is occupied in a nonsemantic activity, such as reconfiguring the task set or processing low-level characteristics of the stimulus.
REFERENCES

Allport, A., & Hsieh, S. (2001). Task-switching: Using RSVP methods to study an experimenter-cued shift of set. In K. L. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 36–64). New York: Oxford University Press.
Bentin, S., McCarthy, G., & Wood, C. C. (1985). Event-related potentials, lexical decision and semantic priming. Clinical Neurophysiology, 60, 343–355.
Brown, C., & Hagoort, P. (1993). The processing nature of the N400: Evidence from masked priming. Journal of Cognitive Neuroscience, 5, 34–44.
Chwilla, D. J., Brown, C. M., & Hagoort, P. (1995). The N400 as a function of the level of processing. Psychophysiology, 32, 274–285.
De Jong, R. (1995). The role of preparation in overlapping-task performance. Quarterly Journal of Experimental Psychology, 48A, 2–25.
Dell’Acqua, R., Pascali, A., Jolicœur, P., & Sessa, P. (2003). Four-dot masking produces the attentional blink. Vision Research, 43, 1907–1913.
Di Lollo, V., Kawahara, J., Ghorashi, S. M. S., & Enns, J. T. (2005). The attentional blink: Resource depletion or temporary loss of control? Psychological Research, 69, 191–200.
Dux, P. E., & Marois, R. (2009). How humans search for targets through time: A review of data and theory from the attentional blink. Attention, Perception, & Psychophysics, 71, 1683–1700.
Enns, J. T., Visser, T. A. W., Kawahara, J., & Di Lollo, V. (2001). Visual masking and task switching in the attentional blink. In K. L. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 65–81). New York: Oxford University Press.
Federmeier, K. D. (2007). Thinking ahead: The role and roots of prediction in language comprehension. Psychophysiology, 44, 491–505.
Friederici, A. D. (2004). Event-related brain potential studies in language. Current Neurology and Neuroscience Reports, 4, 466–470.
Giesbrecht, B., & Di Lollo, V. (1998). Beyond the attentional blink: Visual masking by object substitution. Journal of Experimental Psychology: Human Perception and Performance, 24, 1454–1466.
Giesbrecht, B., Sy, J. L., & Elliott, J. E. (2007). Electrophysiological evidence for both perceptual and post-perceptual selection during the attentional blink. Journal of Cognitive Neuroscience, 19, 2005–2018.
Giesbrecht, B., Sy, J. L., & Lewis, M. K. (2009). Personal names do not always survive the attentional blink: Behavioral evidence for a flexible locus of selection. Vision Research, 49, 1378–1388.
Halgren, E. (1990). Insights from evoked potentials into the neuropsychological mechanisms of reading. In B. Arnold & A. F. W. Scheibel (Eds.), Neurobiology of higher cognitive function (pp. 103–150). New York: Guilford Press.
Heil, M., Rolke, B., & Pecchinenda, A. (2004). Automatic semantic activation is no myth: Semantic context effects on the N400 in the letter-search task in the absence of response time effects. Psychological Science, 15, 852–857.
Hohlfeld, A., Sangals, J., & Sommer, W. (2004). Effects of additional tasks on language perception: An event-related brain potential investigation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1012–1025.
Hohlfeld, A., & Sommer, W. (2005). Semantic processing of unattended meaning is modulated by additional task load: Evidence from electrophysiology. Cognitive Brain Research, 24, 500–512.
Holcomb, P. J. (1988). Automatic and attentional processing: An event-related brain potential analysis of semantic priming. Brain and Language, 35, 66–85.
Holcomb, P. J. (1993). Semantic priming and stimulus degradation: Implications for the role of the N400 in language processing. Psychophysiology, 30, 47–61.
Jolicœur, P. (1999). Dual-task interference and visual encoding. Journal of Experimental Psychology: Human Perception and Performance, 25, 596–616.
Jolicœur, P., & Dell’Acqua, R. (1998). The demonstration of short-term consolidation. Cognitive Psychology, 36, 138–202.
Jolicœur, P., Dell’Acqua, R., & Crebolder, J. M. (2000). Multitasking performance deficits: Forging some links between the attentional blink and the psychological refractory period. In S. Monsell & J. Driver (Eds.), Attention and performance XVIII: Control of cognitive processes (pp. 309–330). Cambridge, MA: MIT Press.
Juola, J. F., Botella, J., & Palacios, A. (2004). Task- and location-switching effects on visual attention. Perception & Psychophysics, 66, 1303–1317.
Kastner, S., & Pinsk, M. A. (2004). Visual attention as a multilevel selection process. Cognitive, Affective, & Behavioral Neuroscience, 4, 483–500.
Kawahara, J., Di Lollo, V., & Enns, J. T. (2001). Attentional requirements in visual detection and identification: Evidence from the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 27, 969–984.
Kawahara, J., Enns, J. T., & Di Lollo, V. (2006). The attentional blink is not a unitary phenomenon. Psychological Research, 70, 405–413.
Kawahara, J., Kumada, T., & Di Lollo, V. (2006). The attentional blink is governed by a temporary loss of control. Psychonomic Bulletin & Review, 13, 886–890.
Kawahara, J., Zuvic, S. M., Enns, J. T., & Di Lollo, V. (2003). Task switching mediates the attentional blink even without backward masking. Perception & Psychophysics, 65, 339–351.
Kieffaber, P. D., & Hetrick, W. P. (2005).
Event-related potential correlates of task switching and switch costs. Psychophysiology, 42, 56–71. Kiesel, A., Miller, J., Jolicœur, P., & Brisson, B. (2008). Measurement of ERP latency differences: A comparison of single-participant and jackknife-based scoring methods. Psychophysiology, 45, 250–274. Ku¨per, K., & Heil, M. (2009). Electrophysiology reveals semantic priming at a short SOA irrespective of depth of prime processing. Neuroscience Letters, 453, 107–111. Kutas, M. (1997). Views on how the electrical activity that the brain generates reflects the functions of different language structures. Psychophysiology, 34, 383–398. Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Science, 4, 463–470. Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205. Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9, 920–933. Lavie, N. (2005). Distracted and confused? Selective attention under load. Trends in Cognitive Sciences, 9, 75–82.
Impaired semantic processing during task-set switching Lien, M.-C., Ruthruff, E., Cornett, L., Goodin, Z., & Allen, P. A. (2008). On the nonautomaticity of the visual word processing: Electrophysiological evidence that word processing requires central attention. Journal of Experimental Psychology: Human Perception and Performance, 34, 751–773. Luck, S. J., Vogel, E. K., & Shapiro, K. L. (1996). Word meanings can be accessed but not reported during the attentional blink. Nature, 382, 616–618. Maki, W. S., Frigen, K., & Paulson, K. (1997). Associative priming by targets and distractors during rapid serial visual presentation: Does word meaning survive the attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 23, 1014–1034. Marı´ -Beffa, P., Valde´s, B., Cullen, D. J., Catena, A., & Houghton, G. (2005). ERP analyses of task effects on semantic processing from words. Cognitive Brain Research, 23, 293–305. Martens, S., Wolters, G., & van Raamsdonk, M. (2002). Blinks of the mind: Memory effects of attentional processes. Journal of Experimental Psychology: Human Perception and Performance, 28, 1275–1287. McCann, R. S., Remington, R. W., & Van Selst, M. (2000). A dual-task investigation of automaticity in visual word processing. Journal of Experimental Psychology: Human Perception and Performance, 26, 1352–1370. Miller, J., Ulrich, R., & Schwarz, W. (2009). Why jackknife yields good latency estimates. Psychophysiology, 46, 300–312. Monsell, S. (2003). Task switching. Trends in Cognitive Science, 7, 134– 140. Oriet, C., & Jolicœur, P. (2003). Absence of perceptual processing during reconfiguration of task set. Journal of Experimental Psychology: Human Perception and Performance, 29, 1036–1049. Osterhout, L., & Holcomb, P. J. (1992). Event-related brain potentials elicited by syntactic anomaly. Journal of Memory and Language, 31, 785–806. Potter, M. C., Chun, M. M., Banks, B. S., & Muckenhoupt, M. (1998). 
Two attentional deficits in serial target search: The visual attentional blink and an amodal task-switch deficit. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 979–992. Potter, M. C., Staub, A., & O’Connor, D. H. (2002). The time course of competition for attention: Attention is initially labile. Journal of Experimental Psychology: Human Perception and Performance, 28, 1149–1162. Ptito, A., Arnell, K., Jolicœur, P., & MacLeod, J. (2008). Intramodal and crossmodal processing delays in the attentional blink paradigm revealed by event-related potentials. Psychophysiology, 45, 794–803. Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception & Performance, 18, 849–860. Rogers, R. D., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231. Rolke, B., Heil, M., Streb, J., & Hennighausen, E. (2001). Missed prime words within the attentional blink evoke an N400 semantic priming effect. Psychophysiology, 38, 165–174. Rubinstein, J. S., Meyer, D. E., & Evans, J. E. (2001). Executive control of cognitive processes in task switching. Journal of Experimental Psychology: Human Perception and Performance, 27, 763–797. Rugg, M. D. (1985). The effects of semantic priming and word repetition on event-related potentials. Psychophysiology, 22, 642–647. Rushworth, M. F. S., Passingham, R. E., & Nobre, A. C. (2002). Components of switching intentional set. Journal of Cognitive Neuroscience, 14, 1139–1150.
111 Ruthruff, E., & Pashler, H. E. (2001). Perceptual and central interference in dual-task performance. In K. L. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 100–123). Oxford: Oxford University Press. Sakai, K. (2008). Task set and prefrontal cortex. Annual Review of Neuroscience, 31, 219–245. Schneider, D. W., & Logan, G. D. (2007). Defining task-set reconfiguration: The case of reference point switching. Psychonomic Bulletin & Review, 14, 118–125. Shapiro, K. L., Arnell, K. M., & Raymond, J. E. (1997). The attentional blink. Trends in Cognitive Sciences, 1, 291–296. Shapiro, K. L., Driver, J., Ward, R., & Sorensen, R. E. (1997). Priming from the attentional blink: A failure to extract visual tokens but not visual types. Psychological Science, 8, 95–100. Stolz, J. A., & Besner, D. (1996). Role of set in visual word recognition: Activation and activation blocking as nonautomatic processes. Journal of Experimental Psychology: Human Perception and Performance, 22, 1166–1177. Stolz, J. A., & Besner, D. (1999). On the myth of automatic semantic activation in reading. Current Directions in Psychological Science, 8, 61–65. Travers, S., & West, R. (2008). Neural correlates of cue retrieval, task set reconfiguration, and rule mapping in the explicit cue task switching paradigm. Psychophysiology, 45, 588–601. Ulrich, R., & Miller, J. (2001). Using the jackknife-based scoring method for measuring LRP onset effects in factorial designs. Psychophysiology, 38, 816–827. Vachon, F., & Jolicœur, P. (2009, November). Electrophysiological evidence for impaired or postponed semantic processing during multitasking and task-set switching. Paper presented at the 50th annual meeting of the Psychonomic Society, Boston, MA. Vachon, F., & Tremblay, S. (2008). Modality-specific and amodal sources of interference in the attentional blink. Perception & Psychophysics, 70, 1000–1015. Vachon, F., Tremblay, S., & Jones, D. M. (2007). 
Task-set reconfiguration suspends perceptual processing: Evidence from semantic priming during the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 33, 330–347. Van Overschelde, J. P., Rawson, K. A., & Dunlosky, J. (2004). Category norms: An updated and expanded version of the Battig and Montague (1969) norms. Journal of Memory and Language, 50, 289–335. Visser, T. A. W., Bischof, W. F., & Di Lollo, V. (1999). Attentional switching in spatial and non-spatial domains: Evidence from the attentional blink. Psychological Bulletin, 125, 458–469. Vogel, E. K., & Luck, S. J. (2002). Delayed working memory consolidation during the attentional blink. Psychonomic Bulletin & Review, 9, 739–743. Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 24, 1656–1674. Vogel, E. K., Woodman, G. F., & Luck, S. J. (2005). Pushing around the locus of selection: Evidence for the flexible-selection hypothesis. Journal of Cognitive Neuroscience, 17, 1907–1922. Yi, D.-J., Woodman, G. F., Widdlers, D., Marois, R., & Chun, M. M. (2004). Neural fate of ignored stimuli: Dissociable effects of perceptual and working memory load. Nature Neuroscience, 7, 992–996.
(Received July 1, 2009; Accepted January 18, 2010)
Psychophysiology, 48 (2011), 112–116. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01041.x
BRIEF REPORT
Aversive picture processing: Effects of a concurrent task on sustained defensive system engagement
BETHANY C. WANGELIN, ANDREAS LÖW, LISA M. MCTEAGUE, MARGARET M. BRADLEY, and PETER J. LANG
NIMH Center for the Study of Emotion and Attention, University of Florida, Gainesville, Florida, USA
Abstract
Viewing a series of aversive pictures prompts emotional reactivity reflecting sustained defensive engagement. The present study examined the effects of a concurrent visual task on autonomic, somatic, electrocortical, and facial components of this defensive state. Results indicated that emotional activation was largely preserved despite continuous visual distraction, although evidence of attenuation was observed in startle reflex and electrocortical measures. Concurrent task-specific reactivity was also apparent, suggesting that motivational circuits can be simultaneously activated by stimuli with intrinsic survival significance and instructed task significance and that these processes interact differently across the separate components of defensive engagement.
Descriptors: Emotion, EMG, EEG/ERP, Startle blink, Electrodermal, Heart rate
Andreas Löw is now at the University of Greifswald, Germany. Thanks to Robert Sivinski for his assistance in data collection. Research was supported in part by NIMH grant P50 MH072850. Address correspondence to: Bethany Wangelin, NIMH-CSEA, PO Box 112766, University of Florida, Gainesville, FL 32608, USA. E-mail: [email protected]

Aversive pictures capture attention, prompting a cascade of reflex responses mediated by limbic motivational circuits that enhance perceptual processing and mobilize for defensive action (e.g., Lang & Bradley, 2010). Research further suggests that viewing a prolonged series of aversive pictures prompts a sustained affective state in which defensive responses are maintained, and can increase, across a viewing period (Bradley, Cuthbert, & Lang, 1996; Smith, Bradley, & Lang, 2005). The current study examined effects of distraction on such a state, assessing differences in psychophysiological response modulation when a series of aversive pictures was viewed either with or without a concurrent perceptual task.

Ochsner and Gross (2005) have indicated that distraction can be a strategy for regulating unpleasant emotion that can involve both visual and internal cognitive processes. Studies employing rapidly presented aversive pictures (e.g., 300 ms) have found that distraction by a perceptual task can interfere with amygdala activation (Blair et al., 2007) or with early components of the visual evoked potential (Schupp et al., 2007), although Hajcak, Dunning, and Foti (2007) found that affective modulation of the late positive electrocortical response was unaffected by concurrent mental arithmetic. Neuroimaging studies using longer (4–8 s) picture presentations, however, found that both self-reported negative affect and amygdala activity are attenuated by concurrent cognitive tasks such as mental arithmetic (Van Dillen, Heslenfeld, & Koole, 2009) or working memory (McRae et al., 2009).

In the above studies, aversive images were presented intermixed with neutral and pleasant pictures. In general, these data suggest that distraction can interfere with reactions to single aversive pictures in a dynamically changing context. Considering that more persistent emotional reactions are central to anxiety and affective disorders, we assessed distraction here in the context of a prolonged aversive state, evoked by the viewing of a continuous series of unpleasant pictures. We judged that the most obvious competing task would be one in the same perceptual domain as the pictures, from which participants could not divert visual attention. Thus, participants were required to continuously process a series of rapidly changing numbers presented in the center of the visual field and superimposed on the picture. The control series were a series of neutral pictures and a series of highly pleasant/moderately arousing pictures that were intended to prompt a positive state and to be more engaging than the neutral series. Participants viewed all three series both with and without a concurrent task.

According to a motivational theory of emotion (e.g., Lang & Bradley, 2010; Lang & Davis, 2006), affects are prompted by the activation of limbic survival circuits that tune sensory systems, increase attention and perceptual processing, and mobilize for action. In addressing the effects of attenuation by distraction, the present research assessed a full array of emotion measures: mobilization of the sympathetic system in aversive affect is expected to increase skin conductance response frequency, heart rate deceleration indexes increased attention to threat, augmented motor action is shown by startle potentiation, facial expressivity (i.e., corrugator EMG) reflects social communication of threat, and an electrocortical measure (the late positive potential [LPP]) monitors the selective motivational significance of aversive cues (Bradley, 2009). The present research examines whether a distracting perceptual task attenuates aversive emotional engagement and, if so, whether this is a general process or one limited to specific defense reflexes.
Method

Participants
Fifty-one (25 male) University of Florida psychology students received course credit for participating in the study, which was approved by the University of Florida Institutional Review Board. Because of equipment or experimenter error, final Ns were 38 and 49 for corrugator EMG and LPP, respectively.

Materials and Design
Stimuli were 60 aversive, 60 neutral, and 60 positive pictures selected from the International Affective Picture System (IAPS; Lang, Bradley, & Cuthbert, 2008). Mean (SE) valence ratings for pictures in the aversive, neutral, and positive series were 2.44 (0.11), 4.97 (0.03), and 7.49 (0.06), respectively; mean (SE) arousal ratings were 6.26 (0.09), 3.21 (0.09), and 5.25 (0.12), respectively.

Across the study, nine different picture series were presented. Each series consisted of 20 pictures of the same hedonic content (aversive, neutral, positive), presented for 3 s with no interpicture interval. Three series were presented for free viewing as well as during two slightly different task conditions. All three series (aversive, neutral, positive) were viewed within a condition (i.e., free viewing or concurrent task) prior to a shift. The order of affective content (aversive, neutral, positive) within each distraction condition was counterbalanced for each participant.

In free viewing, a fixation cross continuously appeared at the center of the picture screen. In each of two different concurrent task conditions, the cross was replaced by a small circle at fixation (0.75° of visual angle) containing a number (1–9) that changed every 750 ms. Participants performed either a target detection task (button press to a target number) or a one-back task (button press to a target preceded by an odd number). Twenty-five participants completed free viewing followed by the two tasks (free viewing, one-back, detection); 26 participants performed the two tasks followed by free viewing (one-back, detection, free viewing).

Color pictures were displayed via an LCD projector on a screen 1.5 m from the participant. Acoustic startle stimuli (50 ms, 98 dB) were presented over headphones at three intervals (separated by approximately 20 s) during each picture series. VPM (Cook, 2001) controlled data acquisition.

For startle, 4-mm Ag/AgCl electrodes were placed over the left orbicularis oculi muscle (Fridlund & Cacioppo, 1986). Raw signals were sampled at 1000 Hz, amplified by 30,000, filtered (28–500 Hz), and integrated (20 ms time constant). Left corrugator EMG was recorded from 4-mm electrodes, amplified by 30,000, filtered (13–1000 Hz), and integrated (500 ms time constant). Skin conductance was recorded from 8-mm electrodes filled with 0.5 M NaCl paste on the hypothenar eminence of the left palm. Heart rate was recorded from 8-mm electrodes on each forearm. A Schmitt trigger detected R-waves; interbeat intervals were reduced to half-second bins. Event-related potentials were sampled at 250 Hz using a 129-sensor array, and bandpass filtered (0.1–100 Hz) with a vertex reference. Data were filtered off-line (BESA) at 30 Hz, and stimulus-locked epochs were extracted from 100 ms before to 900 ms after picture onset; ocular artifacts were removed, and an average reference was computed and baseline corrected (100 ms before picture onset).

Procedure
Participants sat comfortably in a sound-attenuated, dimly lit room and were instructed to look at each picture during free viewing and to perform the task as accurately as possible for the task conditions.
Data Reduction and Analysis
Blinks were scored using a peak-scoring algorithm (Cook, 2001); magnitudes were transformed to T scores for each participant (scores > 3 standard deviations were excluded). As in other studies measuring state effects (Bradley et al., 1996; Smith et al., 2005), the number of skin conductance responses (> 0.05 µS) during each series was calculated. The LPP was scored as the mean amplitude over 18 centro-parietal sites 400 to 800 ms after picture onset. Raw heart rate and corrugator activity were averaged for each picture series.

Analysis of task performance revealed that, although accuracy was slightly lower and reaction time slightly slower in the one-back compared to the detection task, emotional modulation did not differ as a function of task and the data were averaged into a single distraction condition (see Footnote 1). Accordingly, a 2 (task: free viewing, concurrent task) × 3 (content: aversive, neutral, positive) within-subjects design was employed, with Greenhouse–Geisser corrections used when relevant. A nonparametric test (Friedman's χ²) evaluated the effect of picture content on task accuracy because the score distribution violated normality.

Footnote 1. Two tasks were initially included to investigate effects of task difficulty on responding. Although reaction time was slightly slower during the one-back task (mean: 690 ms) compared to the detection task (mean: 556 ms), F(1,46) = 31.6, p < .001, and accuracy was slightly reduced (one-back mean: 0.85, detection mean: 0.95; Friedman's χ² = 13.9, p < .001), ANOVAs using task (detection or one-back) and content (aversive, neutral, or positive) found no significant interactions involving task and content for any affect measure, and therefore the data were averaged into a single distraction condition for simplicity of reporting.

Results

Skin Conductance
More skin conductance responses were elicited when a concurrent task was performed, compared to free viewing: task, F(1,50) = 32.48, p < .001, ηp² = .394, and this was found for the aversive as well as for the neutral and positive comparison series: Task × Content, F(2,100) = 1.02, p = .36 (see Table 1). Overall, more skin conductance responses were elicited during the aversive compared to the neutral (p = .002) or positive series (p = .001): content, F(2,100) = 9.19, p < .001, ηp² = .155.

Heart Rate
Heart rate increased during concurrent task performance compared to free viewing: task, F(1,50) = 22.8, p < .001, ηp² = .313, which was evident for the aversive as well as neutral and positive comparison series: Task × Content, F(2,100) < 1.0 (see Table 1). Overall, compared to the neutral picture series, heart rate was slower during aversive (p < .001) and positive (p < .001) viewing: content, F(2,100) = 26.5, p < .001, ηp² = .346.

Table 1. Mean (SE) for Each Dependent Measure as a Function of Picture Series Content, Separately for Free Viewing (No Task) and Task Contexts

| Measure | Context | Aversive | Neutral | Positive |
|---|---|---|---|---|
| Skin conductance (no. of responses) | Free viewing | 2.7 (0.5) | 1.6 (0.3) | 2.0 (0.4) |
| | Concurrent task | 4.3 (0.5) | 3.6 (0.4) | 3.5 (0.4) |
| Heart rate (beats per min) | Free viewing | 72.4 (1.4) | 74.5 (1.4) | 72.7 (1.3) |
| | Concurrent task | 74.1 (1.3) | 76.3 (1.4) | 74.9 (1.3) |
| Corrugator EMG (µV) | Free viewing | 10.2 (8.1) | 8.5 (5.6) | 8.3 (5.3) |
| | Concurrent task | 10.0 (7.0) | 9.5 (6.1) | 9.5 (6.2) |
| Startle magnitude (T score) | Free viewing | 53.5 (1.0) | 50.2 (1.0) | 50.3 (1.1) |
| | Concurrent task | 50.5 (0.5) | 49.2 (0.5) | 48.7 (0.6) |
| Late positive potential (µV) | Free viewing | 2.0 (0.2) | 0.7 (0.2) | 1.2 (0.3) |
| | Concurrent task | 0.9 (0.2) | −0.1 (0.1) | 0.8 (0.2) |
| Task accuracy (% correct) | Concurrent task | 91 (.02) | 92 (.01) | 87 (.02) |
| Task reaction time (ms) | Concurrent task | 632 (15) | 626 (18) | 626 (15) |

Note: Values reflect startle reflex magnitude (T score: mean within-subject = 50), skin conductance (number of skin conductance responses greater than 0.05 µS), average heart rate, average corrugator muscle activity, and amplitude of the late positive potential (mean over 18 centro-parietal sensors, 400–800 ms after picture onset). Reaction time and accuracy scores are presented for the concurrent task condition.

Corrugator EMG
Corrugator tension increased slightly during distraction compared to free viewing: task, F(1,37) = 4.0, p = .05, ηp² = .10. A marginally significant interaction suggested that effects of picture content on corrugator activity differed by task: Task × Content, F(2,74) = 3.0, p = .06, ηp² = .08; overall content, F(2,74) = 4.3, p = .04, ηp² = .10. As illustrated in Figure 1 (top), during free viewing, corrugator tension was enhanced for the aversive compared to neutral (p = .05) and positive (p = .03) series: simple effect of content, F(1,37) = 5.3, p = .03, ηp² = .13. During concurrent task performance, picture content no longer modulated corrugator activity: simple effect of content, F(1,37) = 2.5, p = .12. Separate tests for each picture content indicated greater activity during concurrent task performance compared to free viewing, for neutral (p = .03, one-tailed) and positive (p = .004) series versus commensurate increases for aversive picture series (p > .05).

Startle Magnitude
Across picture contents, differences in startle magnitude as a function of task did not reach significance: task, F(1,48) = 3.2, p = .08, ηp² = .06. Larger startle blinks were elicited by probes presented during aversive picture series versus neutral (p = .01) or positive (p = .008) series: content, F(2,96) = 6.4, p = .003, ηp² = .117 (see Table 1), and this overall pattern did not differ as a function of distraction: Task × Content, F(2,96) = 1.23, p = .30. On the other hand, specific a priori comparisons indicated that startle magnitude was significantly larger during aversive series presented during free viewing as compared to distraction, t(48) = 2.3, p = .025.
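The within-participant T-score transformation of blink magnitudes described under Data Reduction and Analysis can be illustrated with a minimal sketch. The function name `blink_t_scores`, the two-sided reading of the 3 SD exclusion rule, and the order of operations (exclude first, then rescale) are assumptions made for illustration; the text does not specify them, and this is not the VPM implementation.

```python
from statistics import mean, stdev


def blink_t_scores(magnitudes, sd_cutoff=3.0):
    """Convert one participant's blink magnitudes to T scores (mean 50, SD 10).

    Assumptions (not stated in the report): blinks more than `sd_cutoff`
    standard deviations from the participant's own mean, in either direction,
    are dropped first; the remaining blinks are then standardized.
    Requires at least two retained blinks that are not all identical.
    """
    m, s = mean(magnitudes), stdev(magnitudes)
    # Keep only blinks within the cutoff of this participant's mean.
    kept = [x for x in magnitudes if abs(x - m) <= sd_cutoff * s]
    m_kept, s_kept = mean(kept), stdev(kept)
    # T score = 50 + 10 * z, computed on the retained blinks.
    return [50.0 + 10.0 * (x - m_kept) / s_kept for x in kept]
```

Because each participant's retained blinks are rescaled to a mean of 50, condition means above 50 (such as the aversive free-viewing series in Table 1) indicate blinks larger than that participant's own average.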
Late Positive Potential
The amplitude of the LPP was smaller during distraction compared to free viewing: task, F(1,48) = 61.8, p < .001, ηp² = .41 (Figure 1, bottom panel). Independently, LPP amplitude was larger for aversive compared to both the neutral (p < .001) and the positive series (p = .002): content, F(2,96) = 27.8, p < .001, ηp² = .37. A marginal Task × Content interaction, F(2,96) = 3.2, p = .056, ηp² = .059, suggested enhanced positivity for aversive compared to positive series during free viewing (p = .002), whereas no such difference emerged during distraction (p = .44). On the other hand, either aversive or positive series prompted significantly larger LPPs than the neutral series during both free viewing and distraction (ps < .05).

Task Performance
Picture content did not significantly modulate task accuracy (Friedman's χ² = 4.42, p = .11; see Table 1) or reaction time, F(2,92) < 1.0.
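The LPP scoring rule (mean amplitude over a centro-parietal cluster, 400 to 800 ms after picture onset) can be sketched as follows, given the reported 250 Hz sampling rate and epochs starting 100 ms before onset. The data layout (one trial-averaged, baseline-corrected waveform per sensor), the function name `lpp_mean_amplitude`, and the sensor indices passed in are hypothetical; the report does not list the 18 sensors, and the half-open sample window is an illustrative choice.

```python
SFREQ = 250.0        # Hz, sampling rate reported in the Method
EPOCH_START = -0.1   # s, epochs begin 100 ms before picture onset


def lpp_mean_amplitude(erp, sensor_idx, tmin=0.4, tmax=0.8):
    """Mean amplitude across a sensor cluster within a post-onset window.

    erp: list of per-sensor waveforms (lists of samples in microvolts),
         assumed to be trial-averaged and baseline-corrected already.
    sensor_idx: indices of the cluster sensors (hypothetical values here).
    """
    # Convert window edges from seconds after onset to sample indices.
    start = round((tmin - EPOCH_START) * SFREQ)
    stop = round((tmax - EPOCH_START) * SFREQ)
    # Pool every sample in the half-open window [start, stop) over the cluster.
    window = [v for s in sensor_idx for v in erp[s][start:stop]]
    return sum(window) / len(window)
```

With these settings the window spans samples 125 to 224 of each 250-sample epoch, that is, 400 ms up to but not including 800 ms after picture onset.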
Discussion
Overall, the results show that sustained aversive picture viewing prompted emotional reflex activation and ongoing defensive engagement, evident in most measures regardless of whether a concurrent task was performed. Additionally, for any picture series, performance of a concurrent task was associated with heightened reactivity that was similar to, yet separate from, affective engagement. The interaction of these two processes differed across defense response components.

Viewing of aversive pictures, compared to neutral or positive, prompted relative heart rate deceleration, a greater number of skin conductance responses, and startle reflex potentiation. These responses likely reflected enhanced attentional orienting, sympathetically mediated response mobilization, and heightened defensive action preparation, respectively. Such response modulation persisted despite distraction, a pattern that was also found for electrocortical responses indexing processing of motivational significance. Indeed, the viewing of aversive compared to neutral picture series was associated with an enhanced late positive potential, an effect also observed when distraction involves internal subtraction rather than an external visual task (Hajcak et al., 2007). These findings together suggest that attentional and sympathetic response mobilization components of defensive responding are activated despite concurrent distraction.

Regarding selective cortical processing, it is noteworthy that, although LPP amplitude was modulated by picture content during distraction, the pattern was not identical during free viewing. Whereas arousing aversive pictures prompted a larger LPP than the moderately arousing positive series during free viewing (as expected), the LPP for aversive pictures did not differ from that for positive content during distraction. One interpretation is that distraction can attenuate selective processing of aversive pictures.
Startle reflex data also suggested relative attenuation of somatic action preparation during distraction: although startle was potentiated during aversive processing regardless of task context, potentiation was larger when there was no distraction. Although effects suggesting attenuated aversive processing were relatively small here, they are in line with recent work reporting reduced amygdala activation during concurrent task performance (e.g., Van Dillen et al., 2009).
Figure 1. Top panel: Corrugator electromyographic activity for aversive, neutral, and positive picture series during free viewing (left) or distraction (right). Bottom panel: Event-related potentials for aversive, neutral, and positive pictures during free viewing (left) or distraction (right).
Differing from other affect measures, the specific facial frowning increase for aversive pictures during free viewing, relative to the neutral and positive series, was not seen during distraction. Instead, corrugator tension increased overall during the concurrent task, bringing all three series to the same level. These results are consistent with findings that facial muscle tension increases with greater task demand (Tassinary, Cacioppo, & Vanman, 2007), suggesting that facial tension associated with concentration on the number task overshadowed threat-specific facial action.

In addition to increased facial muscle tension, task performance was associated with more skin conductance responses and increased heart rate overall. As enhanced autonomic activity occurs with task effort (e.g., Kalamas, Gruber, & Rypma, 1999), active task participation likely recruited orienting and response mobilization systems regardless of picture content. It is notable, however, that this did not seem to impact engagement of the same systems by ongoing aversive processing. A similar overall task effect was observed for late cortical positivity as LPP amplitude was reduced during distraction. From a resource-sharing perspective (e.g., Desimone, 1998), an index of motivational significance could be somewhat reduced when subjects are simultaneously processing other stimuli with instructed significance.

In summary, although many components of ongoing defensive engagement were preserved during continuous visual distraction, evidence of attenuation was observed in startle reflex and electrocortical measures, and independent task-specific reactivity was also apparent. These findings are consistent with the view that motivational circuits are activated when stimuli have intrinsic survival significance as well as when motivational significance is instructed. The degree to which affective and task stimuli activate these systems may determine the level of defensive reactivity observed in a given response component. Indeed, considering high accuracy rates (over 80%) in the current study, it is likely that more difficult tasks could further reduce or even eliminate components of defensive responding other than facial expressivity, which is an interesting research avenue to pursue. Finally, future studies using this dual-task paradigm may prove useful in assessing emotion control in anxious and affective pathology.
REFERENCES Blair, K. S., Smith, B. W., Mitchell, D. G., Morton, J., Vythilingam, M., Pessoa, L., et al. (2007). Modulation of emotion by cognition and cognition by emotion. NeuroImage, 35, 430–440. Bradley, M. M. (2009). Natural selective attention: Orienting and emotion. Psychophysiology, 46, 865–873.
Bradley, M. M., Cuthbert, B. N., & Lang, P. J. (1996). Picture media and emotion: Effects of a sustained affective context. Psychophysiology, 33, 662–670. Cook, E. W. III (2001). VPM reference manual. Birmingham, AL: Author.
Desimone, R. (1998). Visual attention mediated by biased competition in extrastriate visual cortex. Philosophical Transactions of the Royal Society of London: Biological Sciences, 353, 1245–1255. Fridlund, A. J., & Cacioppo, J. T. (1986). Guidelines for human electromyographic research. Psychophysiology, 23, 567–589. Hajcak, G., Dunning, J., & Foti, D. (2007). Neural response to emotional pictures is unaffected by concurrent task difficulty: An event-related potential study. Behavioral Neuroscience, 121, 1156–1162. Kalamas, A., Gruber, A., & Rypma, B. (1999). Autonomic physiological activity in mental rotation tasks. Perceptual and Motor Skills, 88, 211–214. Lang, P. J., & Bradley, M. M. (2010). Emotion and the motivational brain. Biological Psychology (in press). Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (2008). International Affective Picture System (IAPS): Affective ratings of pictures and instruction manual. Technical Report A-8. Gainesville, FL: University of Florida. Lang, P. J., & Davis, M. (2006). Emotion, motivation, and the brain: Reflex foundations in animal and human research. Progress in Brain Research, 156, 3–29.
McRae, K., Hughes, B., Chopra, S., Gabrieli, J. D., Gross, J. J., & Ochsner, K. N. (2009). The neural bases of distraction and reappraisal. Journal of Cognitive Neuroscience, 22, 248–262. Ochsner, K. N., & Gross, J. J. (2005). The cognitive control of emotion. Trends in Cognitive Sciences, 9, 242–249. Schupp, H., Stockburger, J., Bublatzky, F., Junghöfer, M., Weike, A. I., & Hamm, A. O. (2007). Explicit attention interferes with selective emotion processing in human extrastriate cortex. BMC Neuroscience, 8, 16. Smith, J., Bradley, M., & Lang, P. (2005). State anxiety and affective physiology: Effects of sustained exposure to affective pictures. Biological Psychology, 69, 247–260. Tassinary, L. G., Cacioppo, J. T., & Vanman, E. J. (2007). The skeletomotor system: Surface electromyography. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (3rd ed., pp. 267–302). Cambridge, UK: Cambridge University Press. Van Dillen, L., Heslenfeld, D. J., & Koole, S. (2009). Tuning down the emotional brain: An fMRI study of the effects of cognitive load on the processing of affective images. NeuroImage, 45, 1212–1219. (Received June 4, 2009; Accepted January 19, 2010)
Psychophysiology, 48 (2011), 117–120. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01043.x
BRIEF REPORT
Sigh rate and respiratory variability during mental load and sustained attention
ELKE VLEMINCX,a JOACHIM TAELMAN,b STEVEN DE PEUTER,a ILSE VAN DIEST,a and OMER VAN DEN BERGHa
a Research Group on Health Psychology, Department of Psychology, University of Leuven, Leuven, Belgium
b Biomedical Signal Processing Group, Department of Electrical Engineering, University of Leuven, Leuven, Belgium
Abstract Spontaneous breathing consists of substantial correlated variability: Parameters characterizing a breath are correlated with parameters characterizing previous and future breaths. On the basis of dynamic system theory, negative emotion states are predicted to reduce correlated variability whereas sustained attention is expected to reduce total respiratory variability. Both are predicted to evoke sighing. To test this, respiratory variability and sighing were assessed during a baseline, stressful mental arithmetic task, nonstressful sustained attention task, and recovery in between tasks. For respiration rate (excluding sighs), reduced total variability was found during the attention task, whereas correlated variation was reduced during mental load. Sigh rate increased during mental load and during recovery from the attention task. It is concluded that mental load and task-related attention show specific patterns in respiratory variability and sigh rate. Descriptors: Mental arithmetic, Sustained attention, Sighing, Respiratory variability
In contrast with the extensive literature on the effects of attention and emotion on cardiovascular variability and basic respiratory parameters, little is known about the effect on respiratory variability, and the few existing studies show inconsistent results. Spontaneous breathing during rest in healthy subjects shows considerable variability (Donaldson, 1992; Hughson, Yamamoto, & Fortrat, 1995; Small, Judd, Lowe, & Stick, 1999; Tobin, Yang, Jubran, & Lodato, 1995; Wysocki, Fiamma, Straus, Poon, & Similowski, 2006), and some findings suggest a reduction during anxiety and negative affect (Van Diest, Thayer, Vandeputte, Van de Woestijne, & Van den Bergh, 2006). Increased respiratory variability is characteristic of both positive affective states such as fun and amusement (Boiten, 1998) and negative emotional states, such as anger, resentment, guilt, and sorrow (Stevenson & Ripley, 1952), pain (Boiten, 1998), and panic disorder (Abelson, Weg, Nesse, & Curtis, 2001; Martinez et al., 2001; Wilhelm, Trabert, & Roth, 2001; Yeragani, Radhakrishna, Tancer, & Uhde, 2002). To integrate these conflicting findings, we previously proposed to distinguish between various types of variability: correlated and random (Vlemincx, Van Diest, Lehrer, Aubert, & Van den Bergh, 2010). From a dynamic systems perspective, healthy breathing is characterized by considerable correlated respiratory variability representing homeostatic capacity and respiratory stability and some random variability enhancing respiratory sensitivity and adaptability (Bruce & Daubenspeck, 1995). The inconsistent findings above mostly relied on general measures of total variability, which can be interpreted as the sum of random and correlated variability. Increases in total variability during emotional states could be due to excessive random variability, indicating a lack of stability. In contrast, decreases in total respiratory variability could be the result of a lack of correlated variability, ensuing from sustained psychological processes supporting task-related attention or behavior inhibiting responsiveness to environmental changes (Thayer & Lane, 2000).

A related aspect of respiratory variability that has largely been disregarded is sighing. We have previously theorized that sighing acts as a resetter of the respiratory system to restore healthy variability either when respiration progressively lacks variability or when respiratory variability becomes excessively random (Vlemincx, Van Diest, et al., 2010). In the present study sighing and random and correlated respiratory variability were assessed during a stressful mental load task, predicted to induce random variability and sighing, and a nonstressful attention task, predicted to reduce total variability, after which sighing will occur.
Address correspondence to: Omer Van den Bergh, Research Group on Health Psychology, Department of Psychology, University of Leuven, Tiensestraat 102, B-3000 Leuven, Belgium. E-mail: [email protected]

Methods

Participants
Forty-three healthy students participated in the study (21 men, age 18–22 years). The experiment was approved by the Ethics Committees of the Department of Psychology and of the Faculty of Medical Sciences.

Apparatus
Breathing data were measured continuously by means of respiratory plethysmography using the LifeShirt System (Vivometrics, Inc., Ventura, CA). Motion and posture were assessed by the LifeShirt accelerometers. Electrocardiogram, end-tidal pCO2, and electromyography data are discussed elsewhere (Vlemincx, Taelman, Van Diest, & Van den Bergh, 2010).

Procedure
Upon arrival, participants were informed about the course of the experiment, signed an informed consent form, and were presented with the written instructions. Participants were informed that the experiment consisted of different tasks. The first task was to watch the documentary The March of the Penguins during baseline. The film was also watched during recovery after each of the other tasks. They were assured that no questions about the movie would be asked later on and they could relax and enjoy watching the film. The second task was a mental arithmetic task performed under stressor conditions. A continuous series of sums of five operations with a two- or three-digit number had to be performed without any verbalization (e.g., 36 + 17 − 24 × 2 + 13). Participants were instructed not to speak, mumble, or move their lips. Using the mouse cursor, they indicated the correct answer by choosing between three alternatives, after which feedback was given. The five participants who achieved the most correct answers were rewarded with a movie ticket. The experimenter was seated next to the participant. Mental arithmetic is widely used to induce stress, as it affects several physiological indices of stress (Kelsey et al., 1999; Willemsen, Ring, McKeever, & Carroll, 2000). Moreover, specific task characteristics increase the stress level: high task difficulty, feedback, speed- and accuracy-related evaluation and rewards, and near observation (Boiten, Frijda, & Wientjes, 1994; Gaillard & Wientjes, 1994; Kelsey et al., 2000).
The third task was a nonstressful but attention-engaging task during which participants were presented with three different numbers from which they had to indicate the largest number using the mouse cursor. Compared to the mental arithmetic task, this attention task required the same motor movement as well as sustained task attention, but task difficulty was extremely low, no time constraints were applied, and no performance rewards were given. Before the experiment started, participants were connected to the LifeShirt System, explicitly instructed again not to speak, to sit comfortably, not to change posture, and not to move except for their dominant hand using the mouse cursor. In summary, the experiment consisted of a series of seven 6-min phases, starting with a baseline, which was followed by three 6-min tasks, each followed by a 6-min recovery period (RC). The three tasks were a nonstressful attention task (AT) and two mental arithmetic tasks (MT1 and MT2), presented in completely randomized order. Randomization was controlled by custom-made stimulus presentation and data acquisition software Affect 4.0 (Spruyt, Clarysse, Vansteenwegen, Baeyens, & Hermans, 2010).

Data Analysis
Respiratory measures. Respiratory waveforms were edited using dedicated Vivologic software (Vivometrics, Inc., Ventura, CA; for more details, see Vlemincx, Van Diest, et al., 2010). Because of unreliable data acquisition, data from 1 participant were excluded from analysis. Movement artifacts were controlled
for by evaluating the accelerometer signals: All participants maintained the same posture during the whole experiment, and the mean motion value was 0.5 (range 0–3) on a scale from 0 (no movement at all) to 5 (resting) to 15 (walking) to 50 (running fast). Next, respiratory parameters were calculated breath by breath. Mean basic respiratory parameters (inspiratory volume [Vi], respiration rate [RR = 60/total breath time], minute ventilation [MV = RR × Vi], and contribution of ribcage breathing to inspiratory volume [%RCi]), respiratory variability, and the number of sighs were calculated within each 6-min phase. Sighs within each phase were defined as breaths with an inspiratory volume at least 2 times as large as the mean inspiratory volume during this phase. The coefficient of variation (CV) and autocorrelation (the correlation of a signal with itself) at one breath lag (AR) of Vi, RR, and MV were calculated as measures of total respiratory variability and correlated respiratory variability, respectively (Tobin et al., 1995). Both measures of respiratory variability were calculated including and excluding sighs.

Statistical analysis. Respiratory measures were subjected to a repeated measures analysis of variance (ANOVA) with phase (baseline, AT, MT1, MT2, RC after AT, RC after MT) as a within-subject variable. To explore further differences between tasks, baseline, and RC, post hoc contrasts were tested by means of Tukey comparisons. Reported p values are Greenhouse–Geisser corrected and ε values are reported. Effect sizes are reported as partial eta squared (η²p).

Results
The effect of phase was significant for all respiratory measures (see Table 1). The following post hoc comparisons were significant at α = .01.

Basic Respiratory Parameters
Significantly higher Vi during baseline and MT1 was found compared to AT (p < .0001) and RC after MT (p < .001).
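The breath-by-breath measures defined under Data Analysis can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the synthetic breath series, the expression of CV as a percentage, the use of a Pearson correlation between adjacent breaths as the lag-one estimator, and the choice to drop sigh breaths before recomputing the variability measures are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_sighs(vi, factor=2.0):
    """Mark breaths whose inspiratory volume is >= factor x the phase mean."""
    threshold = factor * mean(vi)
    return [v >= threshold for v in vi]


def coefficient_of_variation(x):
    """Total variability: SD as a percentage of the mean (percent scale assumed)."""
    return 100.0 * stdev(x) / mean(x)


def lag1_autocorrelation(x):
    """Correlated variability: Pearson r between each breath and the next one."""
    a, b = x[:-1], x[1:]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    ss_a = sum((p - ma) ** 2 for p in a)
    ss_b = sum((q - mb) ** 2 for q in b)
    return cov / (ss_a * ss_b) ** 0.5


def summarize_phase(vi_ml, breath_time_s):
    """Compute sigh count plus CV and AR of Vi, RR, and MV for one phase."""
    rr = [60.0 / t for t in breath_time_s]            # RR = 60 / total breath time
    mv = [r * v / 1000.0 for r, v in zip(rr, vi_ml)]  # MV = RR x Vi, ml -> l
    sigh = flag_sighs(vi_ml)
    out = {"n_sighs": sum(sigh)}
    for name, series in (("Vi", vi_ml), ("RR", rr), ("MV", mv)):
        # Assumption: "excluding sighs" means removing the sigh breaths.
        no_sighs = [s for s, is_sigh in zip(series, sigh) if not is_sigh]
        out[name] = {
            "CV": coefficient_of_variation(series),
            "AR": lag1_autocorrelation(series),
            "CV_ex": coefficient_of_variation(no_sighs),
            "AR_ex": lag1_autocorrelation(no_sighs),
        }
    return out


# Synthetic example: ten breaths, one of which is a sigh (hypothetical values).
vi_ml = [400, 420, 380, 410, 1000, 390, 405, 415, 395, 400]
breath_time_s = [3.5, 3.6, 3.4, 3.5, 5.0, 3.6, 3.5, 3.4, 3.6, 3.5]
summary = summarize_phase(vi_ml, breath_time_s)
print(summary["n_sighs"], round(summary["Vi"]["CV"], 1), round(summary["Vi"]["CV_ex"], 1))
```

In this synthetic series a single large breath inflates CV(Vi) considerably, which is why the variability measures are worth reporting both including and excluding sighs.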
Increased RR seemed characteristic of both AT and MT: RR during AT, MT1, and MT2 was significantly higher compared to baseline (p < .0001) and RC periods (p < .0001). MV during baseline and AT was significantly lower than MV during MT1 (p < .0001), but was significantly higher than MV during RC phases (p < .001). %RCi during MT1 did not differ from %RCi during baseline and MT2 and was significantly higher compared to AT (p < .001) and RC periods (p < .001), suggesting that increased %RCi was specific to MT.

Respiratory Variability
Compared to baseline, CV(Vi) was significantly higher during MT1 (p < .0001), MT2 (p < .0001), and RC after AT (p < .0001), but did not differ from CV(Vi) during AT and RC after MT. Excluding sighs reduced overall CV(Vi), but did not change this pattern across phases. In line with the predicted results, total variation in Vi was increased during MT. Whereas correlated variability was predicted to be reduced during MT, AR(Vi) appeared to be significantly lower during MT2, but not during MT1. Compared to baseline, AR(Vi) was significantly lower during MT2 (p < .0001) and RC after AT (p < .001). After excluding sighs, AR(Vi) during baseline and during RC after AT no longer differed. CV(RR) during AT was significantly lower compared to MT1 (p < .0001), MT2 (p < .001), and RC after AT (p < .001). When sighs were excluded, CV(RR) during AT became significantly lower compared to baseline (p = .01). Thus, as predicted, AT was
Table 1. Mean (SD) Basic Respiratory Parameters (Vi [ml], RR [breaths/min], MV [l/min], RCi [%]), Sigh Rate (N), and Variability Measures (CV and AR) Including and Excluding Sighs during the Experimental Phases

| Measure | F(5,205) | ε | η²p | Baseline | AT | MT1 | MT2 | RC after AT | RC after MT |
|---|---|---|---|---|---|---|---|---|---|
| Mean | | | | | | | | | |
| Vi | 9.31*** | .60 | .19 | 375.48 a (168.57) | 334.33 b (147.97) | 375.31 a (163.93) | 353.32 a,b (153.22) | 351.47 a,b (154.93) | 343.31 b (153.96) |
| RR | 31.99*** | 0 | .44 | 16.18 a (3.27) | 18.76 c (3.16) | 18.19 b,c (3.35) | 17.41 b (3.25) | 15.88 a (2.41) | 15.92 a (2.93) |
| MV | 30.51*** | .49 | .43 | 5.78 a (2.17) | 6.00 a (2.28) | 6.47 c (2.71) | 5.8 a (2.38) | 5.34 b (2.16) | 5.22 b (2.10) |
| RCi | 7.33*** | .70 | .15 | 40.85 a,b (7.23) | 39.14 b (8.18) | 42.95 a (8.88) | 41.66 a,b (8.46) | 39.02 b (7.00) | 38.88 b (7.57) |
| Sigh rate | 6.41*** | .71 | .14 | 0.79 a (1.35) | 1.38 a,b (1.34) | 1.93 b (2.10) | 2.05 b (1.94) | 2.12 b (1.89) | 1.36 a,b (1.33) |
| CV | | | | | | | | | |
| Vi | 11.37*** | .76 | .22 | 24.00 a (13.63) | 29.56 a,b (12.19) | 36.35 b,c (16.30) | 38.29 c (17.60) | 36.61 b,c (15.69) | 30.5 a,b (13.64) |
| Vi ex. sighs | 9.07*** | .67 | .18 | 18.54 a (6.63) | 20.01 a,b (6.25) | 24.28 c (7.09) | 23.35 c (7.96) | 21.98 b,c (7.08) | 21.41 a,b,c (7.69) |
| RR | 5.93*** | .81 | .13 | 16.37 a,b (7.16) | 13.34 a (4.50) | 18.40 b (8.39) | 18.24 b (8.14) | 17.84 b (6.75) | 16.43 a,b (6.59) |
| RR ex. sighs | 5.22*** | .80 | .11 | 15.95 a,b (7.10) | 12.40 a (4.39) | 17.28 b (8.44) | 16.86 b (8.23) | 16.54 b (6.75) | 15.62 a,b (6.28) |
| MV | 5.76*** | .67 | .12 | 20.59 a (8.14) | 21.17 a,b (5.83) | 25.80 b,c (8.21) | 25.52 a,b,c (9.32) | 26.52 c (11.88) | 23.49 a,b,c (8.41) |
| MV ex. sighs | 5.86*** | .74 | .13 | 19.13 a (6.19) | 19.4 a,b (5.30) | 23.72 c (6.97) | 22.28 a,b,c (7.95) | 23.13 b,c (8.79) | 21.39 a,b,c (6.77) |
| AR | | | | | | | | | |
| Vi | 8.81*** | .65 | .18 | .20 a (.18) | .12 a,b (.20) | .11 a,b,c (.23) | .01 c (.13) | .05 b,c (.13) | .14 a,b (.13) |
| Vi ex. sighs | 4.6** | .79 | .10 | .24 a (.18) | .24 a (.19) | .22 a (.19) | .11 b (.15) | .18 a,b (.17) | .21 a,b (.14) |
| RR | 6.03*** | .82 | .13 | .26 a (.20) | .22 a,b (.17) | .12 b (.15) | .13 b (.13) | .21 a,b (.18) | .22 a,b (.15) |
| RR ex. sighs | 6.04** | .82 | .13 | .47 a (.11) | .46 a,b (.10) | .40 b,c (.08) | .39 c (.06) | .44 a,b,c (.10) | .46 a,b (.09) |
| MV | 8.4*** | .75 | .17 | .28 a,b (.18) | .25 a,b (.21) | .30 b (.24) | .11 c (.16) | .18 a,c (.16) | .23 a,b (.14) |
| MV ex. sighs | 7.66*** | .76 | .16 | .31 a,b (.18) | .32 a,b (.21) | .37 b (.23) | .18 c (.18) | .23 a,c (.17) | .26 a,b,c (.14) |

Note: Vi: inspiratory volume, RR: respiration rate, MV: minute ventilation, RCi: portion ribcage breathing, CV: coefficient of variation, AR: autocorrelation, AT: attention task, MT: mental arithmetic task, RC: recovery. ** p < .001, *** p < .0001; means with different subscripts are statistically different at α = .01 using Tukey-corrected p values.
marked by decreased total variability in RR. AR(RR) during baseline was significantly higher compared to MT1 (p < .001) and MT2 (p < .01), but did not differ from any other phase, suggesting that, in line with the hypotheses, correlated variability in RR was reduced during MT. Excluding sighs yielded the same pattern. CV(MV) was significantly lower during baseline, compared to MT1 (p < .01) and RC after AT (p < .001), but did not differ from CV(MV) during AT, MT2, and RC after MT. Thus, as hypothesized, total variability in MV increased during MT. However, consistent with variability in Vi, correlated variability in MV was decreased during MT2 but not during MT1: Higher AR(MV) was found during baseline and AT compared to MT2 (p < .0001). AR(MV) during MT1 appeared to be significantly higher compared to MT2 (p < .0001) and RC after AT (p < .01). Excluding sighs yielded the same results.

Sighing
Sigh frequency was significantly higher during MT1 (p < .01), MT2 (p < .001), and RC after AT (p < .0001) compared to baseline. Sigh frequency during baseline did not differ from that during AT and RC after MT. This suggests that, consistent with our predictions, sighs appeared characteristic of MT and RC following AT.

DISCUSSION
The aim of the present study was to investigate respiratory variability and sigh frequency during a stressful mental arithmetic task and a nonstressful attention task. Increased sighing was found during mental load, which was characterized by increased random breathing, and following task-related attention, which was characterized by a reduction of respiratory variability. Basic respiratory measures in this study showed rapid shallow breathing during sustained task attention, whereas more thoracic, faster, and (during the first task) deeper breathing was
found during mental arithmetic. The latter matches breathing responses during high-arousal negative affective states (Boiten et al., 1994), suggesting that mental stress was successfully induced by mental arithmetic. In addition, the mental load task and the attention task showed different patterns in respiratory variability measures. The attention task reduced total respiratory variability (of respiration rate excluding sighs) compared to baseline. During the mental load task, total breathing variability increased, but autocorrelation was reduced, which implies increased random respiratory variability. Together, these findings suggest that stressors might increase random variability, whereas sustained attention states might reduce total variability. As predicted, sigh rate strongly increased during the mental arithmetic task and following the attention task. Both findings fit the hypothesis that sighing acts as a psychophysiological resetter (Vlemincx, Van Diest, et al., 2010); sighing may (temporarily) reset physiological changes that characterize psychological states. On the one hand, negative emotional states elicit increasing tension, and, accordingly, breathing may become progressively random, which may be counteracted by sighing. Recent evidence shows that sighs occur toward increasingly random breathing and reset structured correlated respiratory variability (Baldwin et al., 2004; Vlemincx, Van Diest, et al., 2010). The result that sigh rate increases during the mental load task, which was characterized by decreased autocorrelation and more random breathing, fits this finding. On the other hand, sighing might also reset respiratory variability as it becomes reduced during sustained attention. A lack of respiratory variability elicits atelectasis, the progressive collapse of alveoli, which in turn causes a decrease in lung compliance and gas exchange efficiency. 
These physiological consequences are restored by sighing (Bendixen, Smith, & Mead, 1964; Caro, Butler, & Dubois, 1960; Cherniack, Euler, Glowgowska, & Homma, 1981;
Ferris & Pollard, 1960; McIlroy, Butler, & Finley, 1962; Mead & Collier, 1959; Reynolds, 1962). This suggests that participants recovered from reduced respiratory variability and its associated physiological consequences with sighing at the end of the attention task. It is likely that across the life span of a person, an intricate relationship develops between the physiological and psychological consequences of sighing, such that persons learn to use sighing as a coping response with aversive states to induce subjective relief and beneficial physiological effects. In line with this, increased sigh rates are found during relief of dyspnea and perceived restlessness (Hirose, 2000), relief of negative affectivity
and craving (McClernon, Westman, & Rose, 2004), and relief of stress (Soltysik & Jelen, 2005; Vlemincx et al., 2009). In this study, correlated variability quantified by autocorrelation at one breath lag holds only one level of correlated variability and therefore reflects only a fraction of all variability of a correlated nature. Therefore, the interpretation of our results is limited to one component of correlated variability as quantified by autocorrelation at one breath lag. The present study shows that it is important to consider measures of respiratory variability and sighing in addition to mean basic respiratory parameters when investigating the influence of emotion upon respiration.
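The limitation noted above, that autocorrelation at one breath lag captures only one level of correlated variability, can be made concrete with a short sketch. It is illustrative only and not part of the reported analyses: the series is simulated as a first-order autoregressive process with a hypothetical coefficient of 0.6, the function names are invented for this example, and the estimator is the standard sample autocorrelation function.

```python
import random
from statistics import mean


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag (standard biased estimator)."""
    m = mean(x)
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / denom


def simulate_ar1(n, phi, seed=0):
    """Simulate a first-order autoregressive series with unit-variance noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


# Hypothetical "breath-by-breath" series; phi = 0.6 is an arbitrary choice.
series = simulate_ar1(5000, phi=0.6)
acf = {lag: autocorrelation(series, lag) for lag in range(0, 6)}
for lag, value in acf.items():
    print(lag, round(value, 2))
```

Even in this simple simulated process the correlation does not vanish after one breath but decays gradually over several lags, so a lag-one value summarizes only part of the correlated structure in a series.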
REFERENCES Abelson, J. L., Weg, J. G., Nesse, R. M., & Curtis, G. C. (2001). Persistent respiratory irregularity in patients with panic disorder. Biological Psychiatry, 49, 588–595. Baldwin, D. N., Suki, B., Pillow, J. J., Roiha, H. L., Minnocchieri, S., & Frey, U. (2004). Effects of sighs on breathing memory and dynamics in healthy infants. Journal of Applied Physiology, 97, 1830–1839. Bendixen, H. H., Smith, G. M., & Mead, J. (1964). Pattern of ventilation in young adults. Journal of Applied Physiology, 19, 195–198. Boiten, F. A. (1998). The effects of emotional behaviour on components of the respiratory cycle. Biological Psychology, 49, 29–51. Boiten, F. A., Frijda, N. H., & Wientjes, C. J. E. (1994). Emotions and respiratory patterns: Review and critical analysis. International Journal of Psychophysiology, 17, 103–128. Bruce, E. N., & Daubenspeck, J. A. (1995). Mechanisms and analysis of ventilatory stability. In J. A. Dempsey & A. I. Pack (Eds.), Regulation of breathing (pp. 285–313). New York: Marcel Dekker. Caro, C. G., Butler, J., & Dubois, A. B. (1960). Some effects of restriction of chest cage expansion on pulmonary function in man: An experimental study. Journal of Clinical Investigation, 39, 573–583. Cherniack, N. S., Euler, C., Glowgowska, M., & Homma, I. (1981). Characteristics and rate of occurrence of spontaneous and provoked augmented breaths. Acta Physiologica Scandinavica, 111, 349–360. Donaldson, G. C. (1992). The chaotic behaviour of resting human respiration. Respiration Physiology, 88, 313–321. Ferris, B. G., & Pollard, D. S. (1960). Effect of deep and quiet breathing on pulmonary compliance in man. Journal of Clinical Investigation, 39, 143–149. Gaillard, A. W. K., & Wientjes, C. J. E. (1994). Mental load and work stress as two types of energy mobilization. Work and Stress, 8, 141–152. Hirose, S. (2000). Restlessness or respiration as a manifestation of akathisia: Five case reports of respiratory akathisia. 
Journal of Clinical Psychiatry, 61, 737–741. Hughson, R. L., Yamamoto, Y., & Fortrat, J. O. (1995). Is the pattern of breathing at rest chaotic? A test of Lyapunov exponent. Advances in Experimental Medicine and Biology, 393, 15–19. Kelsey, R. M., Blascovich, J., Leitten, C. L., Schneider, T. R., Tomaka, J., & Wiens, S. (2000). Cardiovascular reactivity and adaptation to recurrent psychological stress: The moderating effects of evaluative observation. Psychophysiology, 37, 748–756. Kelsey, R. M., Blascovich, J., Tomaka, J., Leitten, C. L., Schneider, T. R., & Wiens, S. (1999). Cardiovascular reactivity and adaptation to recurrent psychological stress: Effects of prior task exposure. Psychophysiology, 36, 818–831. Martinez, J. M., Kent, J. M., Coplan, J. D., Browne, S. T., Papp, L. A., Sullivan, G. M., et al. (2001). Respiratory variability in panic disorder. Depression and Anxiety, 14, 232–237. McClernon, F. J., Westman, E. C., & Rose, J. E. (2004). The effects of controlled deep breathing on smoking withdrawal symptoms in dependent smokers. Addictive Behaviors, 29, 765–772. McIlroy, M. B., Butler, J., & Finley, T. N. (1962). Effects of chest compression on reflex ventilatory drive and pulmonary function. Journal of Applied Physiology, 17, 701–705. Mead, J., & Collier, C. (1959). Relation of volume history of lungs to respiratory mechanics in anesthetized dogs. Journal of Applied Physiology, 14, 669–678.
Reynolds, L. B. (1962). Characteristics of an inspiration-augmenting reflex in anesthetized cats. Journal of Applied Physiology, 17, 683–688. Small, M., Judd, K., Lowe, M., & Stick, S. (1999). Is breathing in infants chaotic? Dimension estimates for respiratory patterns during quiet sleep. Journal of Applied Physiology, 86, 359–376. Soltysik, S., & Jelen, P. (2005). In rats, sighs correlate with relief. Physiology and Behavior, 85, 589–602. Spruyt, A., Clarysse, J., Vansteenwegen, D., Baeyens, F., & Hermans, D. (2010). Affect 4.0: A free software package for implementing psychological and psychophysiological experiments. Experimental Psychology, 57, 36–45. Stevenson, I., & Ripley, H. S. (1952). Variations in respiration and in respiratory symptoms during changes in emotion. Psychosomatic Medicine, 14, 476–490. Thayer, J. F., & Lane, R. D. (2000). A model of neurovisceral integration in emotion regulation and dysregulation. Journal of Affective Disorders, 61, 201–216. Tobin, M. J., Yang, K. L., Jubran, A., & Lodato, R. F. (1995). Interrelationship of breath components in neighboring breaths of normal eupneic subjects. American Journal of Respiratory Critical Care Medicine, 152, 1967–1976. Van Diest, I., Thayer, J. F., Vandeputte, B., Van de Woestijne, K. P., & Van den Bergh, O. (2006). Anxiety and respiratory variability. Physiology and Behavior, 89, 189–195. Vlemincx, E., Van Diest, I., De Peuter, S., Bresseleers, J., Bogaerts, K., Fannes, S., et al. (2009). Why do you sigh? Sigh rate during induced stress and relief. Psychophysiology, 46, 1005–1013. Vlemincx, E., Van Diest, I., Lehrer, P. M., Aubert, A. E., & Van den Bergh, O. (2010). Respiratory variability preceding and following sighs: A resetter hypothesis. Biological Psychology, 84, 82–87. Vlemincx, E., Taelman, J., Van Diest, I., & Van den Bergh, O. (2010). Take a deep breath: The relief effect of spontaneous and instructed sighs. Physiology and Behavior (in press, doi: 10.1016/j.physbeh. 2010.04.015). 
Wilhelm, F. H., Trabert, W., & Roth, W. T. (2001). Physiologic instability in panic disorder and generalized anxiety disorder. Biological Psychiatry, 49, 596–605. Willemsen, G., Ring, C., McKeever, S., & Carroll, D. (2000). Secretory immunoglobulin A and cardiovascular activity during mental arithmetic: Effects of task difficulty and task order. Biological Psychology, 52, 127–141. Wysocki, M., Fiamma, M.-N., Straus, C., Poon, C.-S., & Similowski, T. (2006). Chaotic dynamics of resting ventilatory flow in humans assessed through noise titration. Respiratory Physiology and Neurobiology, 153, 54–65. Yeragani, V. K., Radhakrishna, R. K. A., Tancer, M., & Uhde, T. (2002). Nonlinear measures of respiration: Respiratory irregularity and increased chaos of respiration in patients with panic disorder. Neuropsychobiology, 46, 111–120.
(Received August 13, 2009; Accepted January 19, 2010)
Psychophysiology, 48 (2011), 121–135. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01044.x
Airway response to emotion- and disease-specific films in asthma, blood phobia, and health
THOMAS RITZ,a FRANK H. WILHELM,b ALICIA E. MEURET,a ALEXANDER L. GERLACH,c and WALTON T. ROTHd
a Department of Psychology, Southern Methodist University, Dallas, Texas, USA
b Department of Psychology, University of Basel, Basel, Switzerland
c Department of Psychology, University of Münster, Münster, Germany
d Psychiatry Service, Stanford University School of Medicine and VA Palo Alto Health Care System, Palo Alto, California, USA
Abstract Earlier research found autonomic and airway reactivity in asthma patients when they were exposed to blood-injection-injury (BII) stimuli. We studied oscillatory resistance (Ros) in asthma and BII phobia during emotional and disease-relevant films and examined whether muscle tension counteracts emotion-induced airway constriction. Fifteen asthma patients, 12 BII phobia patients, and 14 healthy controls viewed one set of negative, positive, neutral, BII-related, and asthma-related films with leg muscle tension and a second set without. Ros, ventilation, cardiovascular activity, and skin conductance were measured continuously. Ros was higher during emotional compared to neutral films, particularly during BII material, and responses increased from healthy over asthmatic to BII phobia participants. Leg muscle tension did not abolish Ros increases. Thus, the airways are particularly responsive to BII-relevant stimuli, which could become risk factors for asthma patients. Descriptors: Asthma, Blood-injection-injury phobia, Respiratory resistance, Emotion, Respiration, Autonomic nervous system, Electrodermal activity
This study was supported by the German Research Society (DFG Ri 957/2-1), the National Institute of Mental Health (MH56094), the Palo Alto Research Institute (PAIRE), and the Department of Veterans Affairs. We thank David Rosenfield and Bernhard Dahme for helpful comments, Tana Bliss, Mark Rothkopf, and Alysha Khavarian for their assistance in recruitment and data collection, and Antje Kullowatz, Nino Wessolowski, Chris Burrows, and Barnes McKenzie for their help in data reduction procedures. Address correspondence to: Thomas Ritz, Department of Psychology, Southern Methodist University, Hyer Hall 306C, 6424 Hilltop Lane, Dallas, TX 75205, USA. E-mail: [email protected]
Bronchoconstriction is one of the key clinical features of asthma (National Heart, Lung, and Blood Institute [NHLBI], 2002). Prior studies have demonstrated that experimental emotion induction is capable of constricting the airways and that this reaches clinically relevant levels in a considerable proportion of asthma patients (for reviews, see Isenberg, Lehrer, & Hochron, 1992; Ritz & Kullowatz, 2005). Although the weight of the evidence suggests that states of negative affect are particularly potent in eliciting airway responses, instances of airway constriction in positive emotional states have also been reported (Liangas, Morton, & Henry, 2003; Ritz, Steptoe, De Wilde, & Costa, 2000; von Leupoldt & Dahme, 2005), suggesting a nonspecific arousal modulation of airway smooth muscle tone. Both asthma patients and healthy controls show qualitatively, and in many studies quantitatively, comparable responses of the airways to such stimulation (e.g., Lehrer et al., 1996; Ritz, Steptoe, et al., 2000; Ritz, Thöns, Fahrenkrug, & Dahme, 2005; von Leupoldt & Dahme, 2005). Although this indicates that under emotional challenge the airways do not provide strong evidence for disease-specific responding in asthma, airway responses of asthma patients to negative experimental film stimuli have been found to be correlated with lung function decline in strong states of negative affect in daily life (Ritz & Steptoe, 2000). In addition, responses to both negative and positive film stimuli correlate with patients’ perceived importance of emotions as triggers of asthma in their daily lives (Ritz, Steptoe, Bobb, Harris, & Edwards, 2006).
Previously we observed that asthma patients show particularly strong emotional arousal and increases in respiratory resistance to blood, injection, or injury (BII) stimuli, which consisted of a series of pictures depicting blood, injuries, and mutilated bodies (Ritz, Steptoe, et al., 2000). Similar stimuli have been associated with reductions in heart rate in healthy individuals (e.g., Carruthers & Taggart, 1973; Lang, Greenwald, Bradley, & Hamm, 1993) and vasovagal fainting in individuals who are sensitive or phobic to BII stimuli (Engel, 1978; Graham, Kabler, & Lunsford, 1961). Because vagal excitation is a powerful constrictor of the airways, it has been speculated that the airways of asthma patients may be more sensitive to blood and injury stimuli (Lehrer, Isenberg, & Hochron, 1993). Given the uniqueness of our findings, we sought to replicate them and compare the size of the response to BII stimuli to that elicited by general emotional stimuli, in particular unpleasant emotional states. In the previous study (Ritz, Steptoe, et al., 2000), a direct comparison with
effects of general emotional stimuli was not possible, because these stimuli were presented as films rather than as still pictures, as the BII stimuli were. Differences in stimulus type were thus confounded with the emotion induction method. For the current study, we used films only.
Intense emotional responses to BII stimuli, as they are observed in BII phobia patients, can potentially provide an interesting model for studying the specificity of this emotional stimulus type on the airways in asthma. These patients have healthy airways but are affected by strong phobic responses to BII stimulus material. For the purpose of this study we measured respiratory resistance continuously in patients with asthma and BII phobia as well as in healthy controls during general emotional films and blood- and injury-related films. To control for the disease relevance of the material, we also administered asthma-related film scenes portraying asthma attacks and labored breathing, because such disease-relevant material had also been suggested to be particularly potent in its effect on the airways in asthma (Levenson, 1979). A specific airway responsiveness of asthma patients to BII-relevant stimuli would be demonstrated if resistance increases were (a) stronger in asthma compared to BII patients and healthy controls and (b) stronger in asthma during BII stimuli than during any other emotional or disease-relevant stimulus.
Physical exercise and skeletal muscle tension have an immediate dilatory effect on the airways (Kagawa & Kerr, 1970; Mansfield, McDonnell, Morgan, & Souhrada, 1979; Warren, Jennings, & Clark, 1984) in both health and asthma. Whereas exercise is a well-known trigger of airway obstruction in asthma (McFadden & Gilbert, 1994), the typical exercise response is bronchodilation in the early phase of exercise followed by a constriction in later phases and following exercise (Beck, Offord, & Scanlon, 1994).
Similarly, brief static contraction of facial or arm muscles has been shown to reduce respiratory resistance below baseline levels (Ritz, Dahme, & Wagner, 1998). There is also solid evidence from animal studies that skeletal muscle contractions dilate the airways (e.g., Kaufman, Rybicki, & Mitchell, 1985; Longhurst, 1984; Padrid, Haselton, & Kaufman, 1990), with vagal withdrawal as the major mechanism behind these changes. Although increases in sympathetic activation may contribute to autonomic exercise effects, they most probably play a subordinate role on the airways, as no direct functional innervation of the airway smooth muscles is observed in humans (Barnes, 1986; Canning & Fischer, 2001). Circulating catecholamines can dilate the airways through β2-adrenergic receptors on the smooth muscles, but this pathway is less likely in shorter or mild to moderate levels of exercise (Lewis et al., 1985; Wasserman, Whipp, & Casaburi, 1986). Thus, tensing skeletal muscles could be a simple behavioral maneuver to reduce emotion-induced airway constriction and thereby help limit airway obstructions and symptoms in patients with emotion-induced asthma. However, this has never been tested. In BII phobia, voluntary contractions of the skeletal muscles have been used to counteract vasovagal fainting (Kozak & Miller, 1985; Öst & Sterner, 1987). Such contraction may also serve to dampen any stronger airway response to blood and injury stimuli in this patient group. Therefore, we studied respiratory resistance during conditions of emotional film viewing both with and without voluntary skeletal muscle contraction. Some of the data from blood phobia patients and healthy controls in this study have been presented before (Ritz, Wilhelm, Gerlach, Kullowatz, & Roth, 2005; Ritz, Wilhelm, Meuret,
Gerlach, & Roth, 2009). The focus of those analyses was to explore hyperventilation in blood phobia patients during exposure to feared stimuli. In contrast, the data reported here focus on the additional asthma group and on airway responses to both general emotional and disease-relevant film clips. In addition to respiratory resistance, we explored a number of autonomic and ventilatory parameters that are potentially linked to airway obstruction (Ritz et al., 2002). For instance, strong vagal excitations, as the most prominent autonomic pathway to airway obstruction in asthma, could be manifested in marked reduction in heart rate, which is typically observed in response to BII stimuli as part of the vasovagal response (e.g., Engel, 1978; Graham et al., 1961). In addition, marked increase in ventilation could constrict the airways in asthma patients either by drying of the airways (Freed, 1995) or by effects of reduced carbon dioxide partial pressure (PCO2; van den Elshout, van Herwaarden, & Folgering, 1991). To explore such patterns of ventilatory and autonomic change associated with airway constriction, we planned to study between-individuals and within-individual correlations of Ros change with these parameters. We also sought to study changes in skin conductance relative to those observed in respiratory resistance. In previous studies of emotion induction, we found that skin conductance level (SCL) or skin conductance response closely matched the responses seen in respiratory resistance (Ritz, George, & Dahme, 2000; Ritz, Steptoe, et al., 2000). Similarly, asthma patients with more airway hyperresponsiveness to methacholine have been previously found to respond with greater SCL increases during stressful tasks (Lehrer et al., 1996). 
A link between these parameters may seem less obvious, given that vagal excitation is the major source of airway constriction (Barnes, 1986; Canning & Fischer, 2001) and sympathetic activation is responsible for skin conductance increases. Both systems, however, have cholinergic neurotransmission in common, and a link between allergic status and these two manifestations of cholinergic activity had been proposed earlier (Kaliner, 1976; Marshall, 1989). Thus, we compared response patterns of both parameters as well as their between-individuals and within-individual associations.
Method
Parts of the methods have been presented in detail before (Ritz et al., 2009; Ritz, Wilhelm, et al., 2005). In short, 15 asthma patients (11 women, 4 men; mean age 40.1 years, range 20–55 years), 12 BII phobia patients (9 women, 3 men; mean age 37.3 years, range 21–57 years) and 14 nonanxious controls (10 women, 4 men; mean age 36.4 years, range 22–57 years) participated. Across groups, 75.6% of participants indicated being Caucasian, 7.3% Hispanic, 12.2% Asian, and 4.9% African American. General selection criteria were age between 18 and 60 years and no history of epilepsy or seizures. All participants were screened using the Structured Clinical Interview for DSM-IV, Patient Edition (First, Spitzer, Gibbon, & Williams, 1994). BII phobia patients were required to meet Axis I criteria for simple phobia, BII type. Seven phobic patients reported prior fainting in BII situations and 2 more had almost fainted. BII phobia patients also were required to have a normal 12-lead electrocardiogram (1 patient with abnormality of the T wave was excluded for suspected inferior ischemia). Healthy participants and patients with BII phobia had to be free of current psychological disorders. They were only included if they reported no acute or
chronic respiratory diseases and no current smoking. Asthma patients had intermittent to moderately persistent asthma (NHLBI, 2002) and had not received systemic corticosteroids in the previous 3 months. All but 1 asthma patient had undergone skin testing for allergies, with all of them positive. Symptoms were present more than two times per week in 53.3% of the asthma patient sample, nighttime symptoms more than two times per month in 46.7%, and restrictions in daily activities by asthma more than two times per week in 26.7%. Seasonal variations in symptoms were reported by 66.7% of the sample and a family history of asthma or allergies by 66.7%. Anti-asthmatic medication was used by all patients, with 80% using short-acting bronchodilators, 33.3% long-acting bronchodilators, 80% inhaled corticosteroids, 40% leukotriene inhibitors, 20% antihistaminics, and 13.3% mast cell stabilizers. We paid subjects $60 for their participation in the experiment. Local ethics committees approved the study, and we obtained informed consent from all participants.
Emotion Induction
Film sequences extracted from movies and medical education material were shown (lasting approximately 150 to 300 s; Gross & Levenson, 1995; Ritz, Steptoe, et al., 2000). Two films of each category were selected: negative (bullying scene, boy who cries about the death of his father), neutral (economics lecture, screen saver), positive (scenes from British and American comedy series), asthma-related films (scene of an asthma attack, scenes of intense emotion with labored breathing and wheezing), and BII phobia-relevant (scenes from educational surgery films with needle/injection images, cutting tissue, and spilling blood). For one set of films (constituted by one film from each category) the instruction was to view all sequences for their entire length and for the other set to view the films while tensing the leg muscles. 
Participants were advised that some of the material they were to view would be of a medical nature and would contain scenes with blood and injuries. Participants were also asked to view the film sequences for their entire length and not to close their eyes. However, they were allowed to stop if they felt unable to continue viewing a film. They signaled that to the experimenter by taking out the tube for respiration measurement and raising their right hand. It was stressed, however, that the scientific value of the assessment would be greatest if they continued viewing for as long as possible. The experimenter confirmed that the patients’ eyes were open by observing them through a one-way mirror during the surgery films. Two BII phobia patients refused to view the first surgery film in full (total viewing time 58 and 192 s) and 1 patient refused to watch both surgery films in full (total viewing time 32 and 64 s). For 1 more BII patient, the experimenter stopped the presentation of the second surgery film because of signs of extreme distress. These patients, as well as 2 who were close to fainting, were provided with ample additional recovery time after finishing the ratings. In addition, the experimenter did not initiate the next film presentation before physiological values were well within the initial range and participants indicated feeling sufficiently recovered.
Instructions for Muscle Tension
Leg muscle tension was kept at a subjective level of approximately 30%–50% of the individual’s maximum possible effort. In addition, the legs were to be crossed below the knees and pressed together at the same level of effort. We restricted voluntary contractions to leg muscles to avoid possible interferences of
upper body muscle tension (in particular respiratory muscle tension) with breathing and respiratory resistance measurements.
Physiological Measurements
Oscillatory resistance (Ros, a measure of respiratory impedance, expressed in kPa · l⁻¹ · s) was measured continuously with the single-frequency (10 Hz) forced oscillation technique (Siemens Siregnost FD 5). Participants breathed air through an elastic tube (approximately 80 cm long, 85 ml deadspace; Ritz et al., 2002) using a mouthpiece and nose clip. To reduce shunt characteristics of the upper airways, a padded elastic strap was attached to stabilize the cheeks and base of the mouth. Recordings of Ros were averaged after swallowing artifacts (rapid and brief 0.5–1.5-s increases) had been removed manually in an interactive scoring program. The respiratory pattern was monitored using a thoracic and abdominal pneumatic belt system (James Long Company, Caroga Lake, NY). The system was calibrated using a fixed volume bag (800 ml). Breath-by-breath respiration rate (RR) and tidal volume (VT) were extracted from the calibrated respiration curve, and minute ventilation (V’E) was calculated as RR × VT. End-tidal PCO2 was measured with an infrared capnograph (Datex B, Puritan-Bennett Corporation, San Ramon, CA). Exhaled air was sampled continuously through a plastic tube (1.2 mm diameter) at a flow rate of 150 ml/min. Only breaths with distinct plateaus (Gardner, 1994) were scored by raters blind to the group assignment and experimental sequence. Hyperventilation is known to increase Ros in asthma, although in healthy individuals a certain intensity of hyperventilation (drops of approximately 10 mmHg or more) seems to be needed to affect the airways significantly (van den Elshout et al., 1991). We also counted frequency of sigh breaths/min (breaths 2 times the average VT of the individual). Sighing is thought to be related to hyperventilation (Wilhelm, Trabert, & Roth, 2001), which could increase Ros. However, deep breaths have also been shown to reduce airway tone in health, a mechanism impaired in asthma (e.g., Scichilone et al., 2007). 
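The ventilation measures described above reduce to simple arithmetic on the breath-by-breath series. The following Python sketch is illustrative only and is not the authors' scoring software. The function names are hypothetical, tidal volume in ml is converted to litres so that RR × VT yields l/min, and the sigh criterion is assumed to mean a tidal volume of at least twice the mean VT of the series passed in (the text does not state which reference period defines the individual's average).

```python
from statistics import mean


def minute_ventilation(rr_breaths_per_min, vt_ml):
    """Minute ventilation in l/min from respiration rate (breaths/min) and tidal volume (ml)."""
    return rr_breaths_per_min * vt_ml / 1000.0


def sighs_per_min(tidal_volumes_ml, duration_s, factor=2.0):
    """Sigh frequency per minute.

    Assumption: a sigh is a breath whose tidal volume is at least
    `factor` times the mean tidal volume of the supplied series.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    avg_vt = mean(tidal_volumes_ml)
    n_sighs = sum(1 for vt in tidal_volumes_ml if vt >= factor * avg_vt)
    return n_sighs * 60.0 / duration_s


if __name__ == "__main__":
    # Hypothetical values for illustration, not data from the study.
    print(minute_ventilation(16.0, 425.0))
    print(sighs_per_min([400.0] * 9 + [1400.0], duration_s=30.0))
```

Passing a per-film series versus a whole-session series changes the mean VT and therefore the sigh count, so the reference period should be fixed before comparing conditions.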
For electrocardiogram measurements, three Ag-AgCl electrodes were attached, active ones on the sternum and laterally on the left costal arch at the level of the 10th rib and a ground electrode on the left clavicle. Data were lost for 1 BII patient during one set of films because of a defective lead. The time between successive R-waves of the electrocardiogram was converted to heart rate (HR) and mean HR was calculated across the entire film sequence. In addition, to explore whether HR slowing related to vasovagal responses (Engel, 1978; Graham et al., 1961) had occurred, HRminimum was identified from successive 10-s means throughout each film. Although HR is under both sympathetic and parasympathetic influence and therefore does not allow unambiguous inferences on the underlying autonomic regulation (Berntson, Cacioppo, & Quigley, 1991), HR slowing under BII stimulus exposure is thought to reflect a predominance of vagal excitation. Beat-to-beat systolic blood pressure (SBP) and diastolic blood pressure (DBP) were monitored continuously with a Finapres model 2300 (Ohmeda, Madison, WI). The cuff was fitted to the middle phalanx of the middle finger of the left hand. Due to connector problems, BP was not recorded for 1 patient during the first three films. However, the experimenter estimated average values from a numerical display. A second patient felt highly uncomfortable with the cuff, so the experimenter discontinued
recording. In addition, recordings were incomplete in 2 healthy participants because of cold fingers or equipment malfunction. Muscle tension was monitored by electromyographic activity with two Ag-AgCl electrodes (3 mm inner diameter) from an approximate soleus placement (Basmajian & Blumenstein, 1982) on the right leg. Defective leads led to data loss for 1 BII phobia patient and 3 healthy controls. SCL was measured with two Ag-AgCl electrodes (6 mm contact area diameter, filled with 0.05 M NaCl in Unibase) attached to the thenar and hypothenar eminences of the left hand. Data were lost for 1 control. Biosignals were recorded with a Vitaport 2 digital recorder/analyzer (16-bit A/D converter, 512 Hz sample rate) attached to an IBM-compatible microcomputer (Intel Pentium II processor). For storage, sampling rate was reduced to 256 Hz for the electrocardiogram and plethysmographic pulse wave and to 32 Hz for respiratory signals. Raw EMG was rectified and integrated with a time constant of 62.5 ms. Biosignals were edited for artifacts and analyzed using customized MATLAB software (Gerlach et al., 2006).
Psychological Measures
Following each presentation, participants rated their emotions and symptoms during each film on a list of bipolar and unipolar rating scales. For the current analysis, we focused on dimensional bipolar ratings of pleasantness and arousal from the Self Assessment Manikin (Hodes, Cook, & Lang, 1985). The two scales were scored with 1 assigned to the unpleasant and calm pole and 9 to the pleasant and excited pole. Analysis of unipolar scales was focused on the asthma-relevant symptoms of shortness of breath and chest tightness, both rated on 11-point scales (0 = not at all, 10 = extremely). Ratings were incomplete for 1 asthma patient. 
Before the session, participants also filled out the Medical Fears Survey (MFS; Kleinknecht, Thorndike, & Walls, 1995), which measures the intensity of fear (item scale from 1 to 5, label range from no fear to terror) toward BII situations. Subscales with 10 items each were extracted for Injection & Blood Draws, Sharp Objects, Mutilations, Blood, and Examinations and Symptoms as Intimation of Illness. In addition, to explore potential group differences in habitual affect, we administered the Affect Intensity Measure (Larsen & Diener, 1987) and the Toronto Alexithymia Scale-20 (Bagby, Parker, & Taylor, 1994), the latter with the three subscales for identification of emotions, communication of emotions, and externally oriented thinking. Data from 2 asthma patients, 1 BII patient, and 1 control were incomplete.
Procedure
Laboratory assessments were scheduled in the afternoon. Reliever medication (bronchodilators) was to be withheld for at least 8 h before the laboratory session. Participants viewed films in a sound-attenuated chamber sitting in a comfortable armchair with a TV screen (approximately 40 cm diagonal) at a distance of approximately 1.5 m. The experimenter observed the participants through a one-way mirror from an adjacent room and communicated by intercom. Following sensor attachment and calibration, the first set of five film sequences was shown with the scheduled instruction, viewing only or viewing with muscle tension. The film set order was counterbalanced between groups. Both the order of the film sequences within sets and the assignment of sequences to sets were randomized. Each film presentation was followed by a 1-min recovery, during which recording of physiological parameters continued. Flexible
amounts of time were allowed after each film for ratings of emotions and symptoms, after which the experimenter inquired whether the participant was relaxed and ready to continue with the protocol. After completion of the first set of films, the second set followed with the other of the two instructions, viewing only or viewing with muscle tension. After each surgery film, the experimenter inquired about symptoms suggestive of vasovagal responses (such as dizziness or faintness) and participants’ general emotional state. Flexible recovery time was granted in cases of apparent distress or faintness. In 1 patient who showed extreme distress and signs of faintness following recovery measurements, the laboratory chair was tilted into the horizontal position and legs were elevated above the level of the head by cushions at this point.
Data Analysis
Three-way repeated measures analyses of variance (ANOVAs) with three groups (asthma, blood phobia, control) as the between-individuals variable and two instructions (viewing only vs. film and tension), and five film categories (unpleasant, neutral, pleasant, surgery, and asthma) as within-individual variables were calculated with absolute values of physiological measures as dependent variables. Greenhouse–Geisser corrected degrees of freedom were used when appropriate. For post hoc comparison of means, the Tukey HSD test was used in all ANOVAs (significance level p < .05). For comparing individual film effects within groups, these tests were performed for the average instruction condition effect, whenever no Instruction × Film or Instruction × Film × Group effects were found. The analytic strategy largely followed steps related to the major questions outlined in the introduction.
1. Initially, we explored differences in physiological parameters between the three groups under neutral film viewing only conditions using one-way ANOVAs followed by post hoc tests. 
We expected no group differences for most parameters, except for higher Ros values and potentially elevated ventilation in asthma.
2. We then tested for arousal effects by quadratic trend across negative (average of unpleasant, surgery, and asthma films) versus neutral versus positive films. We expected higher Ros, arousal, and SCL for negative and positive versus neutral films.
3. Next, we tested for specific responding of asthma patients to BII-relevant material, expecting a Film × Group interaction with evidence for stronger Ros responses for surgery films in asthma patients compared to the other groups and compared to other films within asthma patients. For the latter, we calculated a priori contrasts comparing the surgery film with the average of all other films (across conditions) within each group. Under the specificity assumption, this contrast would be significant only in asthma patients. Additional post hoc comparisons were made between the surgery and negative films, which controlled for simple effects of elevated negative affect (which is well known to increase Ros; Ritz, 2004), as well as the asthma-relevant film to control for disease-relevance of the material. Supplementary analysis was also planned to include calculation of effect sizes (g) for surgery film Ros changes (relative to neutral films and an initial 3-min Ros baseline), dependency of Ros change on initial values, and comparison of Ros changes with absolute thresholds for just noticeable obstruction in added
resistive load studies (asthma: 0.076 kPa · l⁻¹ · s, BII phobia and healthy controls: 0.061 kPa · l⁻¹ · s). These criteria were derived from a review of added load studies and were medians of the thresholds found across these studies (Dahme, Richter, & Maß, 1996). Such analysis was informative for learning more about airway responses to BII stimuli and their potential clinical relevance.
4. To explore similarities in response patterns with Ros, additional analyses of other autonomic, respiratory, and self-report parameters (beyond those analyzed in 2) followed within the three-way ANOVA design. No specific hypotheses were held except for HRminimum, which was expected to be particularly low in BII phobia patients during surgery films. All other means comparisons were post hoc. To reduce the number of comparisons for rating scales of emotion and symptoms, we restricted post hoc tests to differences between the BII-relevant, negative, and asthma-relevant films.
5. We then explored effects of muscle tension on Ros and other parameters, expecting significant instruction effects as evidence of general bronchodilation, as well as Instruction × Film effects, which would have indicated particular effectiveness in reducing Ros responses to the surgery film.
6. An additional set of three-way repeated measures ANOVAs was calculated to explore whether film effects on Ros and other parameters lasted into the recovery period. This would potentially inform about more tonic effects of airway constriction that may have greater disease relevance than short-lived changes in activation (Levenson, 1979).
7. To explore associations of airway responses with other physiological and psychological parameters, between-individuals correlations (Spearman’s rho, two-tailed) were calculated between Ros changes on the one hand and ventilatory (RR, VT, V’E, PCO2, sighs), autonomic (HR, HRminimum, SBP, DBP, SCL), emotion, and symptom changes on the other hand (difference scores calculated for negative, positive, surgery, and asthma films, each minus neutral film). To reduce the possibility of Type 1 error, patterns of correlation were interpreted only when one physiological or psychological parameter yielded two or more significant coefficients (out of a possible total of 24 coefficients: 3 Groups × 2 Instructions × 4 Film Change Scores). In addition, for individual participants who showed marked distress and autonomic changes suggestive of vasovagal fainting in response to surgery films, we planned to conduct exploratory within-individual correlation analyses for concurrent and prospective (Lag 1) associations. The focus was on associations of Ros with ventilatory and autonomic parameters across successive 10-s averages during surgery film presentation and recovery. We also calculated within-individual multiple regression analyses controlling for the Ros Lag 1 autocorrelation, which provide a conservative estimate of such within-individual associations.
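The within-individual regression described in the last analysis step can be sketched as an ordinary least-squares model in which Ros at time t is predicted from Ros at t − 1 plus one other parameter. The text does not give the exact model specification, so the Python sketch below is an assumption rather than the authors' procedure: the function name `lagged_association` and its arguments are hypothetical, the inputs are taken to be equally long series of successive 10-s averages, and only a single predictor is modelled at a time.

```python
import numpy as np


def lagged_association(ros, x, prospective=False):
    """Regress Ros[t] on Ros[t-1] and one predictor series.

    prospective=False (concurrent): Ros[t] ~ Ros[t-1] + x[t]
    prospective=True  (Lag 1):      Ros[t] ~ Ros[t-1] + x[t-1]
    Returns the fitted intercept and the two regression slopes.
    """
    ros = np.asarray(ros, dtype=float)
    x = np.asarray(x, dtype=float)
    if ros.shape != x.shape or ros.ndim != 1 or ros.size < 4:
        raise ValueError("need two equally long 1-D series with at least 4 points")
    y = ros[1:]                      # outcome: Ros at time t
    ros_lag1 = ros[:-1]              # control: Ros at time t-1
    predictor = x[:-1] if prospective else x[1:]
    design = np.column_stack([np.ones_like(y), ros_lag1, predictor])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {
        "intercept": float(coef[0]),
        "ros_lag1": float(coef[1]),
        "predictor": float(coef[2]),
    }
```

Including the Lag 1 Ros term means the predictor slope reflects only variation in Ros that is not already explained by its own preceding value, which is why the text calls the estimate conservative.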
Results
Physiological and Psychological Characteristics before the Experiment and during the Neutral Condition
Analysis of the MFS yielded a significant overall group effect, Wilks’ lambda = .234, F(10,58) = 6.19, p < .001, ηp² = .588, with univariate tests showing significance for four of the five subscales (excluding Examinations and Symptoms), F(2,33) = 6.70–36.32,
ps = .004–.001, ηp²s = .289–.688. Post hoc comparisons showed higher values in BII phobia patients than in asthmatic and healthy participants, the latter two not being significantly different from each other. Analysis of neutral film viewing only as baseline showed significantly higher Ros in asthma patients than the other groups, F(2,37) = 6.12, p = .005 (Table 1), which is compatible with the airway disease activity in this group. In addition, they showed a significantly lower PCO2 level than the BII patients, F(2,37) = 4.24, p = .002, with healthy controls taking an intermediary position. None of the other physiological or psychological parameters distinguished the three groups during the neutral viewing condition. A 3-min baseline measurement preceding the film viewing protocol was also available for Ros. Two-way repeated measures ANOVA showed increases from baseline to neutral film Ros, F(1,38) = 4.76, p = .035, ηp² = .111. Values of Ros increased during neutral film for asthma by 6.9% (range –25% to 38%) and for BII phobia by 4.9% (range –14% to 21%), whereas for controls they decreased by 1.2% (range –12% to 11%).
Overall Arousal Effects
As expected, nonspecific arousal effects were demonstrated by significant quadratic trend tests with higher values during the average of negative and positive films compared to the neutral films for Ros, F(1,38) = 11.74, p < .001 (Figure 1, upper panel), SCL, F(1,37) = 6.17, p = .018, and arousal ratings, F(1,38) = 53.37, p < .001. For arousal ratings, an additional interaction of Film × Group was found, F(8,148) = 6.03, p < .001, ε = .84, ηp² = .246; post hoc tests indicated higher values for surgery films compared to asthma-relevant and negative films (as well as all other films) in BII phobia patients (Figure 2, lower panel). For other groups, no difference between disease-relevant and negative films was found. 
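The partial eta squared effect sizes reported with the F tests can be checked from each F value and its degrees of freedom, using the standard identity partial η² = (F · df_effect) / (F · df_effect + df_error). The Python sketch below is a reader's consistency check, not part of the original analysis, and the function name is hypothetical. The uncorrected degrees of freedom printed with each test are used, since the Greenhouse–Geisser correction scales both degrees of freedom by the same factor and leaves the ratio unchanged.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if df_effect <= 0 or df_error <= 0 or f_value < 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Baseline-to-neutral Ros change reported above: F(1,38) = 4.76
    print(round(partial_eta_squared(4.76, 1, 38), 3))   # 0.111
    # Film x Group effect on arousal ratings: F(8,148) = 6.03
    print(round(partial_eta_squared(6.03, 8, 148), 3))  # 0.246
```

Both values agree with the effect sizes reported in the text for these two tests.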
Specificity of Airway Response to BII Stimuli in Asthma
Surgery films had the strongest effect on Ros; however, this effect was not specific to participants with asthma, as the overall Film × Group interaction was not significant (Table 1). Within all groups, surgery film Ros was significantly higher than the average of all other films, F(1,38) = 13.00, 26.41, and 9.54, p = .004, .001, and .015, for asthma, BII phobia, and controls, respectively. In addition, post hoc tests showed higher values for the surgery film compared to asthma-relevant and negative films in all groups. Supplementary analyses are shown in Table 2, with difference scores and percentage change, effect sizes relative to neutral films or initial baseline, as well as the number of participants exceeding typical thresholds for just noticeable differences. On average, Ros increased for surgery films compared to neutral films in asthma by 12.5%, in BII phobia by 21.5%, and in controls by 9.5%. Effect sizes generally were large for BII phobia, small to medium for asthma, and small for healthy controls. Relative to baseline, the average Ros increase to both surgery films was significantly stronger in asthma and BII phobia patients than in healthy controls, but relative to the neutral film, Ros increases differed significantly only between BII phobia patients and healthy controls, with asthmatics taking an intermediate position (one-way ANOVAs, F(2,38) = 4.29 and 3.62, p = .021 and .037, followed by post hoc tests). Correlations of Ros change scores with Ros baseline between and within groups were generally low for all sequences viewing only and viewing
Table 1. Means, SEs (in Parentheses), and ANOVA Effects for Film Presentation in Asthma Patients (n = 14–15), BII Phobia Patients (n = 11–12), and Healthy Controls (n = 12–14)

| Parameter | Group | Pleasant | Neutral | Unpleasant | Surgery | Asthma |
|---|---|---|---|---|---|---|
| Ros (kPa · l⁻¹ · s) | Asthma | 0.482 (0.284) | 0.470 (0.303) | 0.502 (0.315) | 0.522 (0.328) | 0.484 (0.307) |
| | BII phobia | 0.327 (0.318) | 0.319 (0.339) | 0.344 (0.352) | 0.389 (0.367) | 0.324 (0.343) |
| | Controls | 0.342 (0.294) | 0.353 (0.314) | 0.353 (0.326) | 0.381 (0.340) | 0.341 (0.318) |
| RR (breaths/min) | Asthma | 16.3 (0.75) | 15.5 (0.76) | 17.6 (0.85) | 16.3 (0.94) | 15.5 (0.80) |
| | BII phobia | 17.6 (0.84) | 16.0 (0.85) | 17.1 (0.95) | 16.8 (1.05) | 17.2 (0.90) |
| | Controls | 16.0 (0.78) | 15.9 (0.79) | 17.6 (0.88) | 15.7 (0.97) | 16.1 (0.82) |
| VT (ml) | Asthma | 421.6 (65.0) | 428.8 (60.2) | 397.1 (58.5) | 423.8 (76.9) | 429.0 (65.4) |
| | BII phobia | 435.7 (72.7) | 419.1 (67.3) | 435.7 (65.5) | 614.0 (86.0) | 423.3 (73.2) |
| | Controls | 472.0 (67.3) | 442.3 (62.3) | 437.6 (60.6) | 474.7 (79.6) | 478.3 (67.7) |
| V’E (l/min) | Asthma | 6.8 (0.95) | 6.4 (0.95) | 6.9 (1.03) | 6.6 (1.10) | 6.6 (0.95) |
| | BII phobia | 7.1 (1.10) | 6.8 (1.10) | 7.1 (1.15) | 10.1 (1.23) | 7.4 (1.06) |
| | Controls | 7.1 (1.06) | 6.8 (0.99) | 7.3 (1.06) | 7.0 (1.14) | 7.2 (0.98) |
| Sighs (freq/min) | Asthma | 0.14 (0.055) | 0.16 (0.077) | 0.20 (0.065) | 0.18 (0.103) | 0.09 (0.040) |
| | BII phobia | 0.33 (0.061) | 0.28 (0.086) | 0.39 (0.072) | 0.67 (0.115) | 0.23 (0.044) |
| | Controls | 0.07 (0.057) | 0.14 (0.080) | 0.11 (0.067) | 0.13 (0.107) | 0.04 (0.041) |
| PCO2 (mmHg) | Asthma | 33.3 (1.68) | 33.8 (1.60) | 33.0 (1.59) | 33.2 (1.72) | 33.4 (1.64) |
| | BII phobia | 39.1 (1.82) | 39.9 (1.72) | 38.0 (1.72) | 35.3 (1.86) | 38.4 (1.77) |
| | Controls | 37.1 (1.68) | 37.4 (1.60) | 35.9 (1.59) | 37.6 (1.78) | 36.2 (1.64) |
| HR (beats/min) | Asthma | 73.8 (2.17) | 74.0 (2.17) | 72.9 (2.29) | 73.4 (2.45) | 73.7 (2.33) |
| | BII phobia | 72.0 (2.54) | 71.1 (2.53) | 70.0 (2.67) | 75.5 (2.86) | 72.1 (2.72) |
| | Controls | 74.2 (2.25) | 74.9 (2.24) | 73.0 (2.37) | 72.4 (2.54) | 73.4 (2.41) |
| HRmin (beats/min) | Asthma | 68.2 (2.20) | 67.8 (2.14) | 66.7 (2.23) | 66.9 (2.31) | 68.1 (2.31) |
| | BII phobia | 62.7 (2.57) | 64.5 (2.50) | 63.6 (2.61) | 67.3 (2.70) | 65.5 (2.70) |
| | Controls | 66.8 (2.28) | 68.9 (2.22) | 66.0 (2.31) | 65.0 (2.39) | 67.0 (2.39) |
| SBP (mmHg) | Asthma | 148.9 (4.53) | 146.7 (5.27) | 149.5 (4.70) | 147.4 (5.17) | 149.7 (4.70) |
| | BII phobia | 147.0 (5.29) | 149.7 (6.15) | 144.7 (5.48) | 153.5 (6.03) | 148.5 (5.49) |
| | Controls | 148.3 (5.06) | 141.8 (5.89) | 143.7 (5.29) | 142.9 (5.78) | 144.4 (5.26) |
| DBP (mmHg) | Asthma | 85.4 (3.04) | 83.7 (3.52) | 85.8 (3.52) | 85.4 (3.45) | 85.9 (3.70) |
| | BII phobia | 83.0 (3.55) | 86.2 (4.12) | 82.3 (4.11) | 87.0 (4.03) | 84.7 (4.32) |
| | Controls | 79.7 (3.40) | 76.8 (3.94) | 77.0 (3.94) | 77.8 (3.86) | 77.8 (4.14) |
| SCL (µS) | Asthma | 11.9 (2.68) | 11.9 (2.59) | 12.1 (2.82) | 12.5 (2.83) | 12.4 (2.86) |
| | BII phobia | 14.1 (3.13) | 13.7 (3.03) | 14.7 (3.29) | 16.5 (3.30) | 15.2 (3.34) |
| | Controls | 12.0 (2.88) | 11.7 (2.78) | 11.9 (3.03) | 12.3 (3.03) | 12.6 (3.07) |

| Parameter | Film effect | Group effect | Film × Group effect |
|---|---|---|---|
| Ros | F(4,152) = 24.59, p < .001, ε = .66, ηp² = .393 | F(2,38) = 7.29, p = .002, ηp² = .277 | F(8,152) = 1.55, p = .178, ε = .66, ηp² = .075 |
| RR | F(4,152) = 7.31, p < .001, ε = .68, ηp² = .161 | F(2,38) = 0.56, p = .946, ηp² = .003 | F(8,152) = 1.08, p < .001, ε = .68, ηp² = .054 |
| VT | F(4,152) = 9.19, p < .001, ε = .51, ηp² = .195 | F(2,38) = 0.13, p = .883, ηp² = .007 | F(8,152) = 5.58, p < .001, ε = .51, ηp² = .227 |
| V’E | F(4,152) = 8.91, p < .001, ε = .51, ηp² = .190 | F(2,38) = 0.26, p = .769, ηp² = .014 | F(8,152) = 8.47, p < .001, ε = .51, ηp² = .380 |
| Sighs | F(4,152) = 5.09, p = .003, ε = .69, ηp² = .118 | F(2,38) = 7.47, p = .002, ηp² = .282 | F(8,152) = 2.23, p = .051, ε = .69, ηp² = .105 |
| PCO2 | F(4,148) = 7.98, p < .001, ε = .65, ηp² = .177 | F(2,37) = 2.26, p = .119, ηp² = .109 | F(8,148) = 6.49, p < .001, ε = .65, ηp² = .260 |
| HR | F(4,148) = 3.47, p = .022, ε = .69, ηp² = .086 | F(2,37) = 0.11, p = .898, ηp² = .006 | F(8,148) = 4.59, p < .001, ε = .69, ηp² = .199 |
| HRmin | F(4,148) = 2.88, p = .047, ε = .65, ηp² = .072 | F(2,37) = 0.36, p = .698, ηp² = .019 | F(8,148) = 4.39, p < .001, ε = .65, ηp² = .191 |
| SBP | F(4,140) = 0.82, p = .493, ε = .80, ηp² = .023 | F(2,35) = 0.24, p = .789, ηp² = .013 | F(8,140) = 1.94, p = .076, ε = .80, ηp² = .100 |
| DBP | F(4,140) = 0.46, p = .713, ε = .84, ηp² = .013 | F(2,35) = 1.39, p = .264, ηp² = .073 | F(8,140) = 0.88, p = .519, ε = .84, ηp² = .048 |
| SCL | F(4,144) = 5.42, p < .001, ε = .81, ηp² = .131 | F(2,36) = 0.26, p = .775, ηp² = .014 | F(8,144) = 1.43, p = .206, ε = .81, ηp² = .074 |

Note: Ros: respiratory resistance; RR: respiration rate; VT: tidal volume; V’E: minute ventilation; PCO2min: minimum partial pressure of carbon dioxide; HR: heart rate; HRmin: heart rate minimum; SBP: systolic blood pressure; DBP: diastolic blood pressure; SCL: skin conductance level.
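The partial eta squared values in Table 1 can be cross-checked against the reported F ratios and degrees of freedom, because ηp² = (F · df_effect) / (F · df_effect + df_error). The minimal Python sketch below does this for the three Ros effects. The function names are hypothetical, and treating ε as a multiplier applied to both degrees of freedom (a Greenhouse-Geisser-type correction) is an assumption about how the correction was used, not something the table states.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    ss_ratio = f_value * df_effect  # equals SS_effect / MS_error
    return ss_ratio / (ss_ratio + df_error)


def epsilon_adjusted_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by a sphericity correction factor (assumed usage)."""
    return df_effect * epsilon, df_error * epsilon


# Ros effects as reported in Table 1: (F, df_effect, df_error)
ROS_EFFECTS = {
    "film": (24.59, 4, 152),
    "group": (7.29, 2, 38),
    "film_x_group": (1.55, 8, 152),
}

for name, (f_value, df1, df2) in ROS_EFFECTS.items():
    print(name, round(partial_eta_squared(f_value, df1, df2), 3))
```

Rounded to three decimals, the three results match the .393, .277, and .075 reported for Ros.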
[Figure 1 panels: "Oscillatory resistance" (upper) and "Oscillatory resistance (recovery)" (lower). Y-axis: kPa · l⁻¹ · s. X-axis: Baseline, Negative, Neutral, Positive, Surgery, Asthma. Groups: Asthma, Blood phobia, Control.]

Figure 1. Ros during (upper panel) and following (lower panel; 1-min recovery) emotion and disease-relevant films.

[Figure 2 panels: "Pleasantness" (upper) and "Arousal" (lower). Y-axis: Ratings (1–9). X-axis: Negative, Neutral, Positive, Surgery, Asthma. Groups: Asthma, Blood phobia, Control.]

Figure 2. Impact of emotion and disease-relevant films on experienced affective valence (upper panel) and arousal (lower panel).
with tension, the number of significant associations not exceeding chance level.

Additional Film Effects on Ventilation, Autonomic Parameters, and Self-Report of Emotion and Symptoms

Ventilation. RR yielded a significant film effect (Table 1), with the post hoc testing indicating higher values for positive films compared to all other films. For VT, V’E, and PCO2, the Film × Group interactions were significant, whereas for sighs it just missed significance. These interactions mostly reflected strong increases in ventilation and sighing and a reduction in PCO2 during surgery films in BII phobia patients, as demonstrated by post hoc tests that showed significantly higher VT and V’E, more sighing, and lower PCO2 for BII patients during surgery films than during any other films or during surgery films in other groups.

Cardiovascular activity. Mean HR yielded a significant Film × Group interaction, with post hoc tests showing higher HR in the BII phobia group during surgery as compared to all other films (Table 1). Similarly, HRminimum showed a significant Film × Group interaction. The a priori contrast test for the surgery film against all other films was significant in both BII phobia patients and healthy controls, F(1,37) = 7.35 and 4.15, p = .010 and .049, in the former because of higher values and in the latter because of lower values for the surgery film. For SBP, the Film × Group effect was only borderline significant.

Additional effects on self-report of emotion and symptoms. Pleasantness yielded a strong effect for films, F(4,144) = 46.13, p < .001, ε = .66, ηp² = .562, with a linear increase in pleasantness from negative to neutral to positive films, linear trend F(1,35) = 164.0, p < .001 (Figure 2, upper panel). In addition, the interaction of Film × Group was significant, F(8,144) = 3.98, p < .001, ε = .86, ηp² = .181. Post hoc tests did not show differences between negative, surgery, and asthma-related films, but blood phobia patients showed lower values in pleasantness during surgery films compared to other films. Both shortness of breath and chest tightness showed significant film effects, F(4,148) = 19.40 and 9.19, ps < .001, ε = .71 and .64, ηp² = .344 and .199, respectively, as well as Film × Group interactions, F(8,148) = 9.92 and 3.80, p < .001 and p = .003, ε = .71 and .64, ηp² = .349 and .170, respectively. This was because of higher ratings of these symptoms for surgery films than
Table 2. Indices of Ros Change during Surgery Films Relative to Neutral Films or Baseline in Asthma Patients, BII Phobia Patients, and Healthy Controls

| Group | Condition | Reference: neutral film, Δ | % | n(t) | g | Reference: baseline, Δ | % | n(t) | g |
|---|---|---|---|---|---|---|---|---|---|
| Asthma | Viewing | 0.060 | 13.4 | 5 | 0.34 | 0.092 | 21.5 | 8 | 0.61 |
| | Viewing & tension | 0.045 | 11.7 | 5 | 0.27 | 0.094 | 23.2 | 8 | 0.64 |
| BII phobia | Viewing | 0.077 | 23.4 | 8 | 1.26 | 0.092 | 29.7 | 8 | 1.49 |
| | Viewing & tension | 0.062 | 19.6 | 6 | 1.02 | 0.076 | 24.7 | 8 | 1.35 |
| Control | Viewing | 0.031 | 11.4 | 2 | 0.25 | 0.027 | 10.6 | 2 | 0.22 |
| | Viewing & tension | 0.024 | 7.6 | 2 | 0.19 | 0.049 | 17.4 | 4 | 0.39 |

Note: Δ: difference score; %: percent change; n(t): number of participants with Δ exceeding just noticeable difference threshold; g: within-individual effect size.
any other film, whereas no differences between disease-relevant and negative films were found for the other groups.

Effects of Leg Muscle Tension

Leg EMG was significantly elevated during films with instructions to tense leg muscles (Table 3). Leg muscle tension did not change Ros substantially. RR, VT, and V’E were higher during viewing with leg muscle tension than during viewing only. Also, HR and HRminimum, SBP, DBP, and SCL were higher during leg muscle tension. For arousal ratings, a significant Instruction × Group effect, F(1,37) = 3.29, p = .048, ηp² = .151, was due to higher values during leg muscle tension in blood phobia patients and controls but lower values in asthma patients. None of the parameters yielded a three-way interaction of Instruction × Film × Group.

Replication of Film and Instruction Effects during Recovery

During 1-min recovery, Ros was still higher for the surgery film than any other film for all three groups (Figure 1, lower panel), film effect F(4,152) = 8.19, p < .001, ε = .81, ηp² = .177; all post hoc tests were significant. VT, V’E, and PCO2 still yielded significant Group × Film interactions, F(8,152) = 6.96, 2.34, and 7.49, p < .001, .049, and .001, ε = .58, .61, and .57, ηp² = .146, .110, and .294, which were mainly because of continued hyperventilation in BII phobia patients, with post hoc tests showing differences between surgery and most other films in BII phobia or surgery films in other groups. Similarly, post hoc tests showed that mean HR was still elevated in BII phobia patients during recovery from surgery films, Film × Group effect F(8,148) = 2.63, p = .010, ε = .84, ηp² = .124, as was SBP, Film × Group effect F(8,140) = 2.79, p = .016, ε = .72, ηp² = .137. Instruction effects were still seen in recovery for HR and SCL, F(1,37) = 10.45 and 5.95, p = .003 and .020, ηp² = .220 and .203, as well as VT, F(1,38) = 6.37, p = .016, ηp² = .144, V’E, F(1,38) = 12.24, p < .001, ηp² = .203, and DBP, F(1,35) = 8.91, p = .005, ηp² = .203, with higher values for the viewing with leg muscle tension condition.

Autonomic, Ventilatory, and Self-Report Correlates of Airway Responses

Between-individual correlations for Ros with other parameters. Airway responses (Ros change for emotional or disease-relevant minus neutral film) were positively associated with V’E for blood phobia patients during the surgery film with tension, r(12) = .74, p = .006, and during the asthma film viewing only, r(12) = .68, p = .015. Ros change was also positively associated with change in SCL for asthma patients during negative films both during viewing only and viewing with tension, r(14) = .64 and .56, p = .010 and .030, respectively, as well as for the positive film viewing only, r(14) = .65, p = .009. For BII phobia patients, SCL change was positively associated with Ros change for the surgery film viewing only, r(12) = .64, p = .026, but negatively associated during the asthma film with tension, r(12) = −.62, p = .003. Overall, the number of significant and almost significant (p < .10) coefficients for physiological parameters was 6% and 8% of the total number of calculated coefficients. For correlations of changes in dimensional mood and symptom ratings with Ros changes, only one coefficient was significant across all three groups and all films, which may have been due to chance.

Oscillatory resistance during vasovagal episodes: within-individual associations for two cases. One female BII phobia participant showed massive signs of distress and nausea during the surgery film with tension. Her HR increased and fluctuated markedly from initially 96 to 115 bpm after 80 s into the film, and her SBP showed an increase from 128 to 150 mmHg after
Table 3. Means and Standard Deviations of Physiological Parameters during the Average Film Viewing versus Film Viewing with Tension Conditions in Asthma, BII Phobia, and Control Participants

| Parameter | Asthma: Viewing | Asthma: Viewing & tension | BII phobia: Viewing | BII phobia: Viewing & tension | Healthy: Viewing | Healthy: Viewing & tension | Instruction effect, p level (a) | ηp² (a) |
|---|---|---|---|---|---|---|---|---|
| Ros (kPa · l⁻¹ · s) | 0.492 (1.18) | 0.492 (1.18) | 0.344 (1.18) | 0.337 (1.18) | 0.345 (1.18) | 0.362 (1.18) | .544 | .010 |
| EMGleg (µV) | 3.63 (2.93) | 5.56 (2.95) | 4.10 (2.93) | 6.04 (2.95) | 3.67 (2.93) | 4.65 (2.95) | .001 | .316 |
| RR (br/min) | 15.6 (2.95) | 16.8 (3.07) | 15.9 (2.94) | 17.3 (3.06) | 16.1 (2.95) | 16.5 (3.07) | .001 | .250 |
| VT (ml) | 414.7 (243.8) | 425.5 (250.7) | 437.2 (243.8) | 479.8 (249.4) | 433.3 (244.0) | 488.6 (249.6) | .001 | .300 |
| V’E (l/min) | 6.21 (3.58) | 7.04 (4.00) | 7.02 (3.58) | 8.35 (4.00) | 6.53 (3.58) | 7.59 (4.00) | .001 | .534 |
| PCO2 (mmHg) | 33.3 (5.89) | 33.4 (6.09) | 37.7 (6.02) | 38.6 (6.09) | 36.9 (6.02) | 36.8 (6.10) | .223 | .040 |
| HR (bpm) | 71.9 (8.74) | 75.2 (8.82) | 70.3 (8.76) | 73.9 (8.86) | 71.7 (8.75) | 75.4 (8.83) | .001 | .503 |
| HRminimum (bpm) | 66.1 (8.4) | 69.0 (8.8) | 63.0 (8.4) | 66.5 (8.8) | 65.0 (8.4) | 68.5 (8.8) | .001 | .464 |
| SBP (mmHg) | 143.1 (15.98) | 153.7 (21.59) | 146.0 (16.04) | 151.3 (21.64) | 142.0 (16.02) | 146.5 (21.59) | .004 | .223 |
| DBP (mmHg) | 82.6 (12.46) | 87.9 (13.85) | 81.9 (12.48) | 87.4 (13.88) | 76.6 (12.46) | 79.0 (13.87) | .006 | .199 |
| SCL (µS) | 11.8 (9.5) | 12.5 (11.8) | 14.4 (9.5) | 15.3 (11.8) | 10.7 (9.5) | 13.5 (11.8) | .009 | .173 |

Note: Cell entries are M (SD).
(a) Significance levels and ηp² refer to F tests for ANOVA main effects of instruction (film viewing vs. film viewing with tension); df: 1,34–38.
[Figure 3 panels. Upper: "Oscillatory resistance and respiration during surgery film, BII phobia patient #175"; series: oscillatory resistance, tidal volume, respiration rate; y-axis: Ros (kPa · l⁻¹ · s), VT (liter), and RR ((breaths/min)/100); x-axis: consecutive intervals 1–22. Lower: "Blood pressure and heart rate during surgery film, BII phobia patient #175"; series: systolic blood pressure, diastolic blood pressure, heart rate; y-axis: HR (bpm), SBP and DBP (mmHg); x-axis: consecutive intervals 1–22.]

Figure 3. Autonomic and respiratory response (in consecutive 10-s means) of the 1 healthy control who felt close to fainting during one surgery film (viewing only). Upper panel: respiratory resistance and breathing pattern. Lower panel: heart rate and blood pressure. Break with vertical line indicates the beginning of recovery. The break before that in Ros indicates data loss by repeated swallowing.
60 s into the film, fluctuated until 110 s, and then gradually fell again (Figure 3). The experimenter terminated the surgery film presentation after 170 s. Following recovery, she reported feeling faint and was brought into the horizontal position. Ros increased markedly during the film from its initial lowest level at 0.345 kPa · l⁻¹ · s to 0.573 kPa · l⁻¹ · s (66% increase), with both VT and RR varying between very high and low values. One healthy male participant showed a steady drop in SBP and fluctuations in HR during the surgery films (viewing only), which continued into the recovery (HR from 61 to 48 bpm, SBP from 123 to 94 mmHg; Figure 4). He subsequently reported having been close to fainting. Ros increased gradually from the lowest initial level of 0.241 kPa · l⁻¹ · s to 0.382 kPa · l⁻¹ · s (59% increase) in the first half of the film and then dropped back to initial levels. At the same time, VT increased massively and RR dropped. At the end of the recovery, Ros was again markedly increased to 0.473 kPa · l⁻¹ · s (96% increase).
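The percent increases quoted for the two cases follow directly from the initial and later Ros values. As a minimal sketch (the function name and dictionary labels are illustrative, not taken from the study), the arithmetic is:

```python
def percent_increase(initial, later):
    """Percent change of `later` relative to `initial`."""
    return 100.0 * (later - initial) / initial


# Ros values (kPa · l⁻¹ · s) quoted in the two case descriptions
CASES = {
    "BII phobia participant, during film": (0.345, 0.573),
    "healthy participant, first half of film": (0.241, 0.382),
    "healthy participant, end of recovery": (0.241, 0.473),
}

for label, (initial, later) in CASES.items():
    print(f"{label}: {percent_increase(initial, later):.0f}% increase")
```

Rounded to whole percentages, the results are 66%, 59%, and 96%, in line with the values given in the text.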
Table 4 shows within-individual Pearson correlations across successive 10-s intervals for both participants. Negative concurrent associations of VT and SCL with Ros and a negative prospective association of HR Lag 1 with Ros were found for the healthy participant. In within-individual multiple regression analyses controlling for the Ros Lag 1 autocorrelation, these concurrent and prospective associations were still significant. For the BII phobia patient, Ros was concurrently associated with HR and SBP and prospectively with DBP. These associations did not remain significant when accounting for the Ros Lag 1 autocorrelation.
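The concurrent and prospective associations described above can be illustrated with a small sketch. It pairs a predictor at interval t with Ros at interval t + 1 and then removes the share explained by Ros at interval t through a first-order partial correlation. This is a swapped-in simplification: the study reports multiple regression analyses, and for a single predictor plus the lagged Ros term the partial correlation addresses a closely related question. All names are hypothetical, and the two series are made-up numbers, not the participants' data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lag1_pairs(predictor, outcome):
    """Pair the predictor at interval t with the outcome at interval t + 1."""
    return predictor[:-1], outcome[1:]


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def lag1_association(predictor, ros):
    """Return the raw lag-1 correlation and the one adjusted for Ros at t."""
    x, y = lag1_pairs(predictor, ros)  # predictor(t), Ros(t + 1)
    z = ros[:-1]                       # Ros(t)
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(z, y)               # lag-1 autocorrelation of Ros
    return r_xy, partial_r(r_xy, r_xz, r_yz)


# Made-up 10-s means for illustration only
HR_EXAMPLE = [61, 60, 58, 59, 55, 52, 50, 48]
ROS_EXAMPLE = [0.24, 0.26, 0.27, 0.31, 0.30, 0.35, 0.38, 0.37]

raw, adjusted = lag1_association(HR_EXAMPLE, ROS_EXAMPLE)
print(f"lag-1 r = {raw:.2f}, adjusted for Ros(t) = {adjusted:.2f}")
```

With trending series like these, the raw lagged correlation can be large simply because both signals drift over time, which is why an adjustment for the Ros autocorrelation is informative.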
Discussion

Bronchoconstriction in Response to Blood and Injury Stimuli

In this study, we found that the airways responded particularly strongly to film presentation of blood and injury stimuli. Surgery films also elicited stronger responses than asthma-related film
[Figure 4 panels. Upper: "Oscillatory resistance and respiration during surgery film, healthy participant #189"; series: oscillatory resistance, tidal volume, respiration rate; y-axis: Ros (kPa · l⁻¹ · s), VT (ml), and RR ((breaths/min)/100); x-axis: consecutive intervals 1–37. Lower: "Blood pressure and heart rate during surgery film, healthy participant #189"; series: systolic blood pressure, diastolic blood pressure, heart rate; y-axis: HR (bpm), SBP and DBP (mmHg); x-axis: consecutive intervals 1–37.]

Figure 4. Autonomic and respiratory response (in consecutive 10-s means) of the 1 BII phobia patient who felt close to fainting during one surgery film (viewing with tension). Upper panel: respiratory resistance and breathing pattern. Lower panel: heart rate and blood pressure. Break with vertical line indicates the beginning of recovery.
sequences in asthma patients and were the only film category that elicited resistance increases that lasted well into the 1-min recovery period. However, the exaggerated airway responses were not limited to asthma patients. BII patients, who showed strong emotional arousal to the surgery films, showed similarly strong airway responses. This observation extends our earlier findings, which had only suggested a strong response of asthma patients to BII-related stimuli (Ritz, Steptoe, et al., 2000). In fact, expressed in overall percent from neutral film presentation levels, BII phobia patients exhibited the strongest increase (>20%) in Ros. We also found particularly strong increases in two individual participants (one of whom was a healthy control) who were close to fainting during one of the surgery films. In both cases, the time course of Ros showed a gradual increase from initial levels that reached a high plateau (approximately 60% increase from initial values) 70 to 120 s into the film. These findings add to earlier speculations about potential similarities in autonomic regulation between asthma and blood phobia (Graham et al., 1961; Knapp & Nemetz, 1960; Lehrer et al., 1993). Our third group, healthy controls, also showed stronger Ros increases to the surgery films than to any other film category, suggesting a nonspecific responding of the airways to this type of stimulus.

Table 4. Concurrent and Prospective Within-Individual Associations of Autonomic and Respiratory Parameters with Ros for Two Participants Close to Fainting during One Surgery Film

| Participant | Association | RR | VT | V’E | PCO2 | HR | SBP | DBP | SCL |
|---|---|---|---|---|---|---|---|---|---|
| Participant 189 (healthy control) | Concurrent | .15 | −.35 | −.27 | .11 | −.01 | .15 | .29 | −.39 |
| | Lag 1 | .07 | −.04 | .03 | −.26 | −.41 | .07 | .02 | −.27 |
| Participant 175 (blood phobia) | Concurrent | .17 | −.07 | .20 | −.19 | .46 | .61 | .44 | .22 |
| | Lag 1 | −.30 | .01 | −.19 | −.05 | .26 | .44 | .47 | −.07 |

Note: Numbers in bold: p < .05 (note that correlations are within individuals).
Self-report only partly reflected the findings in Ros: Although BII patients rated unpleasantness and arousal significantly higher for the surgery film than for other films, this was not the case for the asthma patients and healthy controls. Thus, BII-related material is a particularly potent bronchoconstrictor, which is not necessarily reflected in elevated individual distress levels. However, when self-reported or behavioral distress (fainting behavior) is specifically high with this type of stimulus, as in our BII phobia group, airway constriction is potentiated.

In this study, we were not able to determine the contribution of a hyperresponsiveness of the airways to cholinergic agents in asthma (Barnes, 1986). However, our finding of equally strong airway responses in BII phobia patients with healthy airways suggests that vagal excitation rather than sensitivity at the end-organ level determined the outcome in Ros. Alternatively, vagal excitation may have dominated in BII phobia patients, whereas airway hyperreactivity may have been the factor in asthma patients. The former interpretation is supported by another recent study in which we did not find an association between airway hyperresponsiveness to methacholine and airway constriction to the same BII stimuli in asthmatic individuals, but were able to attenuate the airway response by cholinergic blockade with ipratropium bromide (Ritz et al., 2010).

The present study is unique in demonstrating that a particular type of stimulus material elicits direct, immediate, strong, and at least 1-min-long effects on airway obstruction, a major clinical end point in asthma diagnosis. The effect sizes of airway responses to the surgery films were comparable to or greater than those observed in studies using the forced oscillation technique with varying stimulus materials (Levenson, 1979; McQuaid et al., 2000; Ritz, 2004).
Compared to the typical absolute threshold for the perception of airway obstruction from added resistive load studies (Dahme et al., 1996), 33% of our asthma patients, 58% of BII phobia patients, and 14% of healthy controls experienced resistance increases during the average surgery film that were of potential clinical relevance. These numbers were even larger when the preexperimental baseline, rather than the neutral film condition, was taken as a reference (53%, 67%, and 21%, respectively).¹ Such resistance changes in the range of just noticeable differences did not necessarily translate into typical asthma symptoms, such as shortness of breath, in our study. However, in the daily life of symptomatic asthma patients, such effects could aggravate bronchoconstriction elicited by other asthma triggers and bring these patients closer to airway obstruction levels typical for asthma exacerbations. Patients with comorbid asthma and BII phobia might be at particular risk, especially considering the impact of BII-related stimuli on airways that are already preconstricted by other triggers. Although there is evidence for an elevated prevalence of psychiatric comorbidity, including anxiety disorders, in asthma and an associated greater health care burden (Goodwin, 2003), direct effects of specific anxiety disorders on the airway response to thematically relevant stress have rarely been explored.

¹ Whereas the comparison of an emotional film with the neutral films provided the effect of the emotion minus the nonspecific effect of film viewing, the comparison with baseline informed about the total effect of viewing emotional films relative to a state of relaxed inactivity. Thus, researchers studying the differential impact of specific emotional states on the airways would find the comparison with the neutral film most relevant, whereas those interested in the total effect of a daily life activity such as watching an emotional film would find the comparison with the baseline most relevant.
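The percentages of participants exceeding the perception threshold can be traced back to the n(t) counts in Table 2. The sketch below is one plausible reconstruction, resting on two labeled assumptions: the counts for the viewing and viewing-with-tension conditions are averaged to represent "the average surgery film", and group sizes are taken as the upper ends of the ranges given in Table 1 (15, 12, and 14). The names are hypothetical.

```python
# Assumed group sizes: upper ends of the n ranges reported in Table 1
GROUP_N = {"asthma": 15, "bii_phobia": 12, "control": 14}

# n(t) counts from Table 2 as (viewing, viewing & tension)
N_EXCEEDING = {
    "neutral": {"asthma": (5, 5), "bii_phobia": (8, 6), "control": (2, 2)},
    "baseline": {"asthma": (8, 8), "bii_phobia": (8, 8), "control": (2, 4)},
}


def percent_exceeding(reference):
    """Percent of each group above threshold, averaged over both conditions."""
    result = {}
    for group, counts in N_EXCEEDING[reference].items():
        mean_count = sum(counts) / len(counts)  # assumption: simple average
        result[group] = 100.0 * mean_count / GROUP_N[group]
    return result


for reference in ("neutral", "baseline"):
    rounded = {g: round(v) for g, v in percent_exceeding(reference).items()}
    print(reference, rounded)
```

Under these assumptions the rounded values are 33%, 58%, and 14% for the neutral film reference and 53%, 67%, and 21% for the baseline reference, the same figures quoted in the text.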
Autonomic and Ventilatory Correlates of Airway Responses

A multitude of psychosocial pathways to asthma have been considered to date, including direct effects on airway pathophysiology through autonomic nervous system regulation (e.g., Isenberg et al., 1992; Lehrer et al., 1996; Ritz, Steptoe, et al., 2000), ventilation (e.g., Clarke, 1982; Ritz et al., 2008), inflammatory processes (e.g., Joachim et al., 2003; Kullowatz et al., 2008; Liu et al., 2002), endocrine pathways (Buske-Kirschbaum et al., 2003; Wamboldt, Laudenslager, Wamboldt, Kelsay, & Hewitt, 2003), modification of receptor gene expression (Miller & Chen, 2006), and upper respiratory tract infection (Wright, Rodriguez, & Cohen, 1998), as well as indirect effects through asthma management (e.g., Feldman et al., 2005; Kaugars, Klinnert, & Bender, 2004). These pathways vary greatly with respect to the timing of their impact on asthma-relevant outcome measures. In the present study, we explored a number of ventilatory and autonomic parameters that might have provided clues to mechanisms involved in the fast-onset airway responses observed.

In general, Ros increases followed an arousal pattern similar to that of SCL and V’E, with emotional and disease-relevant films significantly exceeding neutral film levels. This is in line with prior studies that demonstrated arousal modulation of respiratory resistance increases (e.g., Ritz, George, et al., 2000; Ritz, Steptoe, et al., 2000; von Leupoldt & Dahme, 2005), V’E (Gomez, Zimmermann, Guttormsen-Schär, & Danuser, 2005), and skin conductance (e.g., Lang et al., 1993; Winton, Putnam, & Krauss, 1984). V’E and SCL were also found to be associated consistently with Ros changes in correlational analyses, particularly in asthma and BII phobia patients who responded with stronger bronchoconstriction. Participants showing stronger overall Ros increases also showed significantly stronger increases in SCL and V’E in more than one film presentation.
Despite the consistencies, the sign of the associations varied across between-individual and within-individual correlations. In particular, within-individual correlations also yielded negative associations of SCL and V’E with Ros. Whereas a positive association between Ros and V’E could be interpreted as coupling between respiratory drive and resistance (Baker & Don, 1988) or increases in ventilation affecting the airways negatively (and thus increasing Ros by drying, cooling, or irritation), it is more difficult to conceptualize the association between Ros and SCL in terms of underlying physiological mechanisms. Although common cholinergic neurotransmission and a potential alteration of this system by allergic processes could be invoked as an explanation (Kaliner, 1976; Lehrer et al., 1996; Marshall, 1989), the seemingly divergent directions of autonomic activation may simply be another instance of fractionation of response direction discussed previously by psychophysiologists (Lacey & Lacey, 1974). They could be part of an integrated pattern of response to environmental challenge that does not follow the traditional view of reciprocity in sympathetic and parasympathetic activity but may be of functional significance (e.g., defensive protection of the airways and skin). In any case, our findings encourage further exploration of the link between these two activation parameters at two different organ sites to improve our understanding of stress-related airway constriction. It is still possible that the positive between-individual associations of Ros with SCL and V’E were because the induced stress raised levels of all three through separate mechanisms.

Vagal excitation has long been thought to be the major mechanism of airway constriction to psychological stimuli (Boushey, 1981; Isenberg et al., 1992; Miller & Wood, 2003; Miller, Wood, Lim, Ballow, & Hsu, 2009). Studies with pharmacological blockade have demonstrated involvement of the vagal pathway in airway obstruction when suggestions of bronchoconstriction were made to experimental participants (McFadden, Luparello, Lyons, & Bleeker, 1969) and more recently in our research, when film or picture stimuli were presented (Ritz et al., 2010). However, the present findings do not suggest a strong overall vagal excitation during surgery films, which would have been indexed by a pronounced bradycardia. Only healthy controls showed lower minimum HR during surgery films than all other films. In none of the groups was participants’ minimum HR related to their extent of airway obstruction.²

At least three factors could have contributed to the lack of association between Ros and HR. First, vagal components may indeed not have been very substantial, at least in BII phobia and asthma patients, during the 5-min surgery films. Only healthy controls showed significant decreases in minimum HR during the surgery films, whereas asthma patients showed only nonsignificant decreases. Longer exposure may be necessary to observe a more solid eventual transition into vagal excitation, particularly in BII phobia patients. These patients showed mainly pronounced SBP, DBP, and HR increases during the surgery films similar to a classic sympatho-adrenergic anxiety response, and some refused to continue viewing the surgery episode. It should also be noted that the role of the vagal component in BII phobic responses is increasingly being debated (Sarlo, Buodo, Munafò, Stegagno, & Palomba, 2008). Second, the vagal system may not act in a unitary fashion across organ systems. Vagal excitation may have constricted airways more than it lowered HR. Instances of organ-specific vagal activation have been reported in the literature and in our own studies, which compared Ros changes with changes in indices of cardiac vagal control (for review, see Ritz, 2009).
Third, dissociations between observable instances of vagal excitation at various organ sites could also result from differences in gain and timing parameters in different branches of the vagal system or different end organs. Vagal excitation elicited by our BII-relevant material may thus have translated into small cardiac decelerations and greater airway constrictions. Effects also may take longer to be manifested in the airways than in the heart, which would be consistent with the negative prospective within-individual correlation between HR and Ros observed for the healthy control participant. Finally, if vagal and sympathetic coactivation had taken place, which is supported by SCL increases, the greater importance of vagal versus sympathetic motor control of the airway smooth muscle (Barnes, 1986; Canning & Fischer, 2001) could also have resulted in stronger effects on Ros than HR. In general, reciprocity of sympathetic and parasympathetic activation is increasingly considered as only one of many possible activation patterns of these systems (Berntson et al., 1991).

Overall, a working model of autonomic and ventilatory pathways to psychologically induced airway responses will require further exploration and refinement by future studies. Major open questions deal with the exact impact of sympathetic arousal indexed both by adrenergic and cholinergic postganglionic transmission (the latter explored here and in previous studies by skin conductance) as well as circulating catecholamine effects, which may also come into play in more extreme cases of emotional arousal. Further exploration of the vagal pathway is also indicated. A limitation of the current study was that we could not include an analysis of respiratory sinus arrhythmia for technical reasons. However, given the high degree of specificity in the vagal system (Ritz, 2009), pharmacological blockade of cholinergic airway receptors by anticholinergic inhalers would be a more promising strategy to explore airway-specific vagal excitation (see, e.g., Ritz et al., 2010). Finally, the impact of major ventilatory changes such as hyperventilation may be more visible in extreme emotional situations or stronger symptomatic episodes in asthma patients. This may be because of the influence of stronger ventilatory changes on airways that are hyperreactive to irritation, drying, cooling, or fall in PCO2 (McFadden & Gilbert, 1994; van den Elshout et al., 1991).

² However, as shown in the prospective analyses for the 1 healthy participant close to fainting, higher HR at time 0 predicted lower Ros at time 0 + 10 s. This lag association could be related to activation of the baroreflex, which has been associated with bronchodilation in animal studies (Nadel & Widdicombe, 1962; Schultz, Pissari, Coleridge, & Coleridge, 1987). Consistent with this finding, we observed in an earlier study (Ritz, Steptoe, et al., 2000) a negative association between an index of baroreflex sensitivity and respiratory resistance change in asthma patients during a mental arithmetic task.

Effect of Leg Muscle Tension on Airway Responses to Films

In this study, we showed one set of five films under the instruction to tense leg muscles. Voluntary muscle contractions could theoretically be a simple and effective behavioral maneuver to counteract emotion-induced airway constriction by vagal withdrawal (Ritz, Dahme, et al., 1998). Despite reasonable expectations, we did not observe any substantial influence of static leg muscle tension on Ros. On average, only BII phobia patients showed lower Ros values during viewing and tension instructions across all films. Perhaps increases in ventilation counteracted potential decreases in Ros.
Typically, increases in ventilation have been linked to increases in resistance, but reflex bronchodilation to skeletal muscle tension, which also leads to ventilation increases, has been shown to override this relationship in studies with cats (Baker & Don, 1988). Given interspecies variations in airway regulation, it is conceivable that these opposing response directions cancel each other out in humans, at least under some conditions. Indeed, we were able to demonstrate bronchodilation in our static muscle tension study in all conditions except for the arm muscle tension conditions in asthma patients, in which we also observed substantial increases in ventilation (Ritz, Dahme, et al., 1998). Another possibility could be the longer duration of static muscle tension. Other studies that have examined tonic effects of static arm, shoulder, or facial muscle tension over minutes found no changes in resistance or even increases on average (Lehrer, Generelli, & Hochron, 1997; Ritz, Wiens, & Dahme, 1998). Muscle tension levels may also have been too low or the muscle mass that was tensed too small to lead to sustained changes in Ros. Finally, dynamic rather than static exercise could yield stronger effects on the airways, given that exercise-induced bronchodilation has been demonstrated in humans almost exclusively using the former.

Limitations

Our study was also limited by the small sample size, which may have rendered some effects nonsignificant because of a lack of power. In addition to the muscle tension effects, this may have affected group differences in airway responses and associations of Ros with other variables. Another limitation could have been potential order effects. The fact that film presentations were randomized could have led to some nonspecific effects of tense expectation in BII phobia patients depending on the position of
the feared stimulus in the film order: Although participants were not informed of the number of BII-relevant films, late presentation of this material within the series of film clips could have led to more tension and thus elevated levels in some physiological parameters, such as blood pressure. Variations in length of our films could have been another limitation; however, this limitation is shared with most film studies of emotion induction (Rottenberg, Ray, & Gross, 2007). In addition, premature termination of the surgery films by some of our BII phobia patients also contributed to variations in film length. Given the lack of consensus among theorists about the duration of an emotion and probable variation between individual emotional states in parameters of activation such as latency, rise time, peak, and/or duration, the averaging across a certain period of time is a pragmatic approach. A more promising method for future studies may involve continuous tracking of particular aspects of emotional experience of participants using manual tracking devices (e.g., Levenson, 1988) and off-line extraction of key scenes across channels. The arousal matching of film clips was also not optimal. In BII phobia patients and healthy controls, the arousal values for the BII-relevant film exceeded those of most other films. The high surgery film Ros values in both groups could thus be a function of the higher arousal value, rather than being a BII content-specific effect. On the other hand, in asthma patients, levels of arousal were comparable between all emotional films, including the surgery film, yet Ros values were again higher for surgery than for all other films. Finally, our 1-min recovery periods following film presentations were relatively short. Longer recovery periods would have
allowed for a more detailed exploration of how emotion-induced airway responses resolve following stimulus presentation. However, in planning the study we sought to balance gain in information with potential adverse effects of boredom on physiology in those participants who were less affected by the films. The short duration of the recovery most likely did not lead to any notable carryover effects following films. After each 1-min recovery recording, participants filled in the emotions and symptoms rating sheet and briefly interacted with the experimenter, who typically explored whether they felt okay, were relaxed, and were ready to continue with the next film. Participants with BII phobia were given extra time to recover after the surgery film ratings.

Conclusion

We have shown that BII stimuli are particularly potent in eliciting airway responses in asthma patients, BII phobia patients, and healthy controls. This stimulus material is more potent than material with general positive or negative emotional valences or with asthma-relevant themes depicting asthma attacks. In general, a heightened sensitivity to BII stimuli leads to particularly strong airway constriction, as seen in our BII patient group and in 2 participants who were close to fainting. Excitation of the sudomotor system seems to have a particularly tight association with airway regulation during emotional stimulation. Further research on autonomic and ventilatory mechanisms of airway constriction to stress, in particular BII-related distress, and potential behavioral countermeasures against ensuing airway obstructions is indicated.
REFERENCES

Bagby, R. M., Parker, J. D. A., & Taylor, G. J. (1994). The twenty-item Toronto Alexithymia Scale-I. Item selection and cross-validation of the factor structure. Journal of Psychosomatic Research, 38, 23–32. Baker, D. G., & Don, H. (1988). Reversal of relation between respiratory drive and airway tone in cats. Respiration Physiology, 73, 21–30. Barnes, P. J. (1986). Neural control of human airways in health and disease. American Review of Respiratory Disease, 134, 1289–1314. Basmajian, J. V., & Blumenstein, R. (1982). Electrode placement in electromyographic biofeedback. In J. V. Basmajian (Ed.), Biofeedback. Principles and practice for clinicians (3rd ed., pp. 369–382). Baltimore, MD: Williams & Wilkins. Beck, K. C., Offord, K. P., & Scanlon, P. D. (1994). Bronchoconstriction occurring during exercise in asthmatic subjects. American Journal of Respiratory and Critical Care Medicine, 149, 352–357. Berntson, G. G., Cacioppo, J. T., & Quigley, K. S. (1991). Autonomic determinism: The modes of autonomic control, the doctrine of autonomic space, and the laws of autonomic constraint. Psychological Review, 98, 459–487. Boushey, H. A. (1981). Neural mechanisms in asthma. In H. Weiner, M. A. Hofer, & A. J. Stunkard (Eds.), Brain, behavior, and bodily disease (pp. 27–44). New York: Raven Press. Buske-Kirschbaum, A., von Auer, K., Krieger, S., Weis, S., Rauh, W., & Hellhammer, D. (2003). Blunted cortisol responses to psychosocial stress in asthmatic children: A general feature of atopic disease? Psychosomatic Medicine, 65, 806–810. Canning, B. J., & Fischer, A. (2001). Neural regulation of airway smooth muscle tone. Respiration Physiology, 125, 113–127. Carruthers, M., & Taggart, P. (1973). Vagotonicity of violence: Biochemical and cardiac response to violent films and television programmes. British Medical Journal, 3, 384–389. Clarke, P. S. (1982). Emotional exacerbation in asthma caused by overbreathing. Journal of Asthma, 19, 249–251.
Dahme, B., Richter, R., & Maß, R. (1996). Interoception of respiratory resistance in asthmatic patients. Biological Psychology, 42, 215–229.
Engel, G. L. (1978). Psychological stress, vasodepressor (vasovagal) syncope, and sudden death. Annals of Internal Medicine, 89, 403–412. Feldman, J. M., Siddique, M. I., Morales, E., Kaminski, B., Lu, S. E., & Lehrer, P. M. (2005). Psychiatric disorders and asthma outcomes among high-risk inner-city patients. Psychosomatic Medicine, 67, 989–996. First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1994). Structured clinical interview of DSM IV: Patient edition (SCID-I/P). Version 2.0. New York: Biometrics Research Department. Freed, A. N. (1995). Models and mechanisms of exercise-induced asthma. European Respiratory Journal, 8, 1770–1785. Gardner, W. (1994). Measurement of end-tidal pCO2 and pO2. Biofeedback and Self-Regulation, 19, 103–113. Gerlach, A. L., Spellmeyer, G., Vögele, C., Huster, R., Stevens, S., Hetzel, G., et al. (2006). Blood-injury phobia with and without a history of fainting: Disgust sensitivity does not explain the fainting response. Psychosomatic Medicine, 68, 331–339. Gomez, P., Zimmermann, P., Guttormsen-Schär, S., & Danuser, B. (2005). Respiratory responses associated with affective processing of film stimuli. Biological Psychology, 68, 223–235. Goodwin, R. D. (2003). Asthma and anxiety disorders. Advances in Psychosomatic Medicine, 24, 51–71. Graham, D. T., Kabler, J. D., & Lunsford, L. (1961). Vasovagal fainting: A diphasic response. Psychosomatic Medicine, 23, 493–507. Gross, J. J., & Levenson, R. W. (1995). Emotion elicitation using films. Cognition and Emotion, 9, 87–108. Hodes, R. L., Cook, E. W. III, & Lang, P. J. (1985). Individual differences in autonomic response: Conditioned association or conditioned fear? Psychophysiology, 22, 545–557. Isenberg, S. A., Lehrer, P. M., & Hochron, S. (1992). The effects of suggestion and emotional arousal on pulmonary function in asthma: A review and a hypothesis regarding vagal mediation. Psychosomatic Medicine, 54, 192–216. Joachim, R.
A., Quarcoo, D., Arck, P., Herz, U., Renz, H., & Klapp, B. (2003). Stress enhances airway reactivity and airway inflammation in
an animal model of allergic bronchial asthma. Psychosomatic Medicine, 65, 811–815. Kagawa, J., & Kerr, H. D. (1970). Effects of brief graded exercise on specific airway conductance in normal subjects. Journal of Applied Physiology, 28, 138–144. Kaliner, M. (1976). The cholinergic nervous system and immediate hypersensitivity. 1. Eccrine sweat responses in allergic patients. Journal of Allergy and Clinical Immunology, 58, 308–315. Kaufman, M. P., Rybicki, K. J., & Mitchell, J. H. (1985). Hindlimb muscular contraction reflexly decreases total pulmonary resistance in dogs. Journal of Applied Physiology, 59, 1521–1526. Kaugars, A. S., Klinnert, M. D., & Bender, B. G. (2004). Family influences on pediatric asthma. Journal of Pediatric Psychology, 29, 475–491. Kleinknecht, R. A., Thorndike, R. M., & Walls, M. M. (1995). Factorial dimensions and correlates of blood, injury, injection and related medical fears: Cross validation of the Medical Fears Survey. Behaviour Research and Therapy, 34, 323–331. Knapp, P. H., & Nemetz, S. J. (1960). Acute bronchial asthma. I. Concomitant depression and excitement, and varied antecedent patterns in 406 attacks. Psychosomatic Medicine, 22, 42–56. Kozak, M. J., & Miller, G. A. (1985). The psychophysiological process of therapy in a case of injury-scene-elicited fainting. Journal of Behavior Therapy and Experimental Psychiatry, 16, 139–145. Kullowatz, A., Rosenfield, D., Dahme, B., Magnussen, H., Kanniess, F., & Ritz, T. (2008). Effects of stress on lung function in asthma are mediated by changes in airway inflammation. Psychosomatic Medicine, 70, 468–475. Lacey, B. C., & Lacey, J. I. (1974). Studies of heart rate and other bodily processes in sensorimotor behavior. In P. A. Obrist, A. H. Black, & J. Brener (Eds.), Cardiovascular psychophysiology (pp. 538–564). Chicago: Aldine. Lang, P. J., Greenwald, M. K., Bradley, M. M., & Hamm, A. O. (1993). Looking at pictures: Affective, facial, visceral, and behavioral reactions.
Psychophysiology, 30, 261–273. Larsen, R. J., & Diener, E. (1987). Affect intensity as an individual difference characteristic: A review. Journal of Research in Personality, 21, 1–39. Lehrer, P., Generelli, P., & Hochron, S. (1997). The effect of facial and trapezius muscle tension on respiratory impedance in asthma. Applied Psychophysiology and Biofeedback, 22, 43–54. Lehrer, P. M., Hochron, S., Carr, R., Edelberg, R., Hamer, R., Jackson, A., & Porges, S. (1996). Behavioral task-induced bronchodilation in asthma during active and passive tasks: A possible cholinergic link to psychologically induced airway changes. Psychosomatic Medicine, 58, 413–422. Lehrer, P. M., Isenberg, S., & Hochron, S. M. (1993). Asthma and emotion: A review. Journal of Asthma, 30, 5–21. Levenson, R. W. (1979). Effects of thematically relevant and general stressors on specificity of responding in asthmatic and nonasthmatic subjects. Psychosomatic Medicine, 41, 28–39. Levenson, R. W. (1988). Emotion and the autonomic nervous system: A prospectus for research on autonomic specificity. In H. L. Wagner (Ed.), Social psychophysiology and emotion: Theory and clinical application (pp. 17–42). Oxford: Wiley. Lewis, S. E., Snell, P. G., Taylor, W. F., Hamra, M., Graham, R. M., Pettinger, W. A., et al. (1985). Role of muscle mass and mode of contraction in circulatory responses to exercise. Journal of Applied Physiology, 58, 146–151. Liangas, G., Morton, J. R., & Henry, R. L. (2003). Mirth-triggered asthma: Is laughter really the best medicine? Pediatric Pulmonology, 36, 107–112. Liu, L. Y., Coe, C. L., Swenson, C. A., Kelly, E. A., Kita, H., & Busse, W. W. (2002). School examinations enhance airway inflammation to antigen challenge. American Journal of Respiratory and Critical Care Medicine, 165, 1062–1067. Longhurst, J. C. (1984). Static contraction of hindlimb muscles in cats reflexly relaxes tracheal smooth muscle. Journal of Applied Physiology, 57, 380–387. 
Mansfield, L., McDonnell, J., Morgan, W., & Souhrada, J. F. (1979). Airway response in asthmatic children during and after exercise. Respiration, 38, 135–143. Marshall, P. (1989). Attention deficit disorder and allergy: A neurochemical model of the relation between the illnesses. Psychological Bulletin, 106, 434–446.
McFadden, E. R., Luparello, T., Lyons, H. A., & Bleeker, E. (1969). The mechanism of action of suggestion in the induction of acute asthma attacks. Psychosomatic Medicine, 31, 134–143. McFadden, E. R. Jr., & Gilbert, I. A. (1994). Exercise-induced asthma. New England Journal of Medicine, 330, 1362–1367. McQuaid, E. L., Fritz, G. K., Nassau, J. H., Lilly, M. K., Mansell, A., & Klein, R. B. (2000). Stress and airway resistance in children with asthma. Journal of Psychosomatic Research, 49, 239–245. Miller, B. D., & Wood, B. L. (2003). Emotions and family factors in childhood asthma: Psychobiologic mechanisms and pathways of effect. Advances in Psychosomatic Medicine, 24, 131–160. Miller, B. D., Wood, B. L., Lim, J., Ballow, M., & Hsu, C. (2009). Depressed children with asthma evidence increased airway resistance: “Vagal bias” as a mechanism? Journal of Allergy & Clinical Immunology, 124, 66–73. Miller, G. E., & Chen, E. (2006). Life stress and diminished expression of genes encoding glucocorticoid receptor and beta2-adrenergic receptor in children with asthma. Proceedings of the National Academy of Sciences, USA, 103, 5496–5501. Nadel, J. A., & Widdicombe, J. G. (1962). Effect of changes in blood gas tension and carotid sinus pressure on tracheal volume and total lung resistance to airflow. Journal of Physiology, 163, 13–33. National Heart, Lung, and Blood Institute and World Health Organization (2002). NHLBI/WHO workshop report: Global strategy for asthma management and prevention. NIH Publication 02-3659. Bethesda, MD: National Institute of Health. Öst, L. G., & Sterner, U. (1987). Applied tension. A specific behavioral method for treatment of blood phobia. Behaviour Research and Therapy, 25, 25–29. Padrid, P. A., Haselton, J. R., & Kaufman, M. P. (1990). Ischemia potentiates the reflex bronchodilation evoked by static muscular contraction in dogs. Respiration Physiology, 81, 51–62. Ritz, T. (2004). Probing the psychophysiology of the airways.
Physical activity, experienced emotion, and facially expressed emotion. Psychophysiology, 41, 809–821. Ritz, T. (2009). Studying non-invasive indices of vagal control: The need for respiratory control and the problem of target specificity. Biological Psychology, 80, 158–168. Ritz, T., Dahme, B., DuBois, A. B., Folgering, H., Fritz, G. K., Harver, A. R., et al. (2002). Guidelines for mechanical lung function measurements in psychophysiology. Psychophysiology, 39, 546–567. Ritz, T., Dahme, B., & Wagner, C. (1998). Effects of static forehead and forearm muscle tension on total respiratory resistance in healthy and asthmatic participants. Psychophysiology, 35, 549–562. Ritz, T., George, C., & Dahme, B. (2000). Respiratory resistance during emotional stimulation: Evidence for a nonspecific effect of emotional arousal? Biological Psychology, 52, 143–160. Ritz, T., & Kullowatz, A. (2005). Effects of stress and emotion on lung function in health and asthma. Current Respiratory Medicine Reviews, 1, 208–219. Ritz, T., Kullowatz, A., Bobb, C., Dahme, B., Kanniess, F., Magnussen, H., et al. (2008). Psychological asthma triggers and symptoms of hyperventilation. Annals of Allergy, Asthma, and Immunology, 100, 426–432. Ritz, T., Kullowatz, A., Goldman, G. D., Smith, H.-J., Kanniess, F., Dahme, B., et al. (2010). Airway response to emotional stimuli in asthma: The role of the cholinergic pathway. Journal of Applied Physiology, 108, 1542–1549. Ritz, T., & Steptoe, A. (2000). Emotion and pulmonary function in asthma: Reactivity in the field and relationship with laboratory induction of emotion. Psychosomatic Medicine, 62, 808–815. Ritz, T., Steptoe, A., Bobb, C., Harris, A., & Edwards, M. (2006). The Asthma Trigger Inventory: Development and evaluation of a questionnaire measuring perceived triggers of asthma. Psychosomatic Medicine, 68, 956–965. Ritz, T., Steptoe, A., De Wilde, S., & Costa, M. (2000). Emotions and stress increase respiratory resistance in asthma.
Psychosomatic Medicine, 62, 401–412. Ritz, T., Thöns, M., Fahrenkrug, S., & Dahme, B. (2005). The airways, respiration, and respiratory sinus arrhythmia during picture viewing. Psychophysiology, 42, 568–578. Ritz, T., Wiens, S., & Dahme, B. (1998). Stability of total respiratory resistance under multiple baseline conditions, isometric arm exercise, and voluntary deep breathing. Biological Psychology, 49, 187–213.
Ritz, T., Wilhelm, F. H., Gerlach, A., Kullowatz, A., & Roth, W. T. (2005). End-tidal pCO2-levels in blood phobia during viewing of emotional and disease-relevant films. Psychosomatic Medicine, 67, 661–668. Ritz, T., Wilhelm, F. H., Meuret, A. E., Gerlach, A., & Roth, W. T. (2009). Do blood phobia patients hyperventilate during exposure by breathing faster, deeper, or both? Depression and Anxiety, 26, E60–E67. Rottenberg, J., Ray, R. R., & Gross, J. J. (2007). Emotion elicitation using films. In J. A. Coan & J. J. B. Allen (Eds.), The handbook of emotion elicitation and assessment (pp. 9–28). New York: Oxford University Press. Sarlo, M., Buodo, G., Munafò, M., Stegagno, L., & Palomba, D. (2008). Cardiovascular dynamics in blood phobia: Evidence for a key role of sympathetic activity in vulnerability to syncope. Psychophysiology, 45, 1038–1045. Schultz, H. D., Pisarri, T. E., Coleridge, H. M., & Coleridge, J. C. (1987). Carotid sinus baroreceptors modulate tracheal smooth muscle tension in dogs. Circulation Research, 60, 337–345. Scichilone, N., Marchese, R., Soresi, S., Interrante, A., Togias, A., & Bellia, V. (2007). Deep inspiration-induced changes in lung volume decrease with severity of asthma. Respiratory Medicine, 101, 951–956. van den Elshout, F. J. J., van Herwaarden, C. L. A., & Folgering, H. T. M. (1991). Effects of hypercapnia and hypocapnia on respiratory resistance in normal and asthmatic subjects. Thorax, 46, 28–32.
von Leupoldt, A., & Dahme, B. (2005). Emotions and airway resistance in health and asthma: Study with whole body plethysmography. Psychophysiology, 42, 92–97. Wamboldt, M. Z., Laudenslager, M., Wamboldt, F. S., Kelsay, K., & Hewitt, J. (2003). Adolescents with atopic disorders have an attenuated cortisol response to laboratory stress. Journal of Allergy and Clinical Immunology, 111, 509–514. Warren, J. B., Jennings, S. J., & Clark, T. J. H. (1984). Effect of adrenergic and vagal blockade on normal human airway response to exercise. Clinical Science, 66, 79–85. Wasserman, K., Whipp, B. J., & Casaburi, R. (1986). Respiratory control during exercise. In N. S. Cherniack & J. G. Widdicombe (Eds.), Handbook of physiology (pp. 595–619). Bethesda, MD: American Physiological Society. Wilhelm, F. H., Trabert, W., & Roth, W. T. (2001). Characteristics of sighing in panic disorder. Biological Psychiatry, 49, 606–614. Winton, W. M., Putnam, L. E., & Krauss, R. M. (1984). Facial and autonomic manifestations of the dimensional structure of emotion. Journal of Experimental Social Psychology, 20, 195–216. Wright, R. J., Rodriguez, M., & Cohen, S. (1998). Review of psychosocial stress and asthma: An integrated biopsychosocial approach. Thorax, 53, 1066–1074.

(Received January 15, 2009; Accepted January 11, 2010)
Psychophysiology, 48 (2011), 136–141. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01034.x
Self-pacing in interval training: A teleoanticipatory approach
ANDREW M. EDWARDS,a MARIA B. BENTLEY,b MICHAEL E. MANN,b and TIMOTHY S. SEAHOLMEb
a Institute of Sport & Exercise Science, James Cook University, Cairns, Australia
b Department of Exercise and Sport Science, UCOL Institute of Technology, Palmerston North, New Zealand
Abstract

The aim of the present study was to investigate whether the concurrent use of Rating of Perceived Exertion (RPE) and a new Perceived Readiness (PR) scale facilitates optimal interval training performance outcomes. Eleven competitive male runners completed outdoor interval track-running trials at a pre-set RPE. The PR scale was used to facilitate self-determined recovery, while minimum heart rate (HR) and work to rest ratio (WR) strategies were used as comparative conditions. Duplicate PR trial performances were similar but intercondition comparisons identified that the HR trial was significantly slower than both WR and PR conditions. There was no difference in performance between WR and PR, but recoveries for both PR trials were significantly shorter than for WR. Since the aim of interval training is to sustain performance with the shortest possible recovery time, the concurrent use of RPE and PR scales appears to be a useful psychophysiological technique to self-determine both work and rest in interval training.

Descriptors: Fatigue, RPE, Perceived readiness, Endurance, Central governor
The interaction between physiological feedback signals, perception, and pacing of effort over a given duration has been termed teleoanticipation (Ulmer, 1996). While contemporary studies have experimentally extended this psychophysiological concept (e.g., Edwards, Wells, & Butterly, 2008; Faulkner, Parfitt, & Eston, 2008; Lander, Butterly, & Edwards, 2009; Tucker, Bester, Lambert, & Noakes, 2006) it has rarely been considered in response to field-based interval exercise (Edwards et al., 2008). This is surprising as a teleoanticipatory approach to interval training presents a potentially useful new strategy. In the teleoanticipatory model, a ‘central governor’ (brain) operates to control exercise performance in response to feedback and feed forward mechanisms utilizing a range of physiological systems (e.g., muscle, blood pH, skin temperature, respiratory sensations) to regulate neuromuscular recruitment (St. Clair Gibson & Noakes, 2004). The model predicts that performance is continually regulated by the subconscious brain (via constant manipulations of motor unit recruitment) to enable the athlete to perform exercise in the fastest possible time, without excessively stressing physiological responses, thus ensuring homeostasis is maintained and preventing premature termination of exercise
(Lambert, St. Clair Gibson, & Noakes, 2005; Noakes, St. Clair Gibson, & Lambert, 2004; St. Clair Gibson, Lambert, Rauch, & Noakes, 2006). It has been suggested this is best achieved in circumstances where the athlete is able to self-regulate effort in the knowledge of the demands and duration of the activity (Lambert et al., 2005; Noakes et al., 2004). However, in interval training, the demands of the activity (and effort applied) are also consequent to the available inter-interval recovery (Fox, Bartels, Billings, & O’Brien, 1979; Laursen & Jenkins, 2002) indicating both components should be self-regulated if a teleoanticipatory (self-paced) approach is applied to interval training. A primary objective of interval training is to balance the quality of performance across all interval bouts with the shortest possible recovery to maximize the training stimulus (Esteve-Lanao, Foster, Seiler, & Lucia, 2007; Fox et al., 1979; Karp, 2000). Training sessions are also usually prescribed using manipulations of exercise duration (distance or time), rest duration, the required intensity of effort, and the number of work bouts (Karp, 2000; Seiler & Hetlelid, 2005; Thibault & Marion, 1999). Consequently, the ability of athletes to self-pace each bout in the context of completing the entire session is an important feature of this form of training. This is also the case in competitive endurance events, which require athletes to respond to situational demands within the confines of successfully completing the race according to the individual’s capabilities and level of tolerable fatigue (Abbiss & Laursen, 2008; Foster, Schrager, Snyder, & Thompson, 1994). The National Strength and Conditioning Association (NSCA) guidelines for interval training (Baechle & Earle, 2008) currently suggest a work to rest ratio of 1:1 is desirable for most occasions, although other inter-repetition recovery strategies such as temporal
This study was funded by a grant from the Sport and Recreation Council of New Zealand (SPARC) and was supported by Athletics New Zealand. The authors would like to thank Denis Owen, Linda Simpson, Rakai Timutimu, and Raewyn Walker for their technical support over the course of this study. Address correspondence to: A. M. Edwards, PhD, James Cook University, Institute of Sport & Exercise Science, Cairns, Australia. E-mail:
[email protected]
methods and heart rate thresholds (e.g., where athletes might recommence exercise at 130 b·min⁻¹) have also been used (Astrand, Astrand, Christensen, & Hedman, 1960; Christensen, Hedman, & Saltin, 1960; Fox, 1979; Fox et al., 1979; Seiler & Hetlelid, 2005). However, each of these current methodologies requires the athlete to adapt to an externally controlled stimulus, which may not account for individuals’ perceived readiness for exercise or reflect day-to-day variations in physical, environmental, or psychological factors. As a recent study from our group (Lander et al., 2009) has demonstrated that externally paced work presents a greater physiological stress when compared to matched-intensity self-paced exercise, it is possible that both inadequate and excessive recovery between intervals may compromise training outcomes unless the athlete self-determines exercise intensity and consciously perceives readiness to recommence exercise. The concurrent use of scalar methodologies may prove a practical means of self-judging both running pace and the duration of inter-interval recoveries. Numerous studies have utilized the 6–20 Ratings of Perceived Exertion (RPE) scale (Borg, 1982) to standardize self-judgment of effort (e.g., Chen, Fan, & Moe, 2002; Edwards, Wells, & Butterly, 2008; Faulkner et al., 2008; Noble, Borg, Jacobs, & Ceci, 1983; Tucker et al., 2006), but none have used similar scalar responses to self-determine recovery duration. This is surprising as the relationship between work and rest is symbiotic in interval training (Edwards et al., 2008). Therefore, in this study, a new perceived readiness (PR) scale has been developed for the self-determination of recovery duration. It has been applied concurrently with the RPE scale so that individuals self-control both running effort and recovery, in comparison with existing training methodologies.
It is hypothesized that the concurrent use of RPE and PR scales will provide a practical means of optimizing endurance interval training. As the PR scale is a new concept, an additional trial using the PR scale has been included in the test series to investigate evidence for possible learning effects and also to examine test-retest reproducibility of performance outcomes (Hopkins, 2000) in response to this new psychophysiological technique.
Methods

Participants

Eleven healthy, well-trained male runners volunteered for this study (Table 1). Participants were all competitive runners who trained for a minimum of four sessions per week. Each was informed of the procedures in advance and informed consent was provided prior to any data collection. The study was approved by the local Institutional Review Board and the Central Regional Ethics Committee of New Zealand.

Experimental Design

Baseline testing was completed in our exercise laboratory for the assessment of maximal aerobic power (VO2 peak), maximal heart rate, and body mass. A field test of 1000 m ‘all-out’ time-
7 – Exhausted (unable to exercise)
6 – Very tired (unable to exercise at the required intensity)
5 – Tired (not yet able to exercise at the required intensity)
4 – Adequately recovered (able to exercise at the required intensity)
3 – Well recovered (able to exercise above the required intensity)
2 – Very well recovered (well able to exercise above the required intensity)
1 – Fully recovered (able to exercise at maximal intensity)
Figure 1. Perceived Readiness (PR) Scale.
trial performance was completed individually by each participant on an outdoor all-weather athletics track (Table 1). All subsequent familiarization and interval running sessions (four sessions of 5 × 1000 m) were completed on the same athletics track. Prior to the completion of the four experimental trials, all participants individually completed a supervised interval training session to familiarize themselves with the concept of using the RPE scale for exercise pacing and our new PR scale (Figure 1) as a means of self-determining readiness to recommence exercise. Familiarization and experimental trials were performed in similar environmental conditions in the summer period to enhance the ecological validity of the study. Participants individually completed a series of four (5 × 1000 m) track running sessions in a randomized order each at the standardized perceived exertion of RPE 17 (Very Hard) on the Borg 6–20 RPE scale. Each interval session was separated by approximately 5–7 days and included a rest day preceding each trial. In all familiarization and experimental sessions, participants underwent a standardized warm-up and performed the same inter-interval recovery activity of steady walking in pre-marked areas (5 m × 5 m grid) at the finish area for each bout (Karp, 2000). Participants were continually prompted to maintain a steady walking pace throughout recovery periods in all experimental conditions. The study comprised three experimental conditions in response to the same interval running session (5 × 1000 m). All trials were completed individually:
1) Two identical PR trials (PR1 & PR2), in which participants self-determined the duration of their own inter-interval recovery using the new PR scale to recommence when their PR rating reached a score of 4 in accordance with performing 1000 m running at the required perceived exertion (RPE 17).
2) Heart rate threshold trial (HR), in which participants were required to recommence 1000 m running at the required exertion (RPE 17) when their heart rates reached a minimum threshold of 130 b·min⁻¹.
3) Work to rest ratio trial (WR), in which participants were required to recommence 1000 m running at the required exertion (RPE 17) when the duration of their inter-repetition recovery reached a 1:1 ratio.
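The three recovery rules listed above can be expressed as a single decision function. The following is a minimal sketch only, not the authors' procedure: the names `RecoveryState` and `ready_to_recommence` are hypothetical, and it assumes that PR ratings fall toward 1 as recovery proceeds, so that "reaching a score of 4" is read as a rating of 4 or lower, and that heart rate falls toward the 130 b·min⁻¹ threshold during recovery.

```python
from dataclasses import dataclass

# Thresholds taken from the three experimental conditions in the text.
PR_READY_SCORE = 4       # "Adequately recovered" on the PR scale (Figure 1)
HR_THRESHOLD_BPM = 130   # heart rate threshold trial
WORK_TO_REST = 1.0       # 1:1 work to rest ratio trial


@dataclass
class RecoveryState:
    """Hypothetical snapshot of a runner during inter-interval recovery."""
    elapsed_s: float       # time since the previous 1000 m bout ended
    last_bout_s: float     # duration of the previous 1000 m bout
    heart_rate_bpm: float  # current heart rate
    pr_score: int          # self-reported PR rating, 1 (fully recovered) to 7 (exhausted)


def ready_to_recommence(condition: str, state: RecoveryState) -> bool:
    """Return True when the recovery rule for the given condition is met."""
    if condition == "PR":
        # Assumption: ratings decrease during recovery, so "reached 4" means <= 4.
        return state.pr_score <= PR_READY_SCORE
    if condition == "HR":
        # Assumption: heart rate falls during recovery toward the threshold.
        return state.heart_rate_bpm <= HR_THRESHOLD_BPM
    if condition == "WR":
        # Rest must last at least as long as the preceding work bout (1:1).
        return state.elapsed_s >= WORK_TO_REST * state.last_bout_s
    raise ValueError(f"unknown condition: {condition!r}")
```

Under this reading, the PR rule is the only one driven by the runner's own judgment; the HR and WR rules depend on externally measured quantities, which is the contrast the study is designed to test.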
Table 1. Participant Characteristics and Baseline Performances

Measure                       Part A (n = 11)   Part B (n = 9)
Age (years)                   27 ± 6.9          26 ± 6.8
Height (cm)                   179.7 ± 7.7       180.1 ± 8.3
Body mass (kg)                73.0 ± 5.8        73.6 ± 6.2
VO2 peak (ml·kg⁻¹·min⁻¹)      63.7 ± 3.7        64.2 ± 3.3
HR max (b·min⁻¹)              190.7 ± 3.4       190.2 ± 3.4
1 × 1000 m Time-trial (s)     181.1 ± 11.1      179.3 ± 11.3
All four exercise trials (PR1, PR2, WR, and HR) were completed in a randomized order. A second PR trial (PR2) was added to the test sequence to enable evaluation of test-retest reliability of selected physiological and performance outcomes. Although relatively small, the sample size for test-retest evaluation (n = 11) was similar to that of previous reliability studies (n = 10; e.g., Edwards, Claxton, & Fysh, 2008), and the additional trial was considered important because the PR scale forms the basis of a new concept for interval training. This element of the study has been labelled Part A of the investigation. Inter-condition comparisons of physiological and performance outcomes were made between the three experimental conditions (Part B). Standardized instructions were provided for participants when using the PR and RPE scales, and, in all trials, participants were blinded to each experimental condition until after the completion of the warm-up. Two participants did not complete all four trials due to reasons external to this study. However, both participants had completed the two PR trials from their individually randomized order and so were retained in the test-retest reliability element of the study (Part A). Therefore, for the purposes of test-retest reliability in Part A, there were 11 participants in PR1–2 trials, and, for inter-condition comparisons in Part B of the study, there were 9 participants who completed all four conditions. The new PR scale (Figure 1) was designed to operate concurrently with the RPE scale as a means of quickly self-determining readiness for exercise between bouts. In all familiarization and experimental trials using the PR scale, participants were requested to self-determine recovery duration based on their individual readiness to recommence exercise at a PR scale point of 4 (Figure 1).
Outcome Measurements
Interval running was assessed according to performance times, mean repetition heart rates (b·min⁻¹), and velocities (km·h⁻¹) (Heart rate & GPS Pod system, Suunto, Munich, Germany). Performance and recovery durations were recorded by duplicate use of stopwatches among trained timers. Blood lactate concentrations were taken on arrival at the athletics track to ensure subjects were in a rested state (<2 mmol·L⁻¹) before commencing each interval session and immediately at the cessation of each interval in the test sequence.
Blood Sampling
Whole blood was sampled from the fingertip at rest prior to any testing. Thirty µl of whole blood was immediately analyzed for lactate concentration (Lactate Pro, KDK Corporation, Kyoto, Japan) at rest, at the conclusion of each 1000 m bout, and at the conclusion of the VO2 peak test.
Statistical Analysis
Analysis of the repeated measurement on all primary dependent variables was performed by repeated measures analysis of variance (ANOVA) with Greenhouse-Geisser correction (SPSS version 17.0). Post-hoc Tukey tests of honest significant difference were used to examine where differences existed. Statistical significance was accepted at p < .05. All results are expressed as mean ± SD unless otherwise stated. The test-retest reliability of each outcome measurement derived from Part A of the study was determined by coefficient of variation (CV) between the two tests (Edwards et al., 2003; Hopkins, 2000). The CV methodology was considered the most suitable description of test-retest reliability in this study as it enables both valid and practical comparisons between test parameters from a single variable (%) (Hopkins, 2000). The CV is expressed as a percentage and calculated as $\mathrm{CV} = 100 \times SD_{\mathrm{diff}} / \bar{X}$, where $SD_{\mathrm{diff}}$ is the standard deviation of the difference between the duplicate measurements and $\bar{X}$ is the mean reading of the measurement (Hopkins, 2000). Systematic bias was assessed to investigate possible learning effects from PR1 to PR2. This was expressed as a percentage: the average difference between the tests divided by the mean of Test 1 and Test 2, multiplied by 100.
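The reliability statistics described in the Statistical Analysis can be illustrated with a minimal sketch. It assumes that the standard deviation of the differences is the sample standard deviation (n − 1 denominator), that the mean reading is the grand mean of both tests, and that the difference is taken as Test 2 minus Test 1; none of these details is stated explicitly in the text. The function names and the data are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def cv_percent(test1, test2):
    """CV (%) = 100 * SD of paired differences / grand mean of all readings.

    Assumption: sample SD (n - 1) and the grand mean of both tests.
    """
    diffs = [b - a for a, b in zip(test1, test2)]
    grand_mean = mean(list(test1) + list(test2))
    return 100 * stdev(diffs) / grand_mean


def bias_percent(test1, test2):
    """Systematic bias (%) = 100 * mean paired difference / grand mean.

    Assumption: difference is Test 2 minus Test 1.
    """
    diffs = [b - a for a, b in zip(test1, test2)]
    grand_mean = mean(list(test1) + list(test2))
    return 100 * mean(diffs) / grand_mean


# Hypothetical 1000 m times (s) for four runners in two repeat trials.
pr1 = [200.0, 195.0, 210.0, 205.0]
pr2 = [202.0, 194.0, 213.0, 203.0]

print(round(cv_percent(pr1, pr2), 2))
print(round(bias_percent(pr1, pr2), 2))
```

A negative bias under this sign convention means the second trial produced lower values than the first.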
Results
Part A
Mean 5 × 1000 m performance times and running velocities for PR1 and PR2 were similar, as were the physiological variables of exercise heart rates and post-interval blood lactate concentrations (Table 2). Fatigue Index (%) was also similar across both PR trials (PR1: 5.9 ± 3.0%; PR2: 6.7 ± 3.1%). Statistical analysis did not identify any significant difference in performance, duration of inter-interval recovery, or physiological variable between PR1 and PR2 (p > .05). Mean performance times (CV = 1.5%) and mean exercise heart rate (CV = 1.3%) were the most reproducible variables across trials PR1–2. Despite the similarity in mean performance times, immediate post-repetition blood lactate concentrations demonstrated greater variability (16.8%), as did the mean duration of recovery (28.3%) (Table 3). There was no systematic bias or evidence of learning effects (exercise or recovery) between tests (Table 3). There were no significant correlations between performance times and recovery durations for trials PR1 and PR2. Blood lactate concentration also did not correlate with any other performance outcome measurement.
Part B
The results of the inter-condition comparisons identified HR trial performances as the slowest compared to all other conditions (p < .05). There were no differences in performance times or running velocities between PR trials and the WR trial. There was no difference in heart rates or post-repetition blood lactate concentration between any experimental conditions (Table 4). The trajectory of performance across 5 × 1000 m intervals was similar for each experimental condition, insofar as the performance corresponding to RPE 17 diminished from bout one to
Table 2. Performance Characteristics of Test-Retest Reliability Study (PR Trials 1–2: 5 × 1000 m)
Measure                  PR1 (n = 11)     PR2 (n = 11)
Mean 1000 m (s)          200.6 ± 8.5      201.8 ± 9.4
Recovery duration (s)    185.8 ± 58.1     172.9 ± 51.7
Lactate (mmol·L⁻¹)       10.9 ± 2.6       10.6 ± 2.7
Velocity (km·h⁻¹)        18.3 ± 1.1       18.1 ± 1.5
HR (b·min⁻¹)             176.7 ± 7.5      175.3 ± 8.6
Fatigue index (%)        5.9 ± 3.0        6.7 ± 3.1
Note: Fatigue Index (%) = Percentage change from fastest to slowest performance time in the five repetition sequence.
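As a worked illustration of the Fatigue Index defined in the table note, the following sketch computes the percentage change from the fastest to the slowest bout. The choice of the fastest time as the denominator is an assumption, since the note does not specify it, and the bout times are hypothetical.

```python
def fatigue_index(times_s):
    """Percentage change from the fastest to the slowest performance time.

    Assumption: the change is expressed relative to the fastest time.
    """
    fastest = min(times_s)
    slowest = max(times_s)
    return 100 * (slowest - fastest) / fastest


# Hypothetical times (s) for one runner's five 1000 m bouts.
bouts = [195.0, 198.0, 202.0, 206.0, 204.0]
print(round(fatigue_index(bouts), 1))
```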
Table 3. Test-Retest Reliability Measurements for Trials PR1 and PR2 According to Coefficient of Variation (CV) and Systematic Bias for Performance and Physiological Outcomes
Outcome (PR1–2)      CV (%)     Bias (%)
1000 m               1.5        0.6
Recovery duration    28.3       −7.1
Lactate              16.8       −3.9
Velocity             3.5        −1.3
Heart rate           1.3        −0.8
2002), the concurrent use of RPE and PR scales appears advantageous compared with HR and WR conditions. Performance times for bouts 1–5 within the two PR sessions were well sustained, and the similarity of performance outcomes indicates that the participants were able to complete the two interval sessions without significant change to their perceived effort (RPE 17). This appears to be due to the freedom to (centrally) control the balance of work and rest using the scalar methods for the entire session, rather than simply controlling a single (running) component. Recovery duration (CV = 28.3%) was manipulated via behavioral control (using the PR scale) to manage performance at a sustainable effort across all bouts, and this did not result in either impeded performance or an unduly long recovery. Therefore, the participants were able to maintain a sustainable and idealized view of effort corresponding to a fixed perceived exertion (RPE 17) by varying recovery in accordance with their readiness for exercise (Faulkner et al., 2008). This suggests the PR scale is a useful practical means of reflecting intrinsic processes to self-judge recovery in relation to individual requirements. The variation in blood lactate concentrations (CV = 16.8%) was consistent with previous studies, which suggest it is a largely ineffectual marker of exercise intensity although it is also commonly used for this purpose (Edwards et al., 2008; Seiler & Sjursen, 2004). In this study, there was no evidence of any relationship between blood lactate concentration and any other variable, which further suggests lactate is not useful in this context beyond the broad indication of considerable anaerobic flux during high intensity interval running. As pacing is largely dependent on the immediate conscious sensations prior to and during exercise (hence the use of RPE and PR scales) (Hampson, St. Clair Gibson, Lambert, & Noakes, 2001; St. Clair Gibson & Noakes, 2004), it seems likely that performance can be accurately replicated via teleoanticipation when in the knowledge of 1) the immediate demands of 1000 m running, and 2) the overall demands of completing the 5 × 1000 m session (Edwards et al., 2008; Edwards & Noakes, 2009; Robertson, 1982). Inter-condition comparisons revealed the performances and physiological responses to interval running at RPE 17 were
bout five (Figure 2). However, deterioration of performance was only of statistical significance for the HR trial, in which bouts 3–5 of the interval sequence were significantly slower than the first interval, and those efforts were also significantly slower than bouts 3–5 of the other experimental conditions (Figure 2). Interestingly, the final (fifth) interval in the test sequence tended to be slightly quicker than the fourth effort in each experimental condition (p = .09). Trial-to-trial ANOVA analysis did not identify a learning effect. The shortest mean inter-interval recovery duration was recorded in the HR trial (p < .01) (Table 4), and the greatest Fatigue Index was also recorded in that condition (p < .01) (Table 4). Mean recovery duration was significantly longer in the WR condition compared to both PR1 and PR2 trials (p < .05) (Table 4). All recovery periods were consistently longer in the WR condition compared to both PR trials (Figure 3).
Discussion
The purpose of this study was to examine whether the new PR scale provided a reliable and useful means of self-controlling performance in interval training in accordance with teleoanticipatory theory. Evaluation of test-retest reliability demonstrated similar performance times (CV = 1.5%), running velocity (CV = 3.5%), and heart rates (CV = 1.3%) using the combination of RPE and PR scales. This suggests performances can be well replicated when using the new methodology. Condition-specific comparisons identified improved performances using the teleoanticipatory approach compared to a heart rate threshold condition (~6 s faster for each 1000 m), and, although performances were similar between PR and WR conditions, inter-interval recovery durations for PR were significantly shorter than the work to rest strategy (~40 s shorter for each recovery period). Since a key aim of interval training is to repeatedly sustain high intensity effort with the shortest possible recovery time (Esteve-Lanao et al., 2007; Fox et al., 1979; Laursen & Jenkins,
Table 4. Performance and Physiological Responses to the Inter-Condition Comparisons (Part B) for the 5 × 1000 m Test Sequence
Condition                PR1 (n = 9)      PR2 (n = 9)      HR (n = 9)        W:R (n = 9)
Mean 5 × 1000 m (s)      198.9 ± 7.5      199.8 ± 8.8      204.6 ± 8.8*      198.8 ± 8.2
Recovery duration (s)    162.3 ± 31.3†    158.2 ± 45.5†    113.8 ± 48.0**    198.1 ± 8.2†
Lactate (mmol·L⁻¹)       10.4 ± 2.5       10.3 ± 2.9       10.9 ± 3.2        11.3 ± 3.0
Velocity (km·h⁻¹)        18.4 ± 1.1       18.2 ± 1.6       18.1 ± 1.0        18.5 ± 0.8
Mean HR (b·min⁻¹)        176.9 ± 6.8      176.5 ± 7.4      178.06 ± 7.2      178.22 ± 5.5
Mean intensity (%)       91.3 ± 3.8       90.7 ± 3.9       91.4 ± 3.4        91.6 ± 3.6
Fatigue index (%)        6.0 ± 3.0        6.2 ± 3.3        9.3 ± 5.0**       6.6 ± 3.4
Note: Fatigue Index (%) = Percentage difference between fastest and slowest performance times of the five bouts of interval running. Mean intensity = the mean heart rate derived from the five-bout sequence, expressed as a percentage of the baseline maximum heart rate attained in the VO2 peak test. *Significant difference from all other conditions (p < .05). **Significant difference from all other trials (p < .01). †Significant difference between PR trials and WR trials (p < .05).
were no additional gains in running performance compared to PR. Therefore, although WR training sessions are, to some extent, individualized whereby faster performers receive a shorter rest, recovery periods are also extrinsically controlled (by time) and do not reflect immediate (bout to bout) individual requirements or subconscious/conscious processes. A range of ecological factors could also have influenced the athletes’ perception of readiness to commence each bout, including psychological, physiological, and environmental factors, each contributing to the timing of each athlete’s individual sensation of perceived readiness (Edwards & Noakes, 2009). It seems likely that the PR scale might provide a practical means for individuals to accurately self-determine recovery requirements and so may be preferable to HR and WR conditions in well-trained runners. The teleoanticipation model predicts that continual central (brain) regulation of various physiological systems ensures that none are ever maximally taxed (Lambert et al., 2005; Noakes et al., 2004). As such, it is unlikely that a single physiological variable will accurately reflect this process. Despite this, it was speculated that performance could be related to an individualized minimum heart rate ‘threshold’ recorded in the immediate period prior to recommencing exercise, where participants were free to self-determine the duration of recovery (e.g., in the PR trials). However, correlation analysis did not identify a significant relationship between the minimum recovery heart rates and performance variables, or any indices of aerobic fitness taken from baseline assessments. The threshold for minimum heart rates selected for the HR condition was standardized at 130 b·min⁻¹ based on earlier studies (Baechle & Earle, 2008; Fox, 1979; Fox et al., 1979) and from colloquial discussion with coaches, but there is no doubt participants found the HR condition to be the most challenging.
Impaired performance has often been reported without significant changes to physiological markers of fatigue (e.g., Hargreaves, 2008), but the absence of an identifiable physiological mechanism supporting this observation is a limitation to this study. It has been suggested that measures of cardiac vagal
[Figure 2: plot of performance time (s) against interval number for the HR, WR, PR1, and PR2 conditions.]
Figure 2. Comparison of the mean 5 × 1000 m repetition performance times across each of the experimental conditions (n = 9). *Significant difference from all other conditions (p < .05); **significant difference from all other conditions (p < .01); †significantly different from bout 1 in the 5-bout sequence (p < .05); ††significantly different from bout 1 in the 5-bout sequence (p < .01). Results ± SE.
similar between WR and both of the PR trials. This indicates that using either method is effective for sustaining high intensity interval running across five exercise bouts. Nevertheless, recovery durations for the WR trial were the longest of all experimental conditions, and this session was also rated as the easiest among participants, despite the mean performance time being 5.8 s faster than that of the HR condition. Therefore, it is possible that the brief recovery periods (approx. 2.5–3 min) may have been overly long in the WR condition. Each recovery period provided repeated opportunities to reassess homeostasis (Edwards & Noakes, 2009) in consideration of existing fuel sources, fluid balance, heat storage, and metabolic acidosis among other factors (Lambert et al., 2005; Noakes et al., 2004). For the well-trained runners in this study, matching rest to work duration resulted in excessively long recoveries for this process as there
[Figure 3: plot of recovery duration (s) against recovery period (1–4) for the HR, WR, PR1, and PR2 conditions.]
Figure 3. Comparison of mean inter-repetition recovery duration across all four intervals in each experimental condition. *Significant difference between conditions (p < .05); **significant difference between conditions (p < .01). Results ± SE.
modulations may be indicative of central-peripheral neural feedback mechanisms (Thayer & Lane, 2000), which could reflect regulatory control in the PR trials. However, measurement of heart rate variability was incompatible with the field-based experimental aims and design of this study, and so future evaluations of heart rate variability in self-regulated interval training would make useful contributions to the literature. Nevertheless, in a complex system of metabolic control, it seems unlikely that a single (cardiac) variable will reflect multiorgan homeostatic central (brain) regulation (Noakes et al., 2004; Lambert et al., 2005; St. Clair Gibson et al., 2006; Tucker et al., 2006). In conclusion, we have shown that the new PR scale appears to be a novel and useful technique for practical implementation in interval training. It is evident from the trajectory of performance times in all sessions that pacing is a feature of interval training, and each bout is considered within the context of available recovery so as to avoid catastrophic fatigue and
premature cessation of exercise (Edwards & Noakes, 2009; Lambert et al., 2005; Noakes et al., 2004; St. Clair Gibson et al., 2006). We suggest that training based on effort (RPE) and readiness (PR) may be a useful means of individualizing endurance interval sessions for athletes using a psychophysiological approach, while also facilitating practical flexibility of exercise prescription for the coach. The concept of self-pacing facilitates greater self-awareness of physical capabilities among athletes, and it is our contention that the combination of using RPE to gauge interval effort and the PR scale to gauge recovery may be a useful means of organizing interval training according to individual conditioning requirements. The RPE demands of interval training sessions can be easily adjusted using the simple scalar methods according to the planned outcomes of the interval training session. Further studies are required to determine the usefulness and reliability of the new PR scale across a range of exercise challenges.
REFERENCES
Abbiss, C. R., & Laursen, P. B. (2008). Describing and understanding pacing strategies during athletic competition. Sports Medicine, 38, 239–252.
Astrand, I., Astrand, P. O., Christensen, E. H., & Hedman, R. (1960). Intermittent muscular work. Acta Physiologica Scandinavica, 48, 448–453.
Baechle, T. R., & Earle, R. W. (2008). National strength and conditioning association: Essentials of strength and conditioning (3rd ed., pp. 487–503). Champaign, IL: Human Kinetics.
Borg, G. A. V. (1982). Psychophysical bases of perceived exertion. Medicine and Science in Sports and Exercise, 14, 377–381.
Chen, M. J., Fan, X., & Moe, S. T. (2002). Criterion-related validity of the Borg ratings of perceived exertion scale in healthy individuals: A meta-analysis. Journal of Sports Science, 20, 873–899.
Christensen, E. H., Hedman, R., & Saltin, B. (1960). Intermittent and continuous running. Acta Physiologica Scandinavica, 50, 269–286.
Edwards, A. M., Claxton, D. B., & Fysh, M. L. (2003). A comparison of two time domain analysis procedures in the determination of VO2 kinetics by PRBS exercise testing. European Journal of Applied Physiology, 88, 411–416.
Edwards, A. M., Wells, C., & Butterly, R. (2008). Concurrent inspiratory muscle and cardiovascular training differentially improves both perceptions of effort and 5000 m running performance compared with cardiovascular training alone. British Journal of Sports Medicine, 42, 823–827.
Edwards, A. M., & Noakes, T. D. (2009). Dehydration: Cause of fatigue or sign of pacing in elite soccer? Sports Medicine, 39, 1–13.
Esteve-Lanao, J., Foster, C., Seiler, S., & Lucia, A. (2007). Impact of training intensity distribution on performance in endurance athletes. Journal of Strength and Conditioning Research, 21, 943–949.
Faulkner, J., Parfitt, G., & Eston, R. (2008). The rating of perceived exertion during competitive running scales with time. Psychophysiology, 45, 977–985.
Foster, C., Schrager, M., Snyder, A., & Thompson, N. N. (1994). Pacing strategy and athletic performance. Sports Medicine, 17, 77–85.
Fox, E. L. (1979). Interval training. Bulletin of the Hospital for Joint Diseases Orthopaedic Institute, 40, 64–71.
Fox, E. L., Bartels, R. L., Billings, C. E., & O’Brien, R. (1979). Frequency and duration of interval training and changes in aerobic power. Journal of Applied Physiology, 38, 481–484.
Hampson, D. B., St. Clair Gibson, A., Lambert, M. I., & Noakes, T. D. (2001). The influence of sensory cues on the perception of exertion during exercise and central regulation of exercise performance. Sports Medicine, 31, 935–952.
Hargreaves, M. (2008). Fatigue mechanisms determining exercise performance: Integrative physiology in systems biology. Journal of Applied Physiology, 104, 1541–1542.
Hopkins, W. G. (2000). Measures of reliability in sports medicine and science. Sports Medicine, 30, 1–15.
Karp, J. R. (2000). Interval training for the fitness professional. Strength and Conditioning Association Journal, 22, 64–69.
Lambert, E. V., St. Clair Gibson, A., & Noakes, T. D. (2005). Complex systems model of fatigue: Integrative homoeostatic control of peripheral physiological systems during exercise in humans. British Journal of Sports Medicine, 39, 52–62.
Lander, P. J., Butterly, R. J., & Edwards, A. M. (2009). Self-paced exercise is less physically challenging than enforced constant pace exercise of the same intensity: Influence of complex central metabolic control. British Journal of Sports Medicine, 43, 789–795.
Laursen, P. B., & Jenkins, D. G. (2002). The scientific basis for high-intensity interval training: Optimising training programmes in highly trained endurance athletes. Sports Medicine, 32, 53–73.
Noakes, T. D., St. Clair Gibson, A., & Lambert, E. V. (2004). From catastrophe to complexity: A novel model of integrative central neural regulation of effort and fatigue during exercise in humans. British Journal of Sports Medicine, 38, 511–514.
Noble, B. J., Borg, G. A. V., Jacobs, I., & Ceci, P. (1983). A category-ratio perceived exertion scale: Relationship to blood and muscle lactates and heart rate. Medicine and Science in Sports and Exercise, 15, 523–528.
Robertson, R. J. (1982). Central signals of perceived exertion during dynamic exercise. Medicine and Science in Sports and Exercise, 14, 390–396.
St. Clair Gibson, A., Lambert, E. V., Rauch, L. H., & Noakes, T. D. (2006). The role of information processing between the brain and peripheral physiological systems in pacing and perception of effort. Sports Medicine, 36, 705–722.
St. Clair Gibson, A., & Noakes, T. D. (2004). Evidence for complex system integration and dynamic neural regulation of skeletal muscle recruitment during exercise in humans. British Journal of Sports Medicine, 38, 797–806.
Seiler, S., & Hetlelid, K. J. (2005). The impact of rest duration on work intensity and RPE during interval training. Medicine and Science in Sports and Exercise, 37, 1601–1607.
Seiler, S., & Sjursen, J. E. (2004). Effect of work duration on physiological and rating scale of perceived exertion responses during self-paced interval training. Scandinavian Journal of Medicine and Science in Sports, 14, 318–325.
Thayer, J. F., & Lane, R. D. (2000). A model of neurovisceral integration in emotion regulation and dysregulation. Journal of Affective Disorders, 61, 201–216.
Thibault, G., & Marion, A. (1999). Interval training – a practical model. Coaches Report, 6, 16–20.
Tucker, R., Bester, A., Lambert, E. V., & Noakes, T. D. (2006). Nonrandom fluctuations in power output during self-paced exercise. British Journal of Sports Medicine, 40, 912–917.
Ulmer, H. V. (1996). Concept of an extracellular regulation of muscular metabolic rate during heavy exercise in humans by psychophysiological feedback. Experientia, 52, 416–420.
(Received October 13, 2009; Accepted December 22, 2009)
Psychophysiology, 48 (2011), 142–148. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2010 Society for Psychophysiological Research DOI: 10.1111/j.1469-8986.2010.01045.x
Blunted cardiac reactions to acute psychological stress predict symptoms of depression five years later: Evidence from a large community study
ANNA C. PHILLIPS,a KATE HUNT,b GEOFF DER,b and DOUGLAS CARROLLa
a School of Sport and Exercise Sciences, University of Birmingham, Birmingham, England
b MRC Social and Public Health Sciences Unit, Glasgow, Scotland
Abstract
We recently reported a cross-sectional negative relationship between cardiovascular reactivity and depressive symptoms. The present analyses examined the prospective association between reactivity and symptoms of depression 5 years later. At the earlier time point, depressive symptoms, measured using the Hospital Anxiety and Depression Scale (HADS), and cardiovascular reactions to a standard mental stress were measured in 1,608 adults comprising three distinct age cohorts: 24-, 44-, and 63-year-olds. Depression was reassessed using the HADS 5 years later. Heart rate reactions to acute psychological stress were negatively associated with subsequent depressive symptoms; the lower the reactivity the higher the depression scores. This association withstood adjustment for symptom scores at the earlier time point and for sociodemographic factors and medication status. The mechanisms underlying this prospective relationship remain to be determined.
Descriptors: Blood pressure, Depression, Heart rate, Psychological stress, Prospective study
Depression has been linked prospectively to mortality in general and death from cardiovascular disease in particular (for reviews, see Hemingway & Marmot, 1999; Wulsin, Vaillant, & Wells, 1999). However, the mechanisms underlying this association have yet to be established. Autonomic dysregulation remains a possibility, and depression has been associated with a variety of adaptations that suggest altered autonomic function. For example, enhancement of cardiac sympathetic activity relative to vagal tone has been reported in those with depression and subclinical depressive symptoms (Carney et al., 1988; Light, Kothandapani, & Allen, 1998), as have increased plasma noradrenalin concentrations in patients with major depression (Rudorfer, Ross, Linnoila, Sherer, & Potter, 1985). Thus, the hypothesis that such autonomic dysregulation in depression may also be manifest as exaggerated cardiovascular reactivity (Kibler & Ma, 2004), which in turn increases the risk of cardiovascular pathology, is intuitively appealing.
First, exaggerated cardiovascular (blood pressure and heart rate) reactions to acute psychological challenge have long been considered a risk factor for cardiovascular pathology (Lovallo & Gerin, 2003; Schwartz et al., 2003) and several prospective studies have now shown consistently that higher cardiovascular reactivity (most commonly blood pressure and heart rate, and additionally cardiac output, stroke volume, total peripheral resistance, and preejection period when measured) confers a modest additional risk for a range of cardiovascular outcomes, such as high blood pressure, carotid atherosclerosis, carotid intima thickness, and increased left ventricular mass (e.g., Allen, Matthews, & Sherman, 1997; Barnett, Spence, Manuck, & Jennings, 1997; Carroll, Ring, Hunt, Ford, & Macintyre, 2003; Kamarck et al., 1997; Lynch, Everson, Kaplan, Salonen, & Salonen, 1998; Markovitz, Raczynski, Wallace, Chettur, & Chesney, 1998; Treiber et al., 2003). Second, a meta-analysis of 11 relevant studies found moderate effect sizes indicative of a positive relationship between depressive symptomatology and heart rate reactivity and small effect sizes linking more depressive symptoms with higher systolic and diastolic blood pressure reactions to acute psychological stress (Kibler & Ma, 2004). However, none of the aggregate effect sizes from the meta-analyses were statistically significant at conventional levels. In addition, the studies included in the meta-analysis generally tested fairly small samples and were often conducted on patients with established cardiovascular disease. Further, few of these studies adjusted for potential confounding variables such as demographic factors and medication status.
The West of Scotland Twenty-07 Study is funded by the UK Medical Research Council (WBS U.1300.80.001.00001), and the data were originally collected by the MRC Social and Public Health Sciences Unit. We are grateful to all of the participants in the Study and to the survey staff and research nurses who carried it out. The data are employed here with the permission of the Twenty-07 Steering Group (Project No. EC0503). Kate Hunt and Geoff Der are also funded by the MRC. Address correspondence to: Anna C. Phillips, Ph.D., School of Sport and Exercise Sciences, University of Birmingham, Birmingham B15 2TT, England. E-mail: [email protected]
Two subsequent larger scale studies have addressed the issue, one in a large community sample of over 1,600 participants from the West of Scotland Twenty-07 Study measuring depressive symptoms and blood pressure and heart rate reactions to acute stress (Carroll, Phillips, Hunt, & Der, 2007) and the other in a coronary artery disease patient sample of over 100 people, again measuring depressive symptoms and blood pressure and heart rate reactivity (York et al., 2007). In both studies, higher depressive symptom scores were associated with lower, not higher, cardiovascular reactions to acute psychological stress. In the former, cardiovascular reactivity and symptoms of anxiety were also negatively related. What is especially compelling about these findings is that the associations were still evident following adjustment for a relatively comprehensive range of covariates. Nevertheless, both studies were cross-sectional, and so direction of causality was impossible to determine. Below we report a prospective analysis from the Twenty-07 Study. The associations between cardiovascular reactivity and symptoms of depression and anxiety 5 years later were examined. Based on our previous cross-sectional findings, it was hypothesized that participants with higher reactivity would be less, not more, likely to report symptoms of depression and anxiety at subsequent follow-up.
Methods
Participants
Participants were resident in Glasgow and the surrounding areas in Scotland at the baseline survey of the West of Scotland Twenty-07 Study in 1987; they have been followed up at regular intervals since (Benzeval et al., 2009). The achieved sample size at entry to the Regional study was 3,036. Cardiovascular reactions to an acute psychological challenge were measured at the third follow-up in 1995–1997 (Carroll et al., 2000, 2003). Reactivity data were available for 1,647 participants, with scores for depression and anxiety symptomatology recorded for 1,608 and 1,607 of these, respectively. Participants comprised three distinct age cohorts born in the early 1970s, 1950s, and 1930s: of the 1,608, 575 (36%) were aged around 24 years, 606 (38%) 44 years, and 427 (26%) 63 years; 875 (54%) were women and 733 (46%) men; and 746 (47%) were from manual and 851 (53%) nonmanual occupation households. Household occupational status, an accepted index of socioeconomic position, was classified as manual or nonmanual from the occupational status of the head of household, using the Registrar General’s (1980) classification of occupations. For the youngest of the three cohorts, head of household was either the participant, if working and living independently, or the parent, if the participant was a student or lived with his or her parents. For the other two cohorts, head of household was either the participant or his or her spouse or partner, depending on which of the two held or had held the highest occupational status; this was usually the man. A comparison of the three cohorts with equivalent samples drawn from the 1991 UK census revealed equivalence in terms of sex, occupational group, and home ownership (Der, 1998). The sample was almost entirely Caucasian, reflecting the West-of-Scotland population from which it was drawn.
As revealed during structured interviews conducted by nurses trained in survey techniques, only 71 (4%) and 46 (3%) of the sample were taking antidepressants and anxiolytic medication, respectively. The mean age of the whole sample at the third follow-up was 42.3 (SD = 15.48) years.
Measurement of Depression and Anxiety
Symptoms of depression and anxiety were measured at both the third and fourth follow-ups using the Hospital Anxiety and Depression Scale (HADS; Zigmond & Snaith, 1983). At the fourth follow-up, data were available for 1,245 participants. Thus, the attrition rate was 23% between these two time points. The mean temporal lag between the two follow-ups was 5.5 (SD = 1.00) years. The HADS is a well-recognized assessment instrument that comprises 14 items, 7 measuring depression and 7 measuring anxiety. The depression subscale emphasizes anhedonia and largely excludes somatic items. Items are scored on a 4-point scale, 0 to 3; the higher the score, the greater the depression and anxiety. The HADS has good concurrent validity (Bramley, Easton, Morley, & Snaith, 1988; Herrmann, 1997), performs well as a psychiatric screening device (Bjelland, Dahl, Haug, & Neckelmann, 2002; Herrmann, 1997), and boasts acceptable psychometric properties; for example, a Cronbach’s α of .90 for the depression items and .93 for the anxiety items has been reported (Moorey et al., 1991) and test–retest reliability coefficients as high as .85 for depression and .84 for anxiety have been found over 2 weeks, and .70 over periods greater than 6 weeks (Herrmann, 1997).
Apparatus and Procedure
Participants were tested in a quiet room in their own homes by trained nurses. Demographic information was obtained by interview. At the end of a lengthy interview and questionnaire session, participants undertook an acute psychological challenge: the paced auditory serial addition test (PASAT), which has been shown in numerous studies to reliably perturb the cardiovascular system (Ring, Burns, & Carroll, 2002; Ring et al., 1999; Winzer et al., 1999) and to demonstrate good test–retest reliability (Willemsen et al., 1998).
Participants were presented with a series of single-digit numbers by audiotape and requested to add sequential number pairs while retaining the second of the pair in memory for addition to the next number presented, and so on throughout the series. Answers were given orally, and, if participants faltered, they were instructed to recommence with the next number pair. The total number of correct answers was recorded as a measure of performance. The first sequence of 30 numbers was presented at a rate of one every 4 s, and the second sequence of 30 at one every 2 s. The whole task took 3 min, 2 min for the slower sequence and 1 min for the faster sequence. The nurses were all trained in the PASAT protocol by the same trainer. They followed a written protocol during every testing session. In the present sample, all participants registered a score and completed the whole PASAT. Out of a possible score of 60, the median score was 45 (interquartile range = 11). Systolic blood pressure (SBP), diastolic blood pressure (DBP), and heart rate (HR) were determined by an Omron (model 705CP) sphygmomanometer. This is one of the semiautomatic blood pressure measuring devices recommended by the European Society of Hypertension (O’Brien, Waeber, Parati, Staessen, & Myers, 2001). Following the interview (which took at least an hour), there was then a formal 5-min period of relaxed sitting, at the end of which a resting baseline reading of SBP, DBP, and HR was taken. Task instructions for the PASAT were then given and the participants were allowed a brief practice to ensure that they understood task requirements. Two further SBP, DBP, and HR readings were taken during the task, the first initiated 20 s into the task (during the slower sequence of numbers), and the second initiated 110 s later (at the same point
during the fast sequence). For all readings, the nurses ensured that the participant’s elbow and forearm rested comfortably on a table at heart level. The two task readings were averaged, and the resting baseline value was subsequently subtracted from the resultant average task value to yield reactivity measures for SBP, DBP, and HR for each participant.

Data Reduction and Analyses

Differences in depression and anxiety at the fourth follow-up between age cohorts, sexes, and household occupational groups were explored using ANOVA. The main analysis was by regression, where symptoms of depression and anxiety at the later follow-up were the dependent variables. Initially, the bivariate associations between cardiovascular reactivity and symptomatology were examined using linear regression. Sensitivity analyses, using logistic regression, were then undertaken using binary variables derived from applying a cutoff of ≥ 8 on the depression and anxiety subscales of the HADS as an indicator of possible pathology (Zigmond & Snaith, 1983). Significant associations between reactivity and symptomatology emerging from the initial simple linear regression were then revisited adjusting for HADS scores at the earlier follow-up. This was entered at Step 1 in the hierarchical model, with reactivity entered at Step 2. Next, a model was tested which additionally adjusted for age cohort, sex, household occupational group, PASAT performance score as an indicator of task engagement, antidepressive medication status, and antihypertensive medication status.

Results

Sociodemographics, Depression, and Anxiety

Because we have already reported the sociodemographic patterning of symptoms of depression and anxiety at the third follow-up in our earlier paper (Carroll et al., 2007), the analyses reported below are necessarily restricted to the fourth follow-up HADS scores. The overall mean depression and anxiety scores at the fourth follow-up were 3.72 (SD = 3.14) and 6.77 (SD = 3.77), respectively.
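As a rough illustration of the data reduction and the two-step adjustment described under Data Reduction and Analyses, the sketch below computes a reactivity score (mean of the task readings minus the resting baseline) and the gain in R² when a second predictor is entered after a first. It is a sketch under stated assumptions rather than the authors' analysis code: the function names `reactivity`, `pearson`, and `delta_r2` are hypothetical, the software the authors used is not stated in the text, and the R² gain uses the standard closed form for a two-predictor linear model instead of a regression package. The covariate-adjusted model would need a general least-squares routine.

```python
from statistics import mean
from typing import Sequence


def reactivity(baseline: float, task_readings: Sequence[float]) -> float:
    """Average the task readings, then subtract the resting baseline."""
    return mean(task_readings) - baseline


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def delta_r2(y: Sequence[float], step1: Sequence[float],
             step2: Sequence[float]) -> float:
    """R^2 gained by adding `step2` to a model that already holds `step1`.

    Here `step1` stands for the earlier symptom score and `step2` for
    reactivity; both roles are assumptions made for illustration.
    """
    r_y1 = pearson(y, step1)
    r_y2 = pearson(y, step2)
    r_12 = pearson(step1, step2)
    r2_step1 = r_y1 ** 2
    # Closed form for R^2 of a linear model with two predictors.
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_full - r2_step1


# Made-up heart rate readings for one participant (bpm).
print(reactivity(66.0, [72.0, 78.0]))
```

Because the Step 1 predictor is entered first, any variance it shares with the Step 2 predictor is credited to Step 1, which is why the gain in R² can be read as the contribution of reactivity over and above earlier symptoms.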
Analysis of variance (ANOVA) yielded main effects for all three independent variables: the older two cohorts recorded higher depression scores than the youngest cohort, F(2,1242) = 3.26, p = .04, η² = .005; women had higher scores than men, F(1,1243) = 6.66, p = .01, η² = .005; those from manual occupational households had higher scores than those from nonmanual households, F(1,1243) = 9.12, p = .003, η² = .003. The summary statistics are presented in Table 1.

Table 1. Mean (SD) HADS Depression and Anxiety Scores at the Third and Fourth Follow-Ups by Cohort, Sex, and Occupational Status

                          Wave 3                       Wave 4
                          Depression    Anxiety        Depression    Anxiety
Age cohort
  Youngest (n = 419)      3.1 (2.60)    7.3 (3.78)     3.4 (3.00)    6.9 (3.72)
  Middle (n = 497)        3.9 (3.00)    7.4 (3.88)     4.0 (3.38)    7.1 (3.85)
  Eldest (n = 329)        4.0 (2.94)    6.7 (3.71)     3.7 (2.91)    6.0 (3.62)
Sex
  Male (n = 559)          3.5 (2.71)    6.5 (3.56)     3.5 (2.92)    5.9 (3.48)
  Female (n = 686)        3.8 (2.98)    7.7 (3.93)     3.9 (3.30)    7.5 (3.86)
Occupational group
  Manual (n = 550)        3.9 (2.96)    7.3 (3.98)     4.0 (3.27)    7.1 (3.89)
  Nonmanual (n = 690)     3.5 (2.77)    7.1 (3.65)     3.5 (3.02)    6.5 (3.67)
Table 2. Mean (SD) SBP, DBP, and HR, Baseline and Stress Values for the Sample as a Whole

SBP                             DBP                            HR
Baseline        Stress          Baseline       Stress          Baseline       Stress
129.1 (20.50)   140.6 (21.63)   78.9 (11.60)   85.7 (12.30)    66.7 (10.80)   74.8 (12.33)
Analyses of the anxiety data also revealed significant main effects for cohort, F(2,1242) = 8.78, p < .001, η² = .014, sex, F(1,1243) = 53.94, p < .001, η² = .042, and occupational status, F(1,1243) = 6.54, p = .01, η² = .005. As with depression, women and those in the manual occupational group reported more symptoms of anxiety than men and those in the nonmanual group. However, in the case of anxiety, the oldest cohort had lower scores than the other two cohorts. There were no significant interactions between age cohort, sex, and household occupational group for depressive or anxiety symptoms.

Cardiovascular Reactions to Acute Psychological Stress

Two-way (baseline × task) repeated measures ANOVAs indicated that, on average, the PASAT significantly increased cardiovascular activity: for SBP, F(1,1607) = 1564.08, p < .001, η² = .493, for DBP, F(1,1607) = 1048.71, p < .001, η² = .395, and for HR, F(1,1607) = 1108.77, p < .001, η² = .408. The mean (SD) increases were 11.6 (11.72) mmHg SBP, 7.0 (8.61) mmHg DBP, and 8.2 (9.82) bpm HR. The mean baseline and stress values are presented in Table 2.

Cardiovascular Reactivity and Future Depression and Anxiety Symptomatology

In bivariate analyses, neither SBP nor DBP reactivity was related to HADS depression and anxiety scores 5 years later. However, HR reactivity was negatively associated with both depression, β = −.10, t = 3.65, p < .001, R² = .010, and anxiety symptomatology, β = −.07, t = 2.50, p = .01, R² = .005. These associations are shown in Figure 1. It is worth noting that all significant reported associations remained significant following supplementary analysis without individuals who could be deemed as outliers, although their reactivity values were physiologically possible. The same associations were evident in the analyses of possible caseness; 180 (11%) and 606 (43%) met the possible caseness criteria for depression and anxiety, respectively, at Wave 3, and 158 (13%) and 469 (38%) participants at Wave 4, respectively.
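The effect sizes reported for the baseline-to-task ANOVAs above appear consistent with partial eta squared recovered from the F ratio and its degrees of freedom, η² = (F × df_effect) / (F × df_effect + df_error). The text does not say how the authors computed these values, so the following is only a check under that assumption, and the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios for the stress task effect, each with (1, 1607) degrees of freedom.
for label, f_value in [("SBP", 1564.08), ("DBP", 1048.71), ("HR", 1108.77)]:
    print(label, round(partial_eta_squared(f_value, 1, 1607), 3))
# prints: SBP 0.493, DBP 0.395, HR 0.408 (one per line)
```

Under this assumption the three recomputed values agree with the reported .493, .395, and .408 to three decimal places.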
High HR, but not blood pressure, reactivity was protective against future depression and anxiety caseness; OR for each unit increase in HR reactivity (95% CI) = .98 (.97–1.00), p = .05, and OR (95% CI) = .99 (.97–1.00), p = .02, respectively. Next we revisited the linear associations for HR reactivity above, but this time adjusting for HADS scores at the earlier follow-up. As indicated, we have already shown a cross-sectional relationship between cardiovascular reactivity and depression and anxiety (Carroll et al., 2007). HADS scores at the two follow-ups were, as expected, correlated, r(1218) = .53, p < .001, and r(1217) = .60, p < .001 for depression and anxiety scores, respectively. There was a significant increase in depressive symptoms, F(1,1219) = 6.57, p = .01, η² = .005, and decrease in anxiety scores, F(1,1218) = 7.51, p = .01, η² = .006, for the whole cohort over time. Thus, adjusting for HADS scores at the third follow-up will indicate whether HR reactivity predicts symptomatology at the fourth follow-up independently. In essence, this amounts to examining whether reactivity predicts the change in depression and anxiety symptoms between follow-ups. For depression scores, a significant negative association with HR reactivity was still evident following such adjustment, β = −.05, t = 2.19, p = .03, ΔR² = .003. However, the prospective association between HR reactivity and future anxiety symptoms was no longer significant, β = −.03, t = 1.61, p = .11, ΔR² = .001. In a model that additionally adjusted for age cohort, sex, household occupational group, PASAT performance score, antidepressant medication, and antihypertensive medication, HR reactivity continued to predict future depression scores, β = −.05, t = 2.05, p = .04, ΔR² = .002. The statistics for the full model are shown in Table 3.

Figure 1. Associations between cardiovascular reactivity at Wave 3 and HADS scores at Wave 4. (Two panels; x-axis: heart rate reactivity at Wave 3; y-axis: HADS depression score at Wave 4.)

Table 3. Final Regression Models Predicting Depressive Symptoms at Wave 4, Adjusting for Depressive Symptoms at Baseline and Covariates

Model                               β        t         p         R²
1. HADS depression baseline         .53      21.75     <.001     .281
2. HADS depression baseline         .53      21.43     <.001     .285
   Age cohort                       −.02     0.81      .42
   Sex                              .03      1.37      .17
   Occupational group               .06      2.46      .01
3. HADS depression baseline         .52      20.74     <.001     .288
   Age cohort                       −.02     −0.95     .34
   Sex                              .03      1.27      .21
   Occupational group               .06      2.40      .02
   PASAT score                      .002     0.06      .95
   Antidepressant medication        .06      2.27      .02
   Antihypertensive medication      .01      0.36      .72
4. HADS depression baseline         .51      20.61     <.001     .291
   Age cohort                       −.03     −1.21     .23
   Sex                              .03      1.17      .24
   Occupational group               .06      2.29      .02
   PASAT score                      .01      0.45      .66
   Antidepressant medication        .06      2.22      .03
   Antihypertensive medication      .01      0.28      .78
   HR reactivity                    −.05     −2.05     .04

Discussion

The mean HADS depression and anxiety scores for participants in the present study at the fourth follow-up were much the same as those we reported previously at the third follow-up (Carroll et al., 2007). They were also broadly similar to those reported by others in large nonclinical adult samples (Crawford, Henry, Crombie, & Taylor, 2001). In addition, the sociodemographic patterning of symptoms of depression and anxiety reported earlier (Carroll et al., 2007) was also evident at the fourth follow-up. Depression scores were higher in women, participants from manual occupational households, and those in the middle and oldest cohorts. Variations in depression and depressive symptomatology with sex (e.g., Maier et al., 1999; Piccinelli & Wilkinson, 2000; Weissman & Merikangas, 1986) and socioeconomic status (e.g., Bruce, Takeuchi, & Leaf, 1991; Dohrenwend et al., 1992; Stansfeld, Head, Fuhrer, Wardle, & Cattell, 2003) are well documented. However, previous data on age and depression are inconsistent. Although studies have observed an increase in depression in the elderly (Kessler, Foster, Webster, & House, 1992; Mirowsky & Reynolds, 2000), others report a negative relationship between age and depression and symptoms of depression (Charles, Reynolds, & Gatz, 2001; Turner & Noh, 1988). It has been argued that the former can be attributed to age variations in chronic health and that normally functioning older adults are at no greater risk for depression than younger adults (Roberts, Kaplan, Shema, & Strawbridge, 1997). It is worth noting here that the association between age cohort and depressive symptom score at Wave 4 in the present study was attenuated when self-reported disability status was taken into account (analysis not reported). Anxiety levels varied similarly with sex and occupational status: again results not without precedent (e.g., Fryers, Melzer, & Jenkins, 2003; Reich, 1986). In the case of HADS anxiety, levels declined with age; this has also been reported by others (e.g., Weissman & Merikangas, 1986). As indicated in earlier reports from this study (Carroll et al., 2007; Carroll, Phillips, Ring, Der, & Hunt, 2005; Phillips, Carroll, Hunt, & Der, 2006; Phillips, Carroll, Ring, Sweeting, & West, 2005), the acute stress was successful in perturbing cardiovascular activity.
In addition, in earlier cross-sectional analyses of these data, cardiovascular reactivity was found to be negatively related to symptoms of depression and anxiety (Carroll et al., 2007). The present analyses, though, are the first we know of to demonstrate a prospective association between HR reactivity and symptoms of depression; those with high HR reactivity exhibited lower depression scores 5 years later. This association remained significant following adjustment for depression scores at the earlier time point. Thus, our results amount to more than a replication of our earlier findings. Even taking into account the correlation between symptom scores at the two follow-ups, HR reactivity was still associated with depressive symptomatology at the later time point. The analogous negative association between HR reactivity and symptoms of anxiety was no longer statistically significant following such adjustment. Moreover, the relationship with depression survived adjustment for both sociodemographics and medication status. The relative effect sizes for HR reactivity compared to sociodemographic variables also suggest that when earlier symptoms of depression are taken into account, low reactivity is as important as, if not more important than, age and gender in the prediction of future depressive symptoms, at least in the present sample.

Whereas high cardiovascular reactivity would appear to hold implications for the development and course of inflammatory disease (Carroll, Phillips, & Lovallo, 2009), low reactivity is not without its health and behavioral correlates. For example, we have recently found that individuals who mounted a poor antibody response to influenza vaccination showed lower and less sustained cardiovascular reactions to acute stress than those who mounted a good antibody response (Phillips, Carroll, Burns, & Drayson, 2009). Further, habitual smokers have been consistently found to show blunted cardiovascular reactivity (Girdler, Jamner, Jarvik, Soles, & Shapiro, 1997; Phillips et al., 2009; Roy, Steptoe, & Kirschbaum, 1994). It is unlikely that these effects reflect temporary abstinence during stress testing, as low cardiovascular reactivity has been observed in female smokers regardless of whether they were wearing a nicotine replacement patch or not (Girdler et al., 1997).
In addition, blunted reactivity has been found to predict relapse among smokers who have recently quit smoking (al’Absi, 2006; al’Absi, Hatsukami, & Davis, 2005). Those addicted to alcohol have also been found to exhibit blunted cardiovascular reactivity (Lovallo, Dickensheets, Myers, Thomas, & Nixon, 2000; Panknin, Dickensheets, Nixon, & Lovallo, 2002), and relatively low reactivity would appear to be a characteristic of nonalcoholics with a family history of alcoholism (Sorocco, Lovallo, Vincent, & Collins, 2006). Thus, low reactivity not only characterizes those addicted to smoking and alcohol, but it may also be a risk marker of some prognostic significance (Lovallo, 2006). In short, cardiovascular disease outcomes aside, low reactivity would appear to be associated with a number of negative health outcomes.

What might be the mechanisms underlying this prospective link between HR reactivity and symptoms of depression? One possibility is altered sympathetic nervous system function. However, the prevailing wisdom is that depression and symptoms of depression are associated with increased, not decreased, sympathetic nervous system activity, as indexed by a shift-enhanced cardiac sympathetic activity relative to vagal tone (Carney et al., 1988), increased plasma noradrenaline concentrations (Rudorfer et al., 1985), and increased 24-h urinary noradrenaline excretion (Hughes, Watkins, Blumenthal, Kuhn, & Sherwood, 2004) in individuals with depression or depressive symptomatology. However, this tells us only about the tonic state. It does not indicate how the system responds to challenge. More pertinent to reactivity is the status and responsiveness of β-adrenergic receptors. There is some evidence that individuals with depression or depressive symptomatology have fewer β-adrenergic receptor binding sites (Pandey, Janicak, & Davis, 1987) and show decreased β-adrenergic receptor responsiveness (Mazzola-Pomietto, Azorin, Tramoni, & Jeanningros, 1994; Yu, Kang, Ziegler, Mills, & Dimsdale, 2008). What we might tentatively speculate is that blunted β-adrenergic receptor responsiveness, as indexed by low HR reactivity to acute stress in the present study, may be a risk marker for developing high levels of depressive symptomatology. However, mental stress tasks have been shown to be associated with vagal withdrawal as well as beta-adrenergic activation (Sloan, Korten, & Myers, 1991). There is also evidence that individuals with less vagal withdrawal during a film clip have higher levels of depressive symptoms (Gentzler, Santucci, Kovacs, & Fox, 2009). Indeed, depressed individuals with greater vagal withdrawal during film clips were more likely to subsequently recover from depression (Rottenberg, Salomon, Gross, & Gotlib, 2005). Thus, it is possible that there is a common brain mechanism regulating vagal nerve activity and depressive symptoms. However, in the absence of a measure of parasympathetic withdrawal in the present study, we are reluctant to speculate further.

Another possible explanation for the observed association between HR reactivity and symptoms of depression is a common genetic pathway. For example, polymorphisms of the serotonin transporter gene (5HTTLPR), which plays a key role in determining the magnitude and duration of both the central and peripheral actions of serotonin, would appear to be implicated in emotional regulation and physiological reactivity. Activity of the 5HTTLPR long allele is almost twice that of the short allele. Those with long alleles have also been found to exhibit higher blood pressure and heart rate reactions to a laboratory stress task (Williams et al., 2001, 2008). Further, men who are homozygous for the long allele have been found to score lower on measures of anxiety and depression (Lesch et al., 1996).
However, a recent meta-analysis reported that the 5HTTLPR genotype did not appear to be associated with major depression, particularly in association with high life events (Risch et al., 2009). Nevertheless, because symptoms of depression in the general population and major depressive disorder may be distinct phenomena, this avenue seems worthy of further investigation.

The present study suffers from a number of limitations. First, the effect sizes were small. This, though, was our a priori expectation based on previous research and reinforces the value of large samples when we examine some of the more subtle correlates of cardiovascular reactivity. However, with such small effects, it is difficult to discern their clinical significance. Nevertheless, were this finding to be replicated, low HR reactivity might afford a further, albeit modest, risk marker for the development of depressive symptomatology. Second, there remains the possibility of residual confounding as a result of some poorly measured or unexamined variable (Christenfeld, Sloan, Carroll, & Greenland, 2004). However, we did adjust for the most likely candidates, and although the negative association between HR reactivity and subsequent symptoms of depression was attenuated, it remained statistically significant. Finally, only blood pressure and HR were measured. Although it would have been useful to have a more comprehensive assessment of haemodynamics of the sort afforded by impedance cardiography, the large sample and the decision to test participants in their homes precluded this.

In conclusion, cardiac reactions to acute psychological stress were negatively associated with depressive symptomatology 5 years later. This association withstood adjustment for symptom scores at the earlier time point as well as for sociodemographic factors and medication status.
The mechanisms underlying this prospective relationship remain to be determined, but blunted β-adrenergic receptor responsiveness and common genetic pathways would seem worthy of further study.
REFERENCES
al’Absi, M. (2006). Hypothalamic-pituitary-adrenocortical responses to psychological stress and risk for smoking relapse. International Journal of Psychophysiology, 59, 218–227.
al’Absi, M., Hatsukami, D., & Davis, G. L. (2005). Attenuated adrenocorticotropic responses to psychological stress are associated with early smoking relapse. Psychopharmacology, 181, 107–117.
Allen, M. T., Matthews, K. A., & Sherman, F. S. (1997). Cardiovascular reactivity to stress and left ventricular mass in youth. Hypertension, 30, 782–787.
Barnett, P. A., Spence, J. D., Manuck, S. B., & Jennings, J. R. (1997). Psychological stress and the progression of carotid artery disease. Journal of Hypertension, 15, 49–55.
Benzeval, M., Der, G., Ellaway, A., Hunt, K., Sweeting, H., West, P., et al. (2009). Cohort profile: West of Scotland Twenty-07 Study: Health in the community. International Journal of Epidemiology, 38, 1215–1223.
Bjelland, I., Dahl, A. A., Haug, T. T., & Neckelmann, D. (2002). The validity of the Hospital Anxiety and Depression Scale. An updated literature review. Journal of Psychosomatic Research, 52, 69–77.
Bramley, P. N., Easton, A. M., Morley, S., & Snaith, R. P. (1988). The differentiation of anxiety and depression by rating scales. Acta Psychiatrica Scandinavica, 77, 133–138.
Bruce, M. L., Takeuchi, D. T., & Leaf, P. J. (1991). Poverty and psychiatric status. Longitudinal evidence from the New Haven Epidemiologic Catchment Area study. Archives of General Psychiatry, 48, 470–474.
Carney, R. M., Rich, M. W., teVelde, A., Saini, J., Clark, K., & Freedland, K. E. (1988). The relationship between heart rate, heart rate variability and depression in patients with coronary artery disease. Journal of Psychosomatic Research, 32, 159–164.
Carroll, D., Harrison, L. K., Johnston, D. W., Ford, G., Hunt, K., Der, G., et al. (2000). Cardiovascular reactions to psychological stress: The influence of demographic variables. Journal of Epidemiology and Community Health, 54, 876–877.
Carroll, D., Phillips, A. C., Hunt, K., & Der, G. (2007). Symptoms of depression and cardiovascular reactions to acute psychological stress: Evidence from a population study. Biological Psychology, 75, 68–74.
Carroll, D., Phillips, A. C., & Lovallo, W. R. (2009). Are large physiological reactions to acute psychological stress always bad for health? Social and Personality Psychology Compass, 3, 725–743.
Carroll, D., Phillips, A. C., Ring, C., Der, G., & Hunt, K. (2005). Life events and hemodynamic stress reactivity in the middle-aged and elderly. Psychophysiology, 42, 269–276.
Carroll, D., Ring, C., Hunt, K., Ford, G., & Macintyre, S. (2003). Blood pressure reactions to stress and the prediction of future blood pressure: Effects of sex, age, and socioeconomic position. Psychosomatic Medicine, 65, 1058–1064.
Charles, S. T., Reynolds, C. A., & Gatz, M. (2001). Age-related differences and change in positive and negative affect over 23 years. Journal of Personality and Social Psychology, 80, 136–151.
Christenfeld, N. J., Sloan, R. P., Carroll, D., & Greenland, S. (2004). Risk factors, confounding, and the illusion of statistical control. Psychosomatic Medicine, 66, 868–875.
Crawford, J. R., Henry, J. D., Crombie, C., & Taylor, E. P. (2001). Normative data for the HADS from a large non-clinical sample. British Journal of Clinical Psychology, 40, 429–434.
Der, G. (1998). A comparison of the West of Scotland Twenty-07 study sample and the 1991 census SARs. Working Paper No. 60. Glasgow: MRC Medical Sociology Unit.
Dohrenwend, B. P., Levav, I., Shrout, P. E., Schwartz, S., Naveh, G., Link, B. G., et al. (1992). Socioeconomic status and psychiatric disorders: The causation-selection issue. Science, 255, 946–952.
Fryers, T., Melzer, D., & Jenkins, R. (2003). Social inequalities and the common mental disorders: A systematic review of the evidence. Social Psychiatry and Psychiatric Epidemiology, 38, 229–237.
Gentzler, A. L., Santucci, A. K., Kovacs, M., & Fox, N. A. (2009). Respiratory sinus arrhythmia reactivity predicts emotion regulation and depressive symptoms in at-risk and control children. Biological Psychology, 82, 156–163.
Girdler, S. S., Jamner, L. D., Jarvik, M., Soles, J. R., & Shapiro, D. (1997). Smoking status and nicotine administration differentially modify hemodynamic stress reactivity in men and women. Psychosomatic Medicine, 59, 294–306.
Hemingway, H., & Marmot, M. (1999). Evidence based cardiology: Psychosocial factors in the aetiology and prognosis of coronary heart disease. Systematic review of prospective cohort studies. British Medical Journal, 318, 1460–1467.
Herrmann, C. (1997). International experiences with the Hospital Anxiety and Depression Scale - A review of validation data and clinical results. Journal of Psychosomatic Research, 42, 17–41.
Hughes, J. W., Watkins, L., Blumenthal, J. A., Kuhn, C., & Sherwood, A. (2004). Depression and anxiety symptoms are related to increased 24-hour urinary norepinephrine excretion among healthy middle-aged women. Journal of Psychosomatic Research, 57, 353–358.
Kamarck, T. W., Everson, S. A., Kaplan, G. A., Manuck, S. B., Jennings, J. R., Salonen, R., et al. (1997). Exaggerated blood pressure responses during mental stress are associated with enhanced carotid atherosclerosis in middle-aged Finnish men: Findings from the Kuopio Ischemic Heart Disease Study. Circulation, 96, 3842–3848.
Kessler, R. C., Foster, C., Webster, P. S., & House, J. S. (1992). The relationship between age and depressive symptoms in two national surveys. Psychology and Aging, 7, 119–126.
Kibler, J. L., & Ma, M. (2004). Depressive symptoms and cardiovascular reactivity to laboratory behavioral stress. International Journal of Behavioral Medicine, 11, 81–87.
Lesch, K. P., Bengel, D., Heils, A., Sabol, S. Z., Greenberg, B. D., Petri, S., et al. (1996). Association of anxiety-related traits with a polymorphism in the serotonin transporter gene regulatory region. Science, 274, 1527–1531.
Light, K. C., Kothandapani, R. V., & Allen, M. T. (1998). Enhanced cardiovascular and catecholamine responses in women with depressive symptoms. International Journal of Psychophysiology, 28, 157–166.
Lovallo, W. R. (2006). Cortisol secretion patterns in addiction and addiction risk. International Journal of Psychophysiology, 59, 195–202.
Lovallo, W. R., Dickensheets, S. L., Myers, D. A., Thomas, T. L., & Nixon, S. J. (2000). Blunted stress cortisol response in abstinent alcoholic and polysubstance-abusing men. Alcoholism: Clinical and Experimental Research, 24, 651–658.
Lovallo, W. R., & Gerin, W. (2003). Psychophysiological reactivity: Mechanisms and pathways to cardiovascular disease. Psychosomatic Medicine, 65, 36–45.
Lynch, J. W., Everson, S. A., Kaplan, G. A., Salonen, R., & Salonen, J. T. (1998). Does low socioeconomic status potentiate the effects of heightened cardiovascular responses to stress on the progression of carotid atherosclerosis? American Journal of Public Health, 88, 389–394.
Maier, W., Gansicke, M., Gater, R., Rezaki, M., Tiemens, B., & Urzua, R. F. (1999). Gender differences in the prevalence of depression: A survey in primary care. Journal of Affective Disorders, 53, 241–252.
Markovitz, J. H., Raczynski, J. M., Wallace, D., Chettur, V., & Chesney, M. A. (1998). Cardiovascular reactivity to video game predicts subsequent blood pressure increases in young men: The CARDIA study. Psychosomatic Medicine, 60, 186–191.
Mazzola-Pomietto, P., Azorin, J. M., Tramoni, V., & Jeanningros, R. (1994). Relation between lymphocyte beta-adrenergic responsivity and the severity of depressive disorders. Biological Psychiatry, 35, 920–925.
Mirowsky, J., & Reynolds, J. R. (2000). Age, depression, and attrition in the National Survey of Families and Households. Sociological Methods & Research, 28, 476–504.
Moorey, S., Greer, S., Watson, M., Gorman, C., Rowden, L., Tunmore, R., et al. (1991). The factor structure and factor stability of the hospital anxiety and depression scale in patients with cancer. British Journal of Psychiatry, 158, 255–259.
O’Brien, E., Waeber, B., Parati, G., Staessen, J., & Myers, M. G. (2001). Blood pressure measuring devices: Recommendations of the European Society of Hypertension. British Medical Journal, 322, 531–536.
Pandey, G. N., Janicak, P. G., & Davis, J. M. (1987). Decreased beta-adrenergic receptors in the leukocytes of depressed patients. Psychiatry Research, 22, 265–273.
Panknin, T. L., Dickensheets, S. L., Nixon, S. J., & Lovallo, W. R. (2002). Attenuated heart rate responses to public speaking in individuals with alcohol dependence. Alcoholism: Clinical and Experimental Research, 26, 841–847.
Phillips, A. C., Carroll, D., Burns, V. E., & Drayson, M. T. (2009). Cardiovascular activity and the antibody response to vaccination. Journal of Psychosomatic Research, 67, 37–43.
Phillips, A. C., Carroll, D., Hunt, K., & Der, G. (2006). The effects of the spontaneous presence of a spouse/partner and others on cardiovascular reactions to an acute psychological challenge. Psychophysiology, 43, 633–640.
Phillips, A. C., Carroll, D., Ring, C., Sweeting, H., & West, P. (2005). Life events and acute cardiovascular reactions to mental stress: A cohort study. Psychosomatic Medicine, 67, 384–392.
Piccinelli, M., & Wilkinson, G. (2000). Gender differences in depression. Critical review. British Journal of Psychiatry, 177, 486–492.
Registrar General’s Classification of Occupations (1980). London: HMSO.
Reich, J. (1986). The epidemiology of anxiety. Journal of Nervous and Mental Disease, 174, 129–136.
Ring, C., Burns, V. E., & Carroll, D. (2002). Shifting hemodynamics of blood pressure control during prolonged mental stress. Psychophysiology, 39, 585–590.
Ring, C., Carroll, D., Willemsen, G., Cooke, J., Ferraro, A., & Drayson, M. (1999). Secretory immunoglobulin A and cardiovascular activity during mental arithmetic and paced breathing. Psychophysiology, 36, 602–609.
Risch, N., Herrell, R., Lehner, T., Liang, K. Y., Eaves, L., Hoh, J., et al. (2009). Interaction between the serotonin transporter gene (5HTTLPR), stressful life events, and risk of depression: A meta-analysis. Journal of the American Medical Association, 301, 2462–2471.
Roberts, R. E., Kaplan, G. A., Shema, S. J., & Strawbridge, W. J. (1997). Does growing old increase the risk for depression? American Journal of Psychiatry, 154, 1384–1390.
Rottenberg, J., Salomon, K., Gross, J. J., & Gotlib, I. H. (2005). Vagal withdrawal to a sad film predicts subsequent recovery from depression. Psychophysiology, 42, 277–281.
Roy, M. P., Steptoe, A., & Kirschbaum, C. (1994). Association between smoking status and cardiovascular and cortisol stress responsivity in healthy young men. International Journal of Behavioral Medicine, 1, 264–283.
Rudorfer, M. V., Ross, R. J., Linnoila, M., Sherer, M. A., & Potter, W. Z. (1985). Exaggerated orthostatic responsivity of plasma norepinephrine in depression. Archives of General Psychiatry, 42, 1186–1192.
Schwartz, A. R., Gerin, W., Davidson, K. W., Pickering, T. G., Brosschot, J. F., Thayer, J. F., et al. (2003). Toward a causal model of cardiovascular responses to stress and the development of cardiovascular disease. Psychosomatic Medicine, 65, 22–35.
Sloan, R. P., Korten, J. B., & Myers, M. M. (1991). Components of heart rate reactivity during mental arithmetic with and without speaking. Physiology & Behavior, 50, 1039–1045.
Sorocco, K. H., Lovallo, W. R., Vincent, A. S., & Collins, F. L. (2006). Blunted hypothalamic-pituitary-adrenocortical axis responsivity to stress in persons with a family history of alcoholism. International Journal of Psychophysiology, 59, 210–217.
Stansfeld, S. A., Head, J., Fuhrer, R., Wardle, J., & Cattell, V. (2003). Social inequalities in depressive symptoms and physical functioning in the Whitehall II study: Exploring a common cause explanation. Journal of Epidemiology and Community Health, 57, 361–367.
Treiber, F. A., Kamarck, T., Schneiderman, N., Sheffield, D., Kapuku, G., & Taylor, T. (2003). Cardiovascular reactivity and development of preclinical and clinical disease states. Psychosomatic Medicine, 65, 46–62.
Turner, R. J., & Noh, S. (1988). Physical disability and depression: A longitudinal analysis. Journal of Health and Social Behavior, 29, 23–37.
Weissman, M. M., & Merikangas, K. R. (1986). The epidemiology of anxiety and panic disorders: An update. Journal of Clinical Psychiatry, 47(Suppl), 11–17.
Willemsen, G., Ring, C., Carroll, D., Evans, P., Clow, A., & Hucklebridge, F. (1998). Secretory immunoglobulin A and cardiovascular reactions to mental arithmetic and cold pressor. Psychophysiology, 35, 252–259.
Williams, R. B., Marchuk, D. A., Gadde, K. M., Barefoot, J. C., Grichnik, K., Helms, M. J., et al. (2001). Central nervous system serotonin function and cardiovascular responses to stress. Psychosomatic Medicine, 63, 300–305.
Williams, R. B., Marchuk, D. A., Siegler, I. C., Barefoot, J. C., Helms, M. J., Brummett, B. H., et al. (2008). Childhood socioeconomic status and serotonin transporter gene polymorphism enhance cardiovascular reactivity to mental stress. Psychosomatic Medicine, 70, 32–39.
Winzer, A., Ring, C., Carroll, D., Willemsen, G., Drayson, M., & Kendall, M. (1999). Secretory immunoglobulin A and cardiovascular reactions to mental arithmetic, cold pressor, and exercise: Effects of beta-adrenergic blockade. Psychophysiology, 36, 591–601.
Wulsin, L. R., Vaillant, G. E., & Wells, V. E. (1999). A systematic review of the mortality of depression. Psychosomatic Medicine, 61, 6–17.
York, K. M., Hassan, M., Li, Q., Li, H., Fillingim, R. B., & Sheps, D. S. (2007). Coronary artery disease and depression: Patients with more depressive symptoms have lower cardiovascular reactivity during laboratory-induced mental stress. Psychosomatic Medicine, 69, 521–528.
Yu, B. H., Kang, E. H., Ziegler, M. G., Mills, P. J., & Dimsdale, J. E. (2008). Mood states, sympathetic activity, and in vivo beta-adrenergic receptor function in a normal population. Depression and Anxiety, 25, 559–564.
Zigmond, A. S., & Snaith, R. P. (1983). The hospital anxiety and depression scale. Acta Psychiatrica Scandinavica, 67, 361–370.
(Received August 26, 2009; Accepted January 30, 2010)